[GitHub] spark issue #17926: [MINOR][SQL][PYSPARK] Allow user to specify numSlices in...

2017-05-10 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17926 Does this cause any incompatibility with existing code? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark issue #17890: [MINOR][BUILD] Fix lint-java breaks.

2017-05-10 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17890 Thanks @dongjoon-hyun. Hi @srowen, the code is updated, because the `Trigger` location changed after your PR.

[GitHub] spark issue #17891: [SPARK-20631][PYTHON][ML] LogisticRegression._checkThres...

2017-05-10 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/17891 Thanks @yanboliang!

[GitHub] spark issue #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, docs

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17934 **[Test build #76746 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76746/testReport)** for PR 17934 at commit

[GitHub] spark issue #17900: [SPARK-20637][Core] Remove mention of old RDD classes fr...

2017-05-10 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17900 Merged to master

[GitHub] spark issue #17904: [SPARK-20630] [Web UI] Fixed column visibility in Execut...

2017-05-10 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17904 Merged to master/2.2

[GitHub] spark issue #17933: [SPARK-20588][SQL] Cache TimeZone instances per thread.

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17933 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76732/ Test PASSed. ---

[GitHub] spark issue #17782: Reload credentials file config when app starts with chec...

2017-05-10 Thread victor-wong
Github user victor-wong commented on the issue: https://github.com/apache/spark/pull/17782 @srowen Thanks for replying. I tested with the master branch and it turned out the issue still existed. I created a new PR against the master branch, https://github.com/apache/spark/pull/17937.

[GitHub] spark issue #17937: Reload credentials file config when app starts with chec...

2017-05-10 Thread victor-wong
Github user victor-wong commented on the issue: https://github.com/apache/spark/pull/17937 Comments on last PR, https://github.com/apache/spark/pull/17782.

[GitHub] spark issue #17887: [SPARK-20399][SQL] Add a config to fallback string liter...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17887 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76735/ Test PASSed. ---

[GitHub] spark pull request #17782: Reload credentials file config when app starts wi...

2017-05-10 Thread victor-wong
Github user victor-wong closed the pull request at: https://github.com/apache/spark/pull/17782

[GitHub] spark issue #17887: [SPARK-20399][SQL] Add a config to fallback string liter...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17887 Merged build finished. Test PASSed.

[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be a chil...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17858 Merged build finished. Test PASSed.

[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be a chil...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17858 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76744/ Test PASSed. ---

[GitHub] spark issue #17907: SPARK-7856 Principal components and variance using compu...

2017-05-10 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17907 Yes, it is likely more accurate to not base the PCA on the Gramian. However it's probably going to be more efficient than what the SVD method does even when operating locally. If this change makes

[GitHub] spark pull request #17887: [SPARK-20399][SQL] Add a config to fallback strin...

2017-05-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17887#discussion_r115717329 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala --- @@ -144,7 +144,31 @@ case class Like(left:

[GitHub] spark issue #17938: [SPARK-20694][DOCS][SQL] Document DataFrameWriter partit...

2017-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17938 (I think I am not supposed to decide this, and confirmation from a committer is probably best.)

[GitHub] spark issue #17891: [SPARK-20631][PYTHON][ML] LogisticRegression._checkThres...

2017-05-10 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/17891 LGTM, merged into master and branch-2.2. Thanks!

[GitHub] spark issue #17933: [SPARK-20588][SQL] Cache TimeZone instances per thread.

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17933 Merged build finished. Test PASSed.

[GitHub] spark pull request #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, doc...

2017-05-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17934#discussion_r115695253 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -146,7 +146,7 @@ object StringIndexer extends

[GitHub] spark pull request #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, doc...

2017-05-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17934#discussion_r115695377 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -180,9 +181,10 @@ object Imputer extends DefaultParamsReadable[Imputer] {

[GitHub] spark issue #16960: [SPARK-19447] Make Range operator generate "recordsRead"...

2017-05-10 Thread ala
Github user ala commented on the issue: https://github.com/apache/spark/pull/16960 True. There are a couple of lines that should have been removed with this change but were left behind. numGeneratedRows should be gone.

[GitHub] spark pull request #17917: [SPARK-20600][SS] KafkaRelation should be pretty ...

2017-05-10 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/17917#discussion_r115711771 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRelation.scala --- @@ -143,4 +143,6 @@ private[kafka010] class

[GitHub] spark pull request #17930: [SPARK-20688][SQL] correctly check analysis for s...

2017-05-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17930

[GitHub] spark issue #17918: [SPARK-20678][SQL] Ndv for columns not in filter conditi...

2017-05-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17918 thanks, merging to master/2.2!

[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistices to improve...

2017-05-10 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/16677 @hvanhovell @cloud-fan We have seen the value of this PR in our customer scenarios, which is why we started a discussion on the dev list earlier. Thanks to @viirya for discussing it with us and implementing it.

[GitHub] spark pull request #17891: [SPARK-20631][PYTHON][ML] LogisticRegression._che...

2017-05-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17891

[GitHub] spark issue #17918: [SPARK-20678][SQL] Ndv for columns not in filter conditi...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17918 Merged build finished. Test PASSed.

[GitHub] spark issue #17918: [SPARK-20678][SQL] Ndv for columns not in filter conditi...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17918 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76731/ Test PASSed. ---

[GitHub] spark issue #17898: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-10 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17898 @jtengyp I think we won't proceed with this version, so this can be closed, but see the discussion at https://github.com/apache/spark/pull/17936

[GitHub] spark issue #17890: [MINOR][BUILD] Fix lint-java breaks.

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17890 **[Test build #3709 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3709/testReport)** for PR 17890 at commit

[GitHub] spark pull request #17862: [SPARK-20602] [ML]Adding LBFGS as optimizer for L...

2017-05-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17862#discussion_r115698645 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -205,15 +233,21 @@ class LinearSVC @Since("2.2.0") ( val

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17770 Merged build finished. Test PASSed.

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17770 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76734/ Test PASSed. ---

[GitHub] spark issue #17400: [SPARK-19981][SQL] Update output partitioning info. in P...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17400 **[Test build #76738 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76738/testReport)** for PR 17400 at commit

[GitHub] spark pull request #17938: [DOCS][SQL] Document bucketing and partitioning i...

2017-05-10 Thread zero323
GitHub user zero323 opened a pull request: https://github.com/apache/spark/pull/17938 [DOCS][SQL] Document bucketing and partitioning in SQL guide ## What changes were proposed in this pull request? - Add Scala, Python and Java examples for `partitionBy`, `sortBy` and

[GitHub] spark issue #17936: [SPARK-20638][Core][WIP]Optimize the CartesianRDD to red...

2017-05-10 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 Hi @jerryshao, thanks for your review. In #17898, there is a potential buffer to cache the data, so we should control the groupsize very carefully. Because for a small size, it needs to fetch more

[GitHub] spark issue #17935: [SPARK-20690][SQL][WIP] Analyzer shouldn't add missing a...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17935 **[Test build #76745 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76745/testReport)** for PR 17935 at commit

[GitHub] spark issue #17933: [SPARK-20588][SQL] Cache TimeZone instances per thread.

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17933 **[Test build #76732 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76732/testReport)** for PR 17933 at commit

[GitHub] spark issue #17890: [MINOR][BUILD] Fix lint-java breaks.

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17890 **[Test build #3707 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3707/testReport)** for PR 17890 at commit

[GitHub] spark issue #17930: [SPARK-20688][SQL] correctly check analysis for scalar s...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76733/ Test PASSed. ---

[GitHub] spark issue #17887: [SPARK-20399][SQL] Add a config to fallback string liter...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17887 **[Test build #76735 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76735/testReport)** for PR 17887 at commit

[GitHub] spark issue #17930: [SPARK-20688][SQL] correctly check analysis for scalar s...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17930 **[Test build #76733 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76733/testReport)** for PR 17930 at commit

[GitHub] spark issue #17937: Reload credentials file config when app starts with chec...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17937 Can one of the admins verify this patch?

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17770 **[Test build #76734 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76734/testReport)** for PR 17770 at commit

[GitHub] spark issue #17930: [SPARK-20688][SQL] correctly check analysis for scalar s...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17930 Merged build finished. Test PASSed.

[GitHub] spark issue #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, docs

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17934 **[Test build #76746 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76746/testReport)** for PR 17934 at commit

[GitHub] spark issue #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for bucket...

2017-05-10 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/17077 @gatorsmile #17938

[GitHub] spark issue #17931: [SPARK-12837][SPARK-20666][CORE][FOLLOWUP] getting name ...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17931 Merged build finished. Test PASSed.

[GitHub] spark issue #17936: [SPARK-20638][Core][WIP]Optimize the CartesianRDD to red...

2017-05-10 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 Cool, you see the `iterator` operation can be divided in two cases: 1. get the block from local, this case is very good. 2. get the block from remote. - The block is cached

[GitHub] spark issue #17930: [SPARK-20688][SQL] correctly check analysis for scalar s...

2017-05-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17930 thanks for the review, merging to master/2.2/2.1!

[GitHub] spark issue #17906: [SPARK-20665][SQL]"Bround" and "Round" function return N...

2017-05-10 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/17906 Please review it, thanks @dongjoon-hyun @cloud-fan

[GitHub] spark issue #17936: [SPARK-20638][Core][WIP]Optimize the CartesianRDD to red...

2017-05-10 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 I will provide a cluster version of the comparison results later.

[GitHub] spark issue #17936: [SPARK-20638][Core][WIP]Optimize the CartesianRDD to red...

2017-05-10 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 Hi @viirya, can you help to review this? I think you are familiar with this, because you had tried to solve it before. Also pinging @srowen, @mridulm, @jerryshao.

[GitHub] spark issue #17936: [SPARK-20638][Core][WIP]Optimize the CartesianRDD to red...

2017-05-10 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/17936 Looks like there's a similar PR, #17898, trying to address this issue. Can you please elaborate on how yours differs from that one?

[GitHub] spark pull request #17919: [SPARK-20677][MLLIB][ML] Follow-up to ALS recomme...

2017-05-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17919#discussion_r115692011 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -451,6 +439,8 @@ class ALSModel private[ml] ( @Since("1.6.0")

[GitHub] spark issue #17918: [SPARK-20678][SQL] Ndv for columns not in filter conditi...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17918 **[Test build #76731 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76731/testReport)** for PR 17918 at commit

[GitHub] spark pull request #17937: Reload credentials file config when app starts wi...

2017-05-10 Thread victor-wong
GitHub user victor-wong opened a pull request: https://github.com/apache/spark/pull/17937 Reload credentials file config when app starts with checkpoint file i… ## What changes were proposed in this pull request? Currently credentials file configuration is recovered from

[GitHub] spark issue #17742: [Spark-11968][ML][MLLIB]Optimize MLLIB ALS recommendForA...

2017-05-10 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/17742 ** The most optimized version would be doing a quickselect on each row and select top k. ** An easy-to-implement version would be: I tested both methods; the best performance is about 50%

[GitHub] spark issue #17742: [Spark-11968][ML][MLLIB]Optimize MLLIB ALS recommendForA...

2017-05-10 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17742 It's true, I think my native BLAS is not working; I will have to check. But yeah, 1.5-2x matches what I've seen in my comparisons.

[GitHub] spark issue #17400: [SPARK-19981][SQL] Update output partitioning info. in P...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Merged build finished. Test PASSed.

[GitHub] spark issue #17400: [SPARK-19981][SQL] Update output partitioning info. in P...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17400 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76740/ Test PASSed. ---

[GitHub] spark issue #17400: [SPARK-19981][SQL] Update output partitioning info. in P...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17400 **[Test build #76740 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76740/testReport)** for PR 17400 at commit

[GitHub] spark issue #17930: [SPARK-20688][SQL] correctly check analysis for scalar s...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17930 **[Test build #76739 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76739/testReport)** for PR 17930 at commit

[GitHub] spark issue #17935: [SPARK-20690][SQL][WIP] Analyzer shouldn't add missing a...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17935 Merged build finished. Test FAILed.

[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be a chil...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17858 **[Test build #76744 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76744/testReport)** for PR 17858 at commit

[GitHub] spark issue #17935: [SPARK-20690][SQL][WIP] Analyzer shouldn't add missing a...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17935 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76745/ Test FAILed. ---

[GitHub] spark pull request #17918: [SPARK-20678][SQL] Ndv for columns not in filter ...

2017-05-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17918

[GitHub] spark pull request #17936: [SPARK-20638][Core][WIP]Optimize the CartesianRDD...

2017-05-10 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/17936 [SPARK-20638][Core][WIP]Optimize the CartesianRDD to reduce repeatedly data fetching ## What changes were proposed in this pull request? This patch aims to solve the poor performance of

[GitHub] spark issue #17936: [SPARK-20638][Core][WIP]Optimize the CartesianRDD to red...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17936 **[Test build #3708 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3708/testReport)** for PR 17936 at commit

[GitHub] spark issue #17890: [MINOR][BUILD] Fix lint-java breaks.

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17890 **[Test build #3707 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3707/testReport)** for PR 17890 at commit

[GitHub] spark pull request #17904: [SPARK-20630] [Web UI] Fixed column visibility in...

2017-05-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17904

[GitHub] spark pull request #17900: [SPARK-20637][Core] Remove mention of old RDD cla...

2017-05-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17900

[GitHub] spark issue #17936: [SPARK-20638][Core][WIP]Optimize the CartesianRDD to red...

2017-05-10 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/17936 From my first glance, I have several questions: 1. If the parent's partition has already been cached in local blockmanager, do we need to cache again? 2. There will be situation

[GitHub] spark pull request #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, doc...

2017-05-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17934#discussion_r115693156 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala --- @@ -131,7 +132,6 @@ class LinearSVC @Since("2.2.0") ( */

[GitHub] spark issue #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, docs

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17934 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76746/ Test FAILed. ---

[GitHub] spark issue #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, docs

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17934 Merged build finished. Test FAILed.

[GitHub] spark pull request #17933: [SPARK-20588][SQL] Cache TimeZone instances per t...

2017-05-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/17933#discussion_r115702646 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala --- @@ -954,8 +955,9 @@ case class

[GitHub] spark pull request #17933: [SPARK-20588][SQL] Cache TimeZone instances per t...

2017-05-10 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/17933#discussion_r115702672 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -98,6 +99,21 @@ object DateTimeUtils { sdf

[GitHub] spark issue #17935: [SPARK-20690][SQL][WIP] Analyzer shouldn't add missing a...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17935 **[Test build #76745 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76745/testReport)** for PR 17935 at commit

[GitHub] spark issue #17938: [DOCS][SQL] Document bucketing and partitioning in SQL g...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17938 **[Test build #76748 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76748/testReport)** for PR 17938 at commit

[GitHub] spark issue #17938: [DOCS][SQL] Document bucketing and partitioning in SQL g...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17938 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76748/ Test PASSed. ---

[GitHub] spark issue #17938: [DOCS][SQL] Document bucketing and partitioning in SQL g...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17938 Merged build finished. Test PASSed.

[GitHub] spark issue #17926: [MINOR][SQL][PYSPARK] Allow user to specify numSlices in...

2017-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17926 I don't think so (this is Python ... ) for both positional and keyword arguments. (If the new `numSlices` is added in the middle of the arguments it will break for positional arguments but this
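
To illustrate the compatibility point in the comment above, here is a minimal sketch in plain Python. The functions `make_frame_v1`, `make_frame_appended`, and `make_frame_inserted` are hypothetical stand-ins, not the actual PySpark signature touched by the PR. They only show why appending a new optional parameter at the end keeps existing positional and keyword calls working, while inserting it in the middle silently rebinds positional arguments.

```python
# Hypothetical stand-ins for an API before and after adding `num_slices`.
# None of these names come from PySpark; they only illustrate the argument.

def make_frame_v1(data, schema=None):
    """Original signature, before the new parameter exists."""
    return {"data": data, "schema": schema}


def make_frame_appended(data, schema=None, num_slices=None):
    """New parameter appended at the end: old calls bind exactly as before."""
    return {"data": data, "schema": schema, "num_slices": num_slices}


def make_frame_inserted(data, num_slices=None, schema=None):
    """New parameter inserted in the middle: old positional calls shift."""
    return {"data": data, "schema": schema, "num_slices": num_slices}


# An existing caller that passes the schema positionally.
old_positional = make_frame_appended([1, 2], "int")
# The same call against the "inserted" variant binds "int" to num_slices.
shifted = make_frame_inserted([1, 2], "int")
# Keyword callers are unaffected in either variant.
old_keyword = make_frame_inserted([1, 2], schema="int")
```

In this sketch, `old_positional` and `old_keyword` still carry `"int"` as the schema, whereas `shifted` ends up with no schema at all and `"int"` as its slice count, which is the breakage the comment describes for positional arguments.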

[GitHub] spark issue #17936: [SPARK-20638][Core][WIP]Optimize the CartesianRDD to red...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17936 Can one of the admins verify this patch?

[GitHub] spark issue #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, docs

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17934 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76742/ Test FAILed. ---

[GitHub] spark issue #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, docs

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17934 Merged build finished. Test FAILed.

[GitHub] spark issue #17934: [SPARK-20501] [ML] ML 2.2 QA: New Scala APIs, docs

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17934 **[Test build #76742 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76742/testReport)** for PR 17934 at commit

[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be a chil...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17858 **[Test build #76744 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76744/testReport)** for PR 17858 at commit

[GitHub] spark issue #17923: [SPARK-20591][WEB UI] Succeeded tasks num not equal in a...

2017-05-10 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17923 The two displays are at least inconsistent. If displaying "50/50" instead of "53/50" is intentional and has been the behavior for a while, let's stick with that. However, if some page still shows things

[GitHub] spark issue #17932: [SPARK-20689][PYSPARK] python doctest leaking bucketed t...

2017-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17932 LGTM as targeted.

[GitHub] spark issue #17686: [SPARK-20393][Webu UI] Strengthen Spark to prevent XSS v...

2017-05-10 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17686 Merged to master

[GitHub] spark issue #17742: [Spark-11968][ML][MLLIB]Optimize MLLIB ALS recommendForA...

2017-05-10 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17742 BLAS3 with still keeping the output size as `n x m` rather than `n x k` results in massively more shuffle data - I don't think any solution based on exploding the intermediate data so much can be as

[GitHub] spark issue #17933: [SPARK-20588][SQL] Cache TimeZone instances per thread.

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17933 **[Test build #76747 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76747/testReport)** for PR 17933 at commit

[GitHub] spark issue #17928: [SPARK-20311][SQL] Support aliases for table value funct...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17928 **[Test build #76737 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76737/testReport)** for PR 17928 at commit

[GitHub] spark issue #17930: [SPARK-20688][SQL] correctly check analysis for scalar s...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76739/ Test PASSed. ---

[GitHub] spark issue #17930: [SPARK-20688][SQL] correctly check analysis for scalar s...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17930 Merged build finished. Test PASSed.

[GitHub] spark issue #17938: [DOCS][SQL] Document bucketing and partitioning in SQL g...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17938 **[Test build #76748 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76748/testReport)** for PR 17938 at commit

[GitHub] spark issue #17931: [SPARK-12837][SPARK-20666][CORE][FOLLOWUP] getting name ...

2017-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17931 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76741/ Test PASSed. ---

[GitHub] spark issue #17931: [SPARK-12837][SPARK-20666][CORE][FOLLOWUP] getting name ...

2017-05-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17931 **[Test build #76741 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76741/testReport)** for PR 17931 at commit
