[GitHub] spark issue #18742: [Spark-21542][ML][Python]Python persistence helper funct...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18742 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80145/ Test FAILed. ---

[GitHub] spark issue #18742: [Spark-21542][ML][Python]Python persistence helper funct...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18742 **[Test build #80145 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80145/testReport)** for PR 18742 at commit

[GitHub] spark pull request #18806: [SPARK-21600] The description of "this requires s...

2017-08-01 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/18806#discussion_r130777347 --- Diff: docs/configuration.md --- @@ -1638,7 +1638,7 @@ Apart from these, the following properties are also available, and may be useful For

[GitHub] spark issue #18807: [SPARK-21601][BUILD] Modify the pom.xml file, increase t...

2017-08-01 Thread markhamstra
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/18807 These are maven-compiler-plugin configurations. We don't use maven-compiler-plugin to compile Java code: https://github.com/apache/spark/commit/74cda94c5e496e29f42f1044aab90cab7dbe9d38 --- If

[GitHub] spark issue #18780: [INFRA] Close stale PRs

2017-08-01 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18780 Please take [SPARK-21287] out. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark issue #18800: [SPARK-21330][SQL] Bad partitioning does not allow to re...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18800 Merged build finished. Test FAILed.

[GitHub] spark issue #18800: [SPARK-21330][SQL] Bad partitioning does not allow to re...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18800 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80131/ Test FAILed. ---

[GitHub] spark pull request #18804: [SPARK-21599][SQL] Collecting column statistics f...

2017-08-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18804#discussion_r130783022 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -642,8 +642,15 @@ private[spark] class

[GitHub] spark pull request #18804: [SPARK-21599][SQL] Collecting column statistics f...

2017-08-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18804#discussion_r130783003 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -642,8 +642,15 @@ private[spark] class

[GitHub] spark issue #18796: [CORE] [MINOR] Improve the error message of checkpoint R...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18796 Merged build finished. Test PASSed.

[GitHub] spark issue #18796: [CORE] [MINOR] Improve the error message of checkpoint R...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18796 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80124/ Test PASSed. ---

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18798 Merged build finished. Test PASSed.

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18798 **[Test build #80126 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80126/testReport)** for PR 18798 at commit

[GitHub] spark issue #18468: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18468 **[Test build #80128 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80128/testReport)** for PR 18468 at commit

[GitHub] spark issue #18799: [SPARK-21596][SS]Ensure places calling HDFSMetadataLog.g...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18799 **[Test build #80129 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80129/testReport)** for PR 18799 at commit

[GitHub] spark issue #18799: [SPARK-21596][SS]Ensure places calling HDFSMetadataLog.g...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18799 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80129/ Test PASSed. ---

[GitHub] spark pull request #18800: [SPARK-21330][SQL] Bad partitioning does not allo...

2017-08-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18800#discussion_r130704709 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -376,6 +385,13 @@ class JDBCSuite extends SparkFunSuite

[GitHub] spark pull request #18800: [SPARK-21330][SQL] Bad partitioning does not allo...

2017-08-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18800#discussion_r130704669 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala --- @@ -64,7 +64,8 @@ private[sql] object JDBCRelation

[GitHub] spark issue #18791: [SPARK-21571][Scheduler] Spark history server leaves inc...

2017-08-01 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/18791 Overall I like this as an option and the code looks good. Personally I use one log directory for all my logs (not just the SHS) so this wouldn't work for me, but I also run into dead files

[GitHub] spark issue #18796: [CORE] [MINOR] Improve the error message of checkpoint R...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18796 **[Test build #80124 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80124/testReport)** for PR 18796 at commit

[GitHub] spark issue #18708: [SPARK-21339] [CORE] spark-shell --packages option does ...

2017-08-01 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18708 Actually, spotted something...

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130746993 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #18749: [SPARK-21485][FOLLOWUP][SQL][DOCS] Describes examples an...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18749 **[Test build #80123 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80123/testReport)** for PR 18749 at commit

[GitHub] spark pull request #18708: [SPARK-21339] [CORE] spark-shell --packages optio...

2017-08-01 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18708

[GitHub] spark issue #18801: SPARK-10878 Fix race condition when multiple clients res...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18801 Can one of the admins verify this patch?

[GitHub] spark pull request #18803: [SPARK-21597][SS]Fix a potential overflow issue i...

2017-08-01 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/18803 [SPARK-21597][SS]Fix a potential overflow issue in EventTimeStats ## What changes were proposed in this pull request? This PR fixed a potential overflow issue in EventTimeStats.

[GitHub] spark issue #18696: [SPARK-21490][core] Make sure SparkLauncher redirects ne...

2017-08-01 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18696 I'll wait until tomorrow for a review, otherwise I'll push this to unblock further work on this code.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread wesm
Github user wesm commented on the issue: https://github.com/apache/spark/pull/18664 For item 2, in Arrow-land if the data is time zone aware, then it must be internally normalized to UTC. Conversions are therefore metadata-only operations and do not require any computation. The

[GitHub] spark issue #18800: [SPARK-21330][SQL] Bad partitioning does not allow to re...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18800 **[Test build #80130 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80130/testReport)** for PR 18800 at commit

[GitHub] spark issue #18796: [CORE] [MINOR] Improve the error message of checkpoint R...

2017-08-01 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18796 +1, LGTM.

[GitHub] spark issue #18799: [SPARK-21596][SS]Ensure places calling HDFSMetadataLog.g...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18799 Merged build finished. Test PASSed.

[GitHub] spark issue #18800: [SPARK-21330][SQL] Bad partitioning does not allow to re...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18800 **[Test build #80131 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80131/testReport)** for PR 18800 at commit

[GitHub] spark issue #18783: [SPARK-21254] [WebUI] History UI performance fixes

2017-08-01 Thread 2ooom
Github user 2ooom commented on the issue: https://github.com/apache/spark/pull/18783 Could we try to run tests and build on Jenkins?

[GitHub] spark issue #18800: [SPARK-21330][SQL] Bad partitioning does not allow to re...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18800 **[Test build #80130 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80130/testReport)** for PR 18800 at commit

[GitHub] spark issue #18695: [SPARK-12717][PYTHON] Adding thread-safe broadcast pickl...

2017-08-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18695 Thanks for clarification. LGTM

[GitHub] spark pull request #17373: [SPARK-12664] Expose probability in mlp model

2017-08-01 Thread MrBago
Github user MrBago commented on a diff in the pull request: https://github.com/apache/spark/pull/17373#discussion_r130747030 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -363,7 +363,7 @@ private[ann] trait TopologyModel extends Serializable { *

[GitHub] spark pull request #17373: [SPARK-12664] Expose probability in mlp model

2017-08-01 Thread MrBago
Github user MrBago commented on a diff in the pull request: https://github.com/apache/spark/pull/17373#discussion_r130746928 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -463,7 +479,7 @@ private[ml] class FeedForwardModel private( private var

[GitHub] spark pull request #17373: [SPARK-12664] Expose probability in mlp model

2017-08-01 Thread MrBago
Github user MrBago commented on a diff in the pull request: https://github.com/apache/spark/pull/17373#discussion_r130747996 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala --- @@ -82,6 +83,23 @@ class

[GitHub] spark pull request #17373: [SPARK-12664] Expose probability in mlp model

2017-08-01 Thread MrBago
Github user MrBago commented on a diff in the pull request: https://github.com/apache/spark/pull/17373#discussion_r130746665 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -463,7 +479,7 @@ private[ml] class FeedForwardModel private( private var

[GitHub] spark pull request #18795: Fix Java SimpleApp spark application

2017-08-01 Thread christiam
Github user christiam commented on a diff in the pull request: https://github.com/apache/spark/pull/18795#discussion_r130706349 --- Diff: docs/quick-start.md --- @@ -330,6 +331,10 @@ Note that Spark artifacts are tagged with a Scala version. Simple Project jar

[GitHub] spark issue #18468: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18468 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80128/ Test PASSed. ---

[GitHub] spark issue #18468: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18468 Merged build finished. Test PASSed.

[GitHub] spark pull request #18802: [SPARK-18535][SPARK-19720][CORE][BACKPORT-2.1] Re...

2017-08-01 Thread dmvieira
GitHub user dmvieira opened a pull request: https://github.com/apache/spark/pull/18802 [SPARK-18535][SPARK-19720][CORE][BACKPORT-2.1] Redact sensitive information ## What changes were proposed in this pull request? Backporting SPARK-18535 and SPARK-19720 to spark 2.1

[GitHub] spark pull request #18765: [SPARK-19720][CORE][BACKPORT-2.1] Redact sensitiv...

2017-08-01 Thread dmvieira
Github user dmvieira commented on a diff in the pull request: https://github.com/apache/spark/pull/18765#discussion_r130733138 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2571,6 +2572,23 @@ private[spark] object Utils extends Logging {

[GitHub] spark issue #18395: [SPARK-20655][core] In-memory KVStore implementation.

2017-08-01 Thread jsoltren
Github user jsoltren commented on the issue: https://github.com/apache/spark/pull/18395 This looks fine to me. Though, I did have to ask Marcelo a number of questions along the way to clarify what all was going on here. Maybe some of the points I mention below should make their way

[GitHub] spark issue #18803: [SPARK-21597][SS]Fix a potential overflow issue in Event...

2017-08-01 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/18803 LGTM

[GitHub] spark issue #18695: [SPARK-12717][PYTHON] Adding thread-safe broadcast pickl...

2017-08-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18695 @holdenk, BTW, it looks like I am facing the same issue you met before. It sounds like I can't trigger the Jenkins build by "ok to test". Do you maybe know who I should ask and/or some steps that I should

[GitHub] spark issue #18797: [SPARK-21523][ML] update breeze to 0.13.2 for an emergen...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18797 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80125/ Test FAILed. ---

[GitHub] spark issue #18797: [SPARK-21523][ML] update breeze to 0.13.2 for an emergen...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18797 Merged build finished. Test FAILed.

[GitHub] spark pull request #18791: [SPARK-21571][Scheduler] Spark history server lea...

2017-08-01 Thread ajbozarth
Github user ajbozarth commented on a diff in the pull request: https://github.com/apache/spark/pull/18791#discussion_r130712378 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala --- @@ -134,7 +134,8 @@ class FsHistoryProviderSuite extends

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18790 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80127/ Test PASSed. ---

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18790 **[Test build #80127 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80127/testReport)** for PR 18790 at commit

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18790 Merged build finished. Test PASSed.

[GitHub] spark issue #18783: [SPARK-21254] [WebUI] History UI performance fixes

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18783 **[Test build #80133 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80133/testReport)** for PR 18783 at commit

[GitHub] spark issue #18803: [SPARK-21597][SS]Fix a potential overflow issue in Event...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18803 **[Test build #80132 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80132/testReport)** for PR 18803 at commit

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130742759 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130742524 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130742836 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130742319 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130741880 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #18742: [Spark-21542][ML][Python]Python persistence helpe...

2017-08-01 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/18742#discussion_r130745275 --- Diff: python/pyspark/ml/util.py --- @@ -283,3 +289,124 @@ def numFeatures(self): Returns the number of features the model was trained

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130746893 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-01 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/18798 @thunterdb 1) The dataframe deserialize from binary data will add overhead, (maybe there is compaction or not, it depends on the datatype, cc @liancheng ) about 1x performance in my test.

[GitHub] spark issue #18749: [SPARK-21485][FOLLOWUP][SQL][DOCS] Describes examples an...

2017-08-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18749 @srowen, @rxin and @cloud-fan, I am confident for the compatibility concern per https://github.com/apache/spark/pull/18749#issuecomment-319364607 and believe it is ready for another look. ---

[GitHub] spark issue #18749: [SPARK-21485][FOLLOWUP][SQL][DOCS] Describes examples an...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18749 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80123/ Test PASSed. ---

[GitHub] spark issue #18749: [SPARK-21485][FOLLOWUP][SQL][DOCS] Describes examples an...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18749 Merged build finished. Test PASSed.

[GitHub] spark issue #18708: [SPARK-21339] [CORE] spark-shell --packages option does ...

2017-08-01 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18708 Merging to master / 2.2. I'll fix a style nit while merging.

[GitHub] spark issue #18745: [SPARK-21544][DEPLOY] Tests jar of some module should no...

2017-08-01 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18745 Can you add `[test-maven]` to the PR title so it builds with maven?

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130741348 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80134 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80134/testReport)** for PR 18664 at commit

[GitHub] spark issue #18797: [SPARK-21523][ML] update breeze to 0.13.2 for an emergen...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18797 **[Test build #80125 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80125/testReport)** for PR 18797 at commit

[GitHub] spark pull request #18801: SPARK-10878 Fix race condition when multiple clie...

2017-08-01 Thread Victsm
GitHub user Victsm opened a pull request: https://github.com/apache/spark/pull/18801 SPARK-10878 Fix race condition when multiple clients resolves artifacts at the same time ## What changes were proposed in this pull request? When multiple clients attempt to resolve

[GitHub] spark issue #18802: [SPARK-18535][SPARK-19720][CORE][BACKPORT-2.1] Redact se...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18802 Can one of the admins verify this patch?

[GitHub] spark issue #18783: [SPARK-21254] [WebUI] History UI performance fixes

2017-08-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18783 Jenkins test this please

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-01 Thread thunterdb
Github user thunterdb commented on the issue: https://github.com/apache/spark/pull/18798 Thank you for the performance numbers @WeichenXu123 , I have a couple of comments: - you say that SQL uses adaptive compaction. How bad is that? I assume it adds some overhead. - did

[GitHub] spark pull request #17419: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread thunterdb
Github user thunterdb closed the pull request at: https://github.com/apache/spark/pull/17419

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-01 Thread thunterdb
Github user thunterdb commented on the issue: https://github.com/apache/spark/pull/18798 cc @hvanhovell as well.

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130742933 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark issue #18695: [SPARK-12717][PYTHON] Adding thread-safe broadcast pickl...

2017-08-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18695 Thanks. Merged to master.

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130743131 --- Diff: mllib/src/test/scala/org/apache/spark/ml/stat/SummarizerSuite.scala --- @@ -0,0 +1,619 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #18695: [SPARK-12717][PYTHON] Adding thread-safe broadcas...

2017-08-01 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18695

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18798 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80126/ Test PASSed. ---

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-01 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/18798 performance data attached. cc @thunterdb @jkbradley

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 To Wes's concern, I think we are only dealing with values in UTC here; both Spark and Arrow internally represent timestamps as microseconds since the epoch. To the two issues Bryan and

[GitHub] spark issue #18800: [SPARK-21330][SQL] Bad partitioning does not allow to re...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18800 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80130/ Test PASSed. ---

[GitHub] spark issue #18800: [SPARK-21330][SQL] Bad partitioning does not allow to re...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18800 Merged build finished. Test PASSed.

[GitHub] spark issue #17419: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-01 Thread thunterdb
Github user thunterdb commented on the issue: https://github.com/apache/spark/pull/17419 I am going to close this PR, since it is being taken over by @WeichenXu123 in #18798.

[GitHub] spark pull request #18742: [Spark-21542][ML][Python]Python persistence helpe...

2017-08-01 Thread ajaysaini725
Github user ajaysaini725 commented on a diff in the pull request: https://github.com/apache/spark/pull/18742#discussion_r130741050 --- Diff: python/pyspark/ml/util.py --- @@ -283,3 +289,124 @@ def numFeatures(self): Returns the number of features the model was trained

[GitHub] spark pull request #18791: [SPARK-21571][Scheduler] Spark history server lea...

2017-08-01 Thread ericvandenbergfb
Github user ericvandenbergfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18791#discussion_r130743830 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala --- @@ -134,7 +134,8 @@ class FsHistoryProviderSuite

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-01 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r130747756 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,633 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #18796: [CORE] [MINOR] Improve the error message of checkpoint R...

2017-08-01 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/18796 LGTM

[GitHub] spark issue #18793: [SPARK-21593][DOCS] Fix 2 rendering errors on configurat...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18793 **[Test build #80121 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80121/testReport)** for PR 18793 at commit

[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #80117 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80117/testReport)** for PR 18555 at commit

[GitHub] spark issue #18783: [SPARK-21254] [WebUI] History UI performance fixes

2017-08-01 Thread highfei2011
Github user highfei2011 commented on the issue: https://github.com/apache/spark/pull/18783 It looks a lot simpler.

[GitHub] spark issue #18786: [SPARK-21584][SQL][SparkR] Update R method for summary t...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18786 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80120/ Test PASSed. ---

[GitHub] spark issue #18786: [SPARK-21584][SQL][SparkR] Update R method for summary t...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18786 **[Test build #80120 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80120/testReport)** for PR 18786 at commit

[GitHub] spark issue #18793: [SPARK-21593][DOCS] Fix 2 rendering errors on configurat...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18793 Merged build finished. Test PASSed.

[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555 Merged build finished. Test PASSed.

[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18555 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80117/ Test PASSed. ---

[GitHub] spark issue #18793: [SPARK-21593][DOCS] Fix 2 rendering errors on configurat...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18793 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80121/ Test PASSed. ---
