[GitHub] spark pull request #18032: [SPARK-20806][DEPLOY] Launcher: redundant check f...

2017-05-19 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/18032 [SPARK-20806][DEPLOY] Launcher: redundant check for Spark lib dir ## What changes were proposed in this pull request? Remove redundant check for libdir in CommandBuilderUtils ##

[GitHub] spark pull request #18030: [SPARK-20798] GenerateUnsafeProjection should che...

2017-05-19 Thread ala
Github user ala commented on a diff in the pull request: https://github.com/apache/spark/pull/18030#discussion_r117433160 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala --- @@ -50,10 +50,15 @@ object

[GitHub] spark pull request #18033: Add compression/decompression of column data to C...

2017-05-19 Thread kiszk
GitHub user kiszk opened a pull request: https://github.com/apache/spark/pull/18033 Add compression/decompression of column data to ColumnVector ## What changes were proposed in this pull request? This PR adds compression/decompression of column data to `ColumnVector`.

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17936 How much of a difference does this make compared with caching the two RDDs before doing the cartesian? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark issue #17880: [SPARK-20620][TEST]Improve some unit tests for NullExpre...

2017-05-19 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/17880 I have modified `Scala style`. The test has not started; could you help trigger it? Thanks, @HyukjinKwon @gatorsmile

[GitHub] spark issue #18033: Add compression/decompression of column data to ColumnVe...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18033 Merged build finished. Test FAILed.

[GitHub] spark pull request #18034: [SPARK-20797][MLLIB]fix LocalLDAModel.save() bug.

2017-05-19 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18034#discussion_r117443669 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala --- @@ -468,7 +469,16 @@ object LocalLDAModel extends Loader[LocalLDAModel]

[GitHub] spark issue #18034: [SPARK-20797][MLLIB]fix LocalLDAModel.save() bug.

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18034 Can one of the admins verify this patch?

[GitHub] spark issue #17455: [Spark-20044][Web UI] Support Spark UI behind front-end ...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17455 **[Test build #3730 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3730/testReport)** for PR 17455 at commit

[GitHub] spark issue #18035: [MINOR][SPARKR][ML] Fix coefficients issue and code clea...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18035 **[Test build #77094 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77094/testReport)** for PR 18035 at commit

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17936 I agree with @srowen. This adds quite a lot of complexity. If there is not much difference compared with caching the RDDs before doing the cartesian (or other approaches), it may not be worth such complexity.

[GitHub] spark pull request #18031: [SPARK-20801] Record accurate size of blocks in M...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/18031#discussion_r117443085 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus { }

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 `Broadcast` first has to fetch all the blocks to the driver and cache them locally, and then the executors fetch them from the driver. I think that is really time-consuming.

[GitHub] spark issue #17868: [SPARK-20607][CORE]Add new unit tests to ShuffleSuite

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17868 **[Test build #3734 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3734/testReport)** for PR 17868 at commit

[GitHub] spark issue #17940: [SPARK-20687][MLLIB] mllib.Matrices.fromBreeze may crash...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17940 **[Test build #3733 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3733/testReport)** for PR 17940 at commit

[GitHub] spark issue #17869: [SPARK-20609][CORE]Run the SortShuffleSuite unit tests h...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17869 **[Test build #3736 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3736/testReport)** for PR 17869 at commit

[GitHub] spark issue #18013: [SPARK-20781] the location of Dockerfile in docker.prope...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18013 **[Test build #3735 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3735/testReport)** for PR 18013 at commit

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/17936 @viirya , this is slightly different from caching RDD. It is more like broadcasting, the final state is that each executor will hold the whole data of RDD2, the difference is that this is

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 Sorry for the mistake, this test result should be the cached situation: | --| -- | -- | | 15.877s | 2827.373s | 178x | | 16.781s | 2809.502s | 167x | | 16.320s |

[GitHub] spark issue #17698: [SPARK-20403][SQL]Modify the instructions of some functi...

2017-05-19 Thread 10110346
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/17698 @gatorsmile I have added test cases to the file `cast.sql` , thanks.

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17936 @jerryshao As you mentioned broadcasting, another question might be, can we just use broadcasting to achieve similar performance without such changes?

[GitHub] spark issue #17992: [SPARK-20759] SCALA_VERSION in _config.yml should be con...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17992 **[Test build #3732 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3732/testReport)** for PR 17992 at commit

[GitHub] spark pull request #18035: [MINOR][SPARKR][ML] Fix coefficients issue and co...

2017-05-19 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/18035 [MINOR][SPARKR][ML] Fix coefficients issue and code cleanup for SparkR linear SVM. ## What changes were proposed in this pull request? Fix coefficients issue and code cleanup for SparkR

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-19 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18014 @cloud-fan What do you think?

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 OK, I'll add it. From the test data, the performance improvement is still very obvious, mainly in the network and disk overhead.

[GitHub] spark issue #18030: [SPARK-20798] GenerateUnsafeProjection should check if a...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18030 **[Test build #77090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77090/testReport)** for PR 18030 at commit

[GitHub] spark pull request #18031: [SPARK-20801] Record accurate size of blocks in M...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/18031#discussion_r117440029 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus { }

[GitHub] spark pull request #18031: [SPARK-20801] Record accurate size of blocks in M...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/18031#discussion_r117440204 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -121,48 +126,69 @@ private[spark] class CompressedMapStatus( }

[GitHub] spark issue #18032: [SPARK-20806][DEPLOY] Launcher: redundant check for Spar...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18032 **[Test build #77089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77089/testReport)** for PR 18032 at commit

[GitHub] spark pull request #17936: [SPARK-20638][Core]Optimize the CartesianRDD to r...

2017-05-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17936#discussion_r117429928 --- Diff: core/src/main/scala/org/apache/spark/rdd/CartesianRDD.scala --- @@ -71,9 +72,92 @@ class CartesianRDD[T: ClassTag, U: ClassTag]( }

[GitHub] spark pull request #17936: [SPARK-20638][Core]Optimize the CartesianRDD to r...

2017-05-19 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/17936#discussion_r117432268 --- Diff: core/src/main/scala/org/apache/spark/rdd/CartesianRDD.scala --- @@ -71,9 +72,92 @@ class CartesianRDD[T: ClassTag, U: ClassTag]( }

[GitHub] spark issue #18004: [SPARK-18838][CORE] Introduce blocking strategy for Live...

2017-05-19 Thread bOOm-X
Github user bOOm-X commented on the issue: https://github.com/apache/spark/pull/18004 @markhamstra, @vanzin: Can I have a review, please?

[GitHub] spark issue #18016: [SPARK-20786][SQL]Improve ceil and floor handle the valu...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18016 Merged build finished. Test PASSed.

[GitHub] spark issue #18016: [SPARK-20786][SQL]Improve ceil and floor handle the valu...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18016 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77087/ Test PASSed.

[GitHub] spark pull request #18030: [SPARK-20798] GenerateUnsafeProjection should che...

2017-05-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18030

[GitHub] spark pull request #17936: [SPARK-20638][Core]Optimize the CartesianRDD to r...

2017-05-19 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/17936#discussion_r117427240 --- Diff: core/src/main/scala/org/apache/spark/rdd/CartesianRDD.scala --- @@ -71,9 +72,92 @@ class CartesianRDD[T: ClassTag, U: ClassTag]( }

[GitHub] spark issue #17992: [SPARK-20759] SCALA_VERSION in _config.yml should be con...

2017-05-19 Thread liu-zhaokun
Github user liu-zhaokun commented on the issue: https://github.com/apache/spark/pull/17992 @srowen Hi, do you know why this PR can't pass the test? I don't think it's my problem.

[GitHub] spark issue #17880: [SPARK-20620][TEST]Improve some unit tests for NullExpre...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17880 **[Test build #3731 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3731/testReport)** for PR 17880 at commit

[GitHub] spark pull request #18034: [SPARK-20797][MLLIB]fix LocalLDAModel.save() bug.

2017-05-19 Thread d0evi1
GitHub user d0evi1 opened a pull request: https://github.com/apache/spark/pull/18034 [SPARK-20797][MLLIB]fix LocalLDAModel.save() bug. ## What changes were proposed in this pull request? LocalLDAModel's model save function has a bug: please see:

[GitHub] spark issue #18033: Add compression/decompression of column data to ColumnVe...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18033 **[Test build #77091 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77091/testReport)** for PR 18033 at commit

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17770 **[Test build #77088 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77088/testReport)** for PR 17770 at commit

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17770 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77088/ Test PASSed.

[GitHub] spark issue #18033: [SPARK-20807][SQL] Add compression/decompression of colu...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18033 **[Test build #77092 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77092/testReport)** for PR 18033 at commit

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17770 Merged build finished. Test PASSed.

[GitHub] spark issue #18011: [SPARK-19089][SQL] Add support for nested sequences

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18011 **[Test build #77093 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77093/testReport)** for PR 18011 at commit

[GitHub] spark issue #17923: [SPARK-20591][WEB UI] Succeeded tasks num not equal in a...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17923 **[Test build #3737 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3737/testReport)** for PR 17923 at commit

[GitHub] spark pull request #18031: [SPARK-20801] Record accurate size of blocks in M...

2017-05-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18031#discussion_r117461089 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus {

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 Yeah, I think I can do the performance comparison.

[GitHub] spark issue #18033: Add compression/decompression of column data to ColumnVe...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18033 **[Test build #77091 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77091/testReport)** for PR 18033 at commit

[GitHub] spark issue #18033: Add compression/decompression of column data to ColumnVe...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18033 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77091/ Test FAILed.

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17936 @jerryshao Yeah, the reason I mentioned caching is to find out how much re-computing the RDD costs in performance. It seems to me that if re-computing is much more costly than transferring the data,

[GitHub] spark issue #18030: [SPARK-20798] GenerateUnsafeProjection should check if a...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18030 Merged build finished. Test PASSed.

[GitHub] spark issue #18030: [SPARK-20798] GenerateUnsafeProjection should check if a...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18030 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77090/ Test PASSed.

[GitHub] spark issue #18032: [SPARK-20806][DEPLOY] Launcher: redundant check for Spar...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18032 Merged build finished. Test PASSed.

[GitHub] spark issue #18030: [SPARK-20798] GenerateUnsafeProjection should check if a...

2017-05-19 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/18030 LGTM - merging to master. Thanks!

[GitHub] spark issue #17880: [SPARK-20620][TEST]Improve some unit tests for NullExpre...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17880 **[Test build #3731 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3731/testReport)** for PR 17880 at commit

[GitHub] spark pull request #18031: [SPARK-20801] Record accurate size of blocks in M...

2017-05-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18031#discussion_r117460170 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -193,8 +219,27 @@ private[spark] object HighlyCompressedMapStatus {

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/17936 I see. I think we should at least make this cache mechanism controllable by a flag. I'm guessing that in some HPC clusters or single-node clusters this problem is not so severe.

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/17936 I did not directly test this situation, but I have tested this PR against the latest `ALS` (after merging #17742). In `ALS`, both RDDs are cached, and also grouped the

[GitHub] spark issue #18016: [SPARK-20786][SQL]Improve ceil and floor handle the valu...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18016 **[Test build #77087 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77087/testReport)** for PR 18016 at commit

[GitHub] spark issue #17936: [SPARK-20638][Core]Optimize the CartesianRDD to reduce r...

2017-05-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17936 It seems it should still be better than the original cartesian, since it saves re-computing the RDD and re-transferring data?

[GitHub] spark issue #18030: [SPARK-20798] GenerateUnsafeProjection should check if a...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18030 **[Test build #77090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77090/testReport)** for PR 18030 at commit

[GitHub] spark issue #18032: [SPARK-20806][DEPLOY] Launcher: redundant check for Spar...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18032 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77089/ Test PASSed.

[GitHub] spark issue #18032: [SPARK-20806][DEPLOY] Launcher: redundant check for Spar...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18032 **[Test build #77089 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77089/testReport)** for PR 18032 at commit

[GitHub] spark issue #18023: [SPARK-12139] [SQL] REGEX Column Specification

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18023 Merged build finished. Test PASSed.

[GitHub] spark issue #18023: [SPARK-12139] [SQL] REGEX Column Specification

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18023 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77108/ Test PASSed.

[GitHub] spark issue #17996: [SPARK-20506][DOCS] 2.2 migration guide

2017-05-19 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/17996 right, I reviewed them - [this](https://github.com/apache/spark/pull/17996/files#diff-a9770b923a4959616bc2126d4afd61eaR35) in ML could also affect R -

[GitHub] spark pull request #18038: [MINOR][SPARKRSQL]Remove unnecessary comment in S...

2017-05-19 Thread lys0716
GitHub user lys0716 opened a pull request: https://github.com/apache/spark/pull/18038 [MINOR][SPARKRSQL]Remove unnecessary comment in SqlBase.g4 ## What changes were proposed in this pull request? The issue(https://github.com/antlr/antlr4/issues/781) in the comment is

[GitHub] spark issue #16697: [SPARK-19358][CORE] LiveListenerBus shall log the event ...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16697 **[Test build #77111 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77111/testReport)** for PR 16697 at commit

[GitHub] spark issue #16697: [SPARK-19358][CORE] LiveListenerBus shall log the event ...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16697 Merged build finished. Test PASSed.

[GitHub] spark issue #16697: [SPARK-19358][CORE] LiveListenerBus shall log the event ...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16697 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77111/ Test PASSed.

[GitHub] spark pull request #17723: [SPARK-20434][YARN][CORE] Move kerberos delegatio...

2017-05-19 Thread mgummelt
Github user mgummelt commented on a diff in the pull request: https://github.com/apache/spark/pull/17723#discussion_r117595679 --- Diff: core/src/main/scala/org/apache/spark/deploy/security/HadoopAccessManager.scala --- @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-19 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/17966 Sorry I've been out traveling -- I'll try to update this by tonight

[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...

2017-05-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17967#discussion_r117602233 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -38,29 +38,35 @@ import org.apache.spark.sql.types._

[GitHub] spark pull request #18039: [SPARK-20751][SQL] Add cot test in MathExpression...

2017-05-19 Thread wangyum
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/18039 [SPARK-20751][SQL] Add cot test in MathExpressionsSuite ## What changes were proposed in this pull request? Add cot test in MathExpressionsSuite as

[GitHub] spark issue #18039: [SPARK-20751][SQL] Add cot test in MathExpressionsSuite

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18039 **[Test build #77113 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77113/testReport)** for PR 18039 at commit

[GitHub] spark issue #17981: [SPARK-15767][ML][SparkR] Decision Tree wrapper in Spark...

2017-05-19 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/17981 Any more comments?

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-05-19 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/12646 Jenkins is about to shut down, we can retest this later

[GitHub] spark issue #17978: [SPARK-20736][Python] PySpark StringIndexer supports Str...

2017-05-19 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/17978 I'd hold this for another 3-4 days just in case..

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-05-19 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/12646 retest this please

[GitHub] spark issue #16697: [SPARK-19358][CORE] LiveListenerBus shall log the event ...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16697 **[Test build #77111 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77111/testReport)** for PR 16697 at commit

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17967 **[Test build #77110 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77110/testReport)** for PR 17967 at commit

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12646 **[Test build #77112 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77112/testReport)** for PR 12646 at commit

[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17966 **[Test build #77114 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77114/testReport)** for PR 17966 at commit

[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17966 Merged build finished. Test PASSed.

[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17966 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77114/ Test PASSed.

[GitHub] spark issue #18023: [SPARK-12139] [SQL] REGEX Column Specification

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18023 **[Test build #77108 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77108/testReport)** for PR 18023 at commit

[GitHub] spark pull request #17967: [SPARK-14659][ML] RFormula consistent with R when...

2017-05-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17967#discussion_r117602143 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -38,29 +38,35 @@ import org.apache.spark.sql.types._

[GitHub] spark pull request #18034: [SPARK-20797][MLLIB]fix LocalLDAModel.save() bug.

2017-05-19 Thread d0evi1
Github user d0evi1 commented on a diff in the pull request: https://github.com/apache/spark/pull/18034#discussion_r117602669 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala --- @@ -468,7 +469,16 @@ object LocalLDAModel extends Loader[LocalLDAModel]

[GitHub] spark pull request #16648: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-19 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/16648#discussion_r117602817 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -145,11 +145,85 @@ class CodegenContext {

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-19 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/17967 @yanboliang Thanks for the review and suggestion. Makes lots of sense. I made a new commit to address these.

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r117600840 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -510,6 +510,69 @@ public UTF8String trim() { } }

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r117601121 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -510,6 +510,69 @@ public UTF8String trim() { } }

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r117601355 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2015,4 +2015,121 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17967 **[Test build #77110 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77110/testReport)** for PR 17967 at commit

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r117601249 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java --- @@ -730,4 +726,62 @@ public void testToLong() throws

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r117601293 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala --- @@ -375,24 +374,61 @@ class

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r117600921 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -510,6 +510,69 @@ public UTF8String trim() { } }

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-05-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/12646#discussion_r117600707 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -510,6 +510,69 @@ public UTF8String trim() { } }

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17967 Merged build finished. Test PASSed.
