[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18945 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18945 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82065/ Test FAILed.

[GitHub] spark issue #19315: Updated english.txt word ordering

2017-09-21 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19315 @animenon can you please fix the PR title like what other PR did. Also is this only for better readability or do you fix any other issue? IMO, I found that previous txt is more readable than your

[GitHub] spark issue #18015: [SAPRK-20785][WEB-UI][SQL]Spark should provide jump link...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18015 **[Test build #82064 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82064/testReport)** for PR 18015 at commit

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread logannc
Github user logannc commented on the issue: https://github.com/apache/spark/pull/18945 I've continued to use @HyukjinKwon 's suggestion because it should be more performant and is capable of handling it without loss of precision. I believe I've addressed your concerns by only

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18945 **[Test build #82063 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82063/testReport)** for PR 18945 at commit

[GitHub] spark pull request #19301: [SPARK-22084][SQL] Fix performance regression in ...

2017-09-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19301#discussion_r140416279 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala --- @@ -72,11 +74,19 @@ object

[GitHub] spark issue #18015: [SAPRK-20785][WEB-UI][SQL]Spark should provide jump link...

2017-09-21 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18015 Yes, I'm fine with it. @ajbozarth would you please take another look on this PR? Thanks.

[GitHub] spark issue #18015: [SAPRK-20785][WEB-UI][SQL]Spark should provide jump link...

2017-09-21 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18015 Jenkins, retest this please.

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-09-21 Thread DaimonPl
Github user DaimonPl commented on the issue: https://github.com/apache/spark/pull/16578 @mallman how about adding comment explaining why such workaround was done + bug number in parquet-mr ? So in future once that bug is fixed, code can be cleaned. Also maybe it's time to

[GitHub] spark pull request #18015: [SAPRK-20785][WEB-UI][SQL]Spark should provide ju...

2017-09-21 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/18015#discussion_r140416046 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/AllExecutionsPage.scala --- @@ -61,7 +59,37 @@ private[ui] class

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18945 Merged build finished. Test FAILed.

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18945 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82062/ Test FAILed.

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18945 **[Test build #82062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82062/testReport)** for PR 18945 at commit

[GitHub] spark pull request #18945: [SPARK-21766][SQL] Convert nullable int columns t...

2017-09-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r140415073 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1761,12 +1761,37 @@ def toPandas(self): raise ImportError("%s\n%s" % (e.message,

[GitHub] spark pull request #18945: [SPARK-21766][SQL] Convert nullable int columns t...

2017-09-21 Thread logannc
Github user logannc commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r140414783 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1761,12 +1761,37 @@ def toPandas(self): raise ImportError("%s\n%s" % (e.message,

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18945 **[Test build #82062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82062/testReport)** for PR 18945 at commit

[GitHub] spark pull request #18945: [SPARK-21766][SQL] Convert nullable int columns t...

2017-09-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r140414202 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1761,12 +1761,37 @@ def toPandas(self): raise ImportError("%s\n%s" % (e.message, msg))

[GitHub] spark pull request #18945: [SPARK-21766][SQL] Convert nullable int columns t...

2017-09-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r140414042 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1761,12 +1761,37 @@ def toPandas(self): raise ImportError("%s\n%s" % (e.message, msg))

[GitHub] spark pull request #18945: [SPARK-21766][SQL] Convert nullable int columns t...

2017-09-21 Thread logannc
Github user logannc commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r140413579 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1761,12 +1761,37 @@ def toPandas(self): raise ImportError("%s\n%s" % (e.message,

[GitHub] spark pull request #19204: [SPARK-21981][PYTHON][ML] Added Python interface ...

2017-09-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19204

[GitHub] spark issue #19204: [SPARK-21981][PYTHON][ML] Added Python interface for Clu...

2017-09-21 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19204 Merged into master, thanks.

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18945 Merged build finished. Test FAILed.

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18945 **[Test build #82061 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82061/testReport)** for PR 18945 at commit

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18945 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82061/ Test FAILed.

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18945 **[Test build #82061 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82061/testReport)** for PR 18945 at commit

[GitHub] spark pull request #18945: [SPARK-21766][SQL] Convert nullable int columns t...

2017-09-21 Thread logannc
Github user logannc commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r140412857 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1761,12 +1761,37 @@ def toPandas(self): raise ImportError("%s\n%s" % (e.message,

[GitHub] spark pull request #18945: [SPARK-21766][SQL] Convert nullable int columns t...

2017-09-21 Thread logannc
Github user logannc commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r140412745 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1761,12 +1761,37 @@ def toPandas(self): raise ImportError("%s\n%s" % (e.message,

[GitHub] spark pull request #18945: [SPARK-21766][SQL] Convert nullable int columns t...

2017-09-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r140412632 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1761,12 +1761,37 @@ def toPandas(self): raise ImportError("%s\n%s" % (e.message, msg))

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18945 **[Test build #82060 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82060/testReport)** for PR 18945 at commit

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18945 Merged build finished. Test FAILed.

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18945 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82060/ Test FAILed.

[GitHub] spark issue #18945: [SPARK-21766][SQL] Convert nullable int columns to float...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18945 **[Test build #82060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82060/testReport)** for PR 18945 at commit

[GitHub] spark pull request #19229: [SPARK-22001][ML][SQL] ImputerModel can do withCo...

2017-09-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19229#discussion_r140412254 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2102,6 +2102,55 @@ class Dataset[T] private[sql]( } /**

[GitHub] spark pull request #19314: [SPARK-22094][SS]processAllAvailable should check...

2017-09-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19314

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 ping @zhengruifeng @WeichenXu123 Any more comments on this? Thanks.

[GitHub] spark issue #19314: [SPARK-22094][SS]processAllAvailable should check the qu...

2017-09-21 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19314 Thanks! Merging to master and branch-2.2

[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19290 I initially did this, for example, ``` \href{https://spark.apache.org/docs/latest/sparkr.html#data-type-mapping-between- r-and-spark}{Spark Data Types} for available data types.

[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19290 Doh, you mean the current status. Yes, I checked.

[GitHub] spark pull request #19312: [SPARK-22072][SPARK-22071][BUILD]Improve release ...

2017-09-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19312#discussion_r140410448 --- Diff: dev/create-release/release-build.sh --- @@ -95,6 +95,28 @@ if [ -z "$SPARK_VERSION" ]; then | grep -v INFO | grep -v WARNING | grep

[GitHub] spark issue #19318: [SPARK-22096][ML] use aggregateByKeyLocally in feature f...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19318 **[Test build #82059 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82059/testReport)** for PR 19318 at commit

[GitHub] spark issue #19318: [SPARK-22096][ML] use aggregateByKeyLocally in feature f...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19318 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82059/ Test FAILed.

[GitHub] spark issue #19318: [SPARK-22096][ML] use aggregateByKeyLocally in feature f...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19318 Merged build finished. Test FAILed.

[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-21 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19290 btw, could you check if haven't already, if `nolint` around the `http` link, roxygen is going to handle that correctly?

[GitHub] spark issue #19318: [SPARK-22096][ML] use aggregateByKeyLocally in feature f...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19318 **[Test build #82059 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82059/testReport)** for PR 19318 at commit

[GitHub] spark pull request #19318: [SPARK-22096][ML] use aggregateByKeyLocally in fe...

2017-09-21 Thread VinceShieh
GitHub user VinceShieh opened a pull request: https://github.com/apache/spark/pull/19318 [SPARK-22096][ML] use aggregateByKeyLocally in feature frequency calc… ## What changes were proposed in this pull request? NaiveBayes currently takes aggregateByKey followed by a

[GitHub] spark issue #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluation for...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19122 Merged build finished. Test PASSed.

[GitHub] spark issue #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluation for...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19122 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82058/ Test PASSed.

[GitHub] spark issue #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluation for...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19122 **[Test build #82058 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82058/testReport)** for PR 19122 at commit

[GitHub] spark issue #19317: [SPARK-22098][CORE] Add new method aggregateByKeyLocally...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19317 Can one of the admins verify this patch?

[GitHub] spark issue #19317: [SPARK-22098][CORE] Add new method aggregateByKeyLocally...

2017-09-21 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19317 cc @VinceShieh

[GitHub] spark pull request #19317: [SPARK-22098][CORE] Add new method aggregateByKey...

2017-09-21 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/19317 [SPARK-22098][CORE] Add new method aggregateByKeyLocally in RDD ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-22096 NaiveBayes

[GitHub] spark issue #19316: [SPARK-22097][CORE]Call serializationStream.close after ...

2017-09-21 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19316 @cloud-fan Pls take a look. Thanks a lot.

[GitHub] spark issue #19312: [SPARK-22072][SPARK-22071][BUILD]Improve release build s...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19312 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82056/ Test PASSed.

[GitHub] spark issue #19312: [SPARK-22072][SPARK-22071][BUILD]Improve release build s...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19312 Merged build finished. Test PASSed.

[GitHub] spark pull request #19316: [SPARK-22097][CORE]Call serializationStream.close...

2017-09-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19316#discussion_r140408246 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -387,11 +387,18 @@ private[spark] class MemoryStore( //

[GitHub] spark issue #19312: [SPARK-22072][SPARK-22071][BUILD]Improve release build s...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19312 **[Test build #82056 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82056/testReport)** for PR 19312 at commit

[GitHub] spark pull request #19316: [SPARK-22097][CORE]Call serializationStream.close...

2017-09-21 Thread ConeyLiu
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19316#discussion_r140408116 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -387,11 +387,18 @@ private[spark] class MemoryStore( //

[GitHub] spark issue #19316: [SPARK-22097][CORE]Call serializationStream.close after ...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19316 Can one of the admins verify this patch?

[GitHub] spark pull request #19316: [SPARK-22097][CORE]Call serializationStream.close...

2017-09-21 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/19316 [SPARK-22097][CORE]Call serializationStream.close after we requested enough memory ## What changes were proposed in this pull request? Current code, we close the

[GitHub] spark issue #19168: [SPARK-21956][CORE] Fetch up to max bytes when buf reall...

2017-09-21 Thread caneGuy
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/19168 Sorry for replying so late. I add some benchmark testing for this pr @kiszk . And @jerryshao could you help review this pr? Thanks ``` Running benchmark: Benchmark fetch before vs

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19278 Merged build finished. Test PASSed.

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19278 **[Test build #82057 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82057/testReport)** for PR 19278 at commit

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19278 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82057/ Test PASSed.

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-21 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19278 @jkbradley Sure I tested the backwards compatibility. Part of the reason I changed into `DefaultParamReader.getAndSetParams` is for backwards compatibility.

[GitHub] spark issue #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluation for...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19122 **[Test build #82058 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82058/testReport)** for PR 19122 at commit

[GitHub] spark pull request #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluat...

2017-09-21 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19122#discussion_r140402700 --- Diff: python/pyspark/ml/tests.py --- @@ -836,6 +836,27 @@ def test_save_load_simple_estimator(self): loadedModel =

[GitHub] spark issue #19315: Updated english.txt word ordering

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19315 Can one of the admins verify this patch?

[GitHub] spark pull request #19315: Updated english.txt word ordering

2017-09-21 Thread animenon
GitHub user animenon opened a pull request: https://github.com/apache/spark/pull/19315 Updated english.txt word ordering Ordered alphabetically, for better readability. ## What changes were proposed in this pull request? Alphabetical ordering of the stop words.

[GitHub] spark issue #19314: [SPARK-22094][SS]processAllAvailable should check the qu...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19314 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82055/ Test PASSed.

[GitHub] spark issue #19314: [SPARK-22094][SS]processAllAvailable should check the qu...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19314 Merged build finished. Test PASSed.

[GitHub] spark issue #19314: [SPARK-22094][SS]processAllAvailable should check the qu...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19314 **[Test build #82055 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82055/testReport)** for PR 19314 at commit

[GitHub] spark issue #13794: [SPARK-15574][ML][PySpark] Python meta-algorithms in Sca...

2017-09-21 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/13794 cc @srowen Can you help close this ? We won't need this feature for now.

[GitHub] spark issue #19279: [SPARK-22061] [ML]add pipeline model of SVM

2017-09-21 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19279 cc @srowen Can you help close this ?

[GitHub] spark issue #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidationSpli...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19278 **[Test build #82057 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82057/testReport)** for PR 19278 at commit

[GitHub] spark pull request #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidat...

2017-09-21 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19278#discussion_r140398933 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala --- @@ -159,12 +159,15 @@ class CrossValidatorSuite

[GitHub] spark pull request #19278: [SPARK-22060][ML] Fix CrossValidator/TrainValidat...

2017-09-21 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19278#discussion_r140398857 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tuning/TrainValidationSplitSuite.scala --- @@ -160,11 +160,13 @@ class TrainValidationSplitSuite

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19281 Merged build finished. Test PASSed.

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19281 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82054/ Test PASSed.

[GitHub] spark issue #19281: [SPARK-21998][SQL] SortMergeJoinExec did not calculate i...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19281 **[Test build #82054 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82054/testReport)** for PR 19281 at commit

[GitHub] spark issue #19293: [SPARK-22079][SQL] Serializer in HiveOutputWriter miss l...

2017-09-21 Thread LantaoJin
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/19293 @gatorsmile @cloud-fan Please help to review

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-21 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r140396700 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,55 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark issue #19301: [SPARK-22084][SQL] Fix performance regression in aggrega...

2017-09-21 Thread stanzhai
Github user stanzhai commented on the issue: https://github.com/apache/spark/pull/19301 @cenyuhai This is an optimize for physical plan, and your case can be optimized. ```SQL select dt, geohash_of_latlng, sum(mt_cnt), sum(ele_cnt), round(sum(mt_cnt) *

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82053/ Test PASSed.

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Merged build finished. Test PASSed.

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18659 **[Test build #82053 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82053/testReport)** for PR 18659 at commit

[GitHub] spark issue #19312: [SPARK-22072][SPARK-22071][BUILD]Improve release build s...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19312 **[Test build #82056 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82056/testReport)** for PR 19312 at commit

[GitHub] spark issue #19312: [SPARK-22072][SPARK-22071][BUILD]Improve release build s...

2017-09-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19312 retest this please

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19194 Merged build finished. Test PASSed.

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19194 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82052/ Test PASSed.

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19194 **[Test build #82052 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82052/testReport)** for PR 19194 at commit

[GitHub] spark pull request #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-09-21 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19269#discussion_r140389681 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java --- @@ -0,0 +1,71 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r140389432 --- Diff: python/pyspark/sql/functions.py --- @@ -2142,18 +2159,26 @@ def udf(f=None, returnType=StringType()): | 8| JOHN DOE|

[GitHub] spark issue #19314: [SPARK-22094][SS]processAllAvailable should check the qu...

2017-09-21 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/19314 LGTM!

[GitHub] spark issue #19314: [SPARK-22094][SS]processAllAvailable should check the qu...

2017-09-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19314 **[Test build #82055 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82055/testReport)** for PR 19314 at commit

[GitHub] spark issue #19314: [SPARK-22094][SS]processAllAvailable should check the qu...

2017-09-21 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19314 cc @marmbrus

[GitHub] spark pull request #19314: [SPARK-22094][SS]processAllAvailable should check...

2017-09-21 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/19314 [SPARK-22094][SS]processAllAvailable should check the query state ## What changes were proposed in this pull request? `processAllAvailable` should also check the query state and if the

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-09-21 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/19294 +CC @weiqingy Can you try this PR with SHC and see if it works ? That is, remove your current workaround for SPARK-21549 from SHC and try writing to hbase with a spark version patched with

[GitHub] spark pull request #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluat...

2017-09-21 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/19122#discussion_r140375759 --- Diff: python/pyspark/ml/tests.py --- @@ -836,6 +836,27 @@ def test_save_load_simple_estimator(self): loadedModel =

[GitHub] spark pull request #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluat...

2017-09-21 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/19122#discussion_r140375921 --- Diff: python/pyspark/ml/tuning.py --- @@ -14,15 +14,16 @@ # See the License for the specific language governing permissions and #
