date:20170918

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139346721 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -117,17 +122,37 @@ private[history] class

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r13935 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -422,208 +456,101 @@ private[history] class

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139353082 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -720,19 +633,67 @@ private[history] class

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139358597 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -720,19 +633,67 @@ private[history] class

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread viirya

Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139359087 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -0,0 +1,127 @@ +/* + * Licensed to the

[GitHub] spark issue #19234: [WIP][SPARK-22010][PySpark] Change fromInternal method o...

2017-09-18 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19234 I think it was changed back in https://github.com/apache/spark/pull/7363. I think that avoids precision loss from Python's `float` limitation: ```python >>> decimal.Decimal(1 / 1e6)

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81869 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81869/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81869/ Test FAILed. ---

[GitHub] spark pull request #19230: [SPARK-22003][SQL] support array column in vector...

2017-09-18 Thread kiszk

Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19230#discussion_r139361387 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala --- @@ -0,0 +1,201 @@ +/* + * Licensed to the

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #18704: [SPARK-20783][SQL] Create ColumnVector to abstrac...

2017-09-18 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18704#discussion_r139361487 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java --- @@ -147,6 +147,11 @@ private void

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread kiszk

Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19230 Can we add test code for `null` row in a column for each type? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #18704: [SPARK-20783][SQL] Create ColumnVector to abstrac...

2017-09-18 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18704#discussion_r139362028 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/CompressibleColumnAccessor.scala --- @@ -17,8 +17,11 @@

[GitHub] spark pull request #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVer...

2017-09-18 Thread cloud-fan

GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19264 [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSuite ## What changes were proposed in this pull request? As reported in https://issues.apache.org/jira/browse/SPARK-22047 ,

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19264 **[Test build #81873 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81873/testReport)** for PR 19264 at commit

[GitHub] spark issue #19219: [SPARK-21993][SQL][WIP] Close sessionState when finish

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19219 **[Test build #81874 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81874/testReport)** for PR 19219 at commit

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread cloud-fan

Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19264 merging to master/2.2 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread cloud-fan

Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19264 cc @srowen --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19243: [SPARK-21780][R] Simpler Dataset.sample API in R

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19243 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81903/ Test PASSed. ---

[GitHub] spark issue #19243: [SPARK-21780][R] Simpler Dataset.sample API in R

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19243 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread ueshin

Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139579800 --- Diff: python/pyspark/worker.py --- @@ -71,7 +73,19 @@ def wrap_udf(f, return_type): return lambda *a: f(*a) -def

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread ueshin

Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139580569 --- Diff: python/pyspark/worker.py --- @@ -71,7 +73,19 @@ def wrap_udf(f, return_type): return lambda *a: f(*a) -def

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread ueshin

Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139583530 --- Diff: python/pyspark/sql/tests.py --- @@ -3122,6 +3122,185 @@ def test_filtered_frame(self): self.assertTrue(pdf.empty)

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81901 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81901/testReport)** for PR 19265 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81901/ Test PASSed. ---

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19130: [SPARK-21917][CORE][YARN] Supporting adding http(s) reso...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19130 **[Test build #81905 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81905/testReport)** for PR 19130 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81902 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81902/testReport)** for PR 19265 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81902/ Test PASSed. ---

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-18 Thread ueshin

Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18659 @BryanCutler I'm ok to upgrade pyarrow to 0.7 except for the same concerns as #18974. I guess we need to discuss upgrade policy and strategy of pyarrow. ---

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139585787 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,46 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139585713 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,46 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139585473 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,46 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139585376 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,46 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark pull request #19208: [SPARK-21087] [ML] CrossValidator, TrainValidatio...

2017-09-18 Thread smurching

Github user smurching commented on a diff in the pull request: https://github.com/apache/spark/pull/19208#discussion_r139586600 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala --- @@ -276,12 +315,32 @@ object TrainValidationSplitModel extends

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123

Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya Oh, I am not saying the compatibility against old version scala application. What I say is about new version `Bucketizer`, when spark user use java language(not scala language), call

[GitHub] spark issue #19247: [Spark-21996][SQL] read files with space in name for str...

2017-09-18 Thread xysun

Github user xysun commented on the issue: https://github.com/apache/spark/pull/19247 @joseph-torres @brkyvz @lw-lin can you please take a look? (sorry for uninvited mentions but i just took the latest commits on `FileStreamSource`) ---

[GitHub] spark issue #19068: [SPARK-21428][SQL][FOLLOWUP]CliSessionState should point...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19068 **[Test build #81906 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81906/testReport)** for PR 19068 at commit

[GitHub] spark pull request #16656: [SPARK-18116][DStream] Report stream input inform...

2017-09-18 Thread uncleGen

Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/16656 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-18 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r139575943 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -367,6 +368,54 @@ object SparkSubmit extends CommandLineUtils with Logging

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-18 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r139576095 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -367,6 +368,54 @@ object SparkSubmit extends CommandLineUtils with Logging

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-18 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r139576228 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -385,4 +385,13 @@ package object config { .checkValue(v =>

[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-18 Thread kiszk

Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18704 @cloud-fan Could you please review this again? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19242: [CORE][DOC]Add event log conf.

2017-09-18 Thread guoxiaolongzte

Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/19242 @srowen Help to review the code, thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-18 Thread jerryshao

Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r139576893 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -367,6 +368,54 @@ object SparkSubmit extends CommandLineUtils with Logging

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-18 Thread jerryshao

Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r139576814 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -367,6 +368,54 @@ object SparkSubmit extends CommandLineUtils with Logging

[GitHub] spark issue #19256: [SPARK-21338][SQL]implement isCascadingTruncateTable() m...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19256 **[Test build #81898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81898/testReport)** for PR 19256 at commit

[GitHub] spark issue #19256: [SPARK-21338][SQL]implement isCascadingTruncateTable() m...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19256 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-18 Thread jerryshao

Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r139577191 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -367,6 +368,54 @@ object SparkSubmit extends CommandLineUtils with Logging

[GitHub] spark issue #19256: [SPARK-21338][SQL]implement isCascadingTruncateTable() m...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19256 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81898/ Test PASSed. ---

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-18 Thread jerryshao

Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r139577257 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -385,4 +385,13 @@ package object config { .checkValue(v =>

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123

Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya Scala `with trait` is a complex mechanism and `trait` isn't equivalent to java's `interface`. Scala compiler will precompile and generate many other codes, so java-side code cannot

[GitHub] spark pull request #19196: [SPARK-21977] SinglePartition optimizations break...

2017-09-18 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19196#discussion_r139577945 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/IncrementalExecutionRulesSuite.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the

[GitHub] spark pull request #19196: [SPARK-21977] SinglePartition optimizations break...

2017-09-18 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19196#discussion_r139577898 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/IncrementalExecutionRulesSuite.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the

[GitHub] spark pull request #19196: [SPARK-21977] SinglePartition optimizations break...

2017-09-18 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19196#discussion_r139578096 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/IncrementalExecutionRulesSuite.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the

[GitHub] spark pull request #19196: [SPARK-21977] SinglePartition optimizations break...

2017-09-18 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19196#discussion_r139578010 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/IncrementalExecutionRulesSuite.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the

[GitHub] spark pull request #19196: [SPARK-21977] SinglePartition optimizations break...

2017-09-18 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19196#discussion_r139578185 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/IncrementalExecutionRulesSuite.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the

[GitHub] spark issue #19243: [SPARK-21780][R] Simpler Dataset.sample API in R

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19243 **[Test build #81903 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81903/testReport)** for PR 19243 at commit

[GitHub] spark issue #19074: [SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading ...

2017-09-18 Thread jerryshao

Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19074 @loneknightpy did you open a new JIRA about this issue? AFAIK, downloading resources to local disk is not supported for cluster mode even from beginning, would you please elaborate the

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18659 **[Test build #81899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81899/testReport)** for PR 18659 at commit

[GitHub] spark pull request #19208: [SPARK-21087] [ML] CrossValidator, TrainValidatio...

2017-09-18 Thread smurching

Github user smurching commented on a diff in the pull request: https://github.com/apache/spark/pull/19208#discussion_r139578700 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala --- @@ -276,12 +315,32 @@ object TrainValidationSplitModel extends

[GitHub] spark pull request #19208: [SPARK-21087] [ML] CrossValidator, TrainValidatio...

2017-09-18 Thread smurching

Github user smurching commented on a diff in the pull request: https://github.com/apache/spark/pull/19208#discussion_r139573779 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala --- @@ -261,17 +290,40 @@ class CrossValidatorModel private[ml] (

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81899/ Test FAILed. ---

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18659 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-18 Thread wzhfy

Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/15544 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19208: [SPARK-21087] [ML] CrossValidator, TrainValidatio...

2017-09-18 Thread smurching

Github user smurching commented on a diff in the pull request: https://github.com/apache/spark/pull/19208#discussion_r139556318 --- Diff: mllib/src/main/scala/org/apache/spark/ml/param/shared/SharedParamsCodeGen.scala --- @@ -82,7 +82,10 @@ private[shared] object

[GitHub] spark pull request #19208: [SPARK-21087] [ML] CrossValidator, TrainValidatio...

2017-09-18 Thread smurching

Github user smurching commented on a diff in the pull request: https://github.com/apache/spark/pull/19208#discussion_r139568979 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala --- @@ -237,12 +251,17 @@ object CrossValidator extends

[GitHub] spark pull request #19208: [SPARK-21087] [ML] CrossValidator, TrainValidatio...

2017-09-18 Thread smurching

Github user smurching commented on a diff in the pull request: https://github.com/apache/spark/pull/19208#discussion_r139557219 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala --- @@ -117,6 +123,12 @@ class CrossValidator @Since("1.2.0")

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15544 **[Test build #81904 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81904/testReport)** for PR 15544 at commit

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-18 Thread fjh100456

Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 Thanks for your review. @gatorsmile In the first question I mean that âparquet.compressionâ can be found in the `table: Tabledesc` (maybe similar with `catalogtable`), and can also be

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19271 **[Test build #81900 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81900/testReport)** for PR 19271 at commit

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19271 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19271: [SPARK-22053][SS] Stream-stream inner join in Append Mod...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19271 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81900/ Test FAILed. ---

[GitHub] spark issue #19208: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...

2017-09-18 Thread WeichenXu123

Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19208 @smurching Thanks! I will update later. And note that I will separate part of this PR to a new PR (the separated part will be a bugfix for #16774 ) ---

[GitHub] spark issue #18945: Add option to convert nullable int columns to float colu...

2017-09-18 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18945 gentle ping @logannc. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19266: [SPARK-22033][CORE] BufferHolder, other size chec...

2017-09-18 Thread maropu

Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19266#discussion_r139580567 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/LongArray.java --- @@ -39,7 +39,7 @@ private final long length;

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya

Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 Sorry I have to reply on a phone, so I may not write codes smoothly. What I mean it doesn't break binary compatibility, is the existing users codes using Bucketizer don't need to

[GitHub] spark issue #19210: [SPARK-22030][CORE] GraphiteSink fails to re-connect to ...

2017-09-18 Thread jerryshao

Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19210 LGTM, merging to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19266: [SPARK-22033][CORE] BufferHolder, other size chec...

2017-09-18 Thread maropu

Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19266#discussion_r139582421 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/array/LongArray.java --- @@ -39,7 +39,7 @@ private final long length;

[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18887 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81890/ Test PASSed. ---

[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18887 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19238: [SPARK-22016][SQL] Add HiveDialect for JDBC connection t...

2017-09-18 Thread danielfx90

Github user danielfx90 commented on the issue: https://github.com/apache/spark/pull/19238 Seems logical. Then, unless someone disagrees, feel free to close this PR and we will create a new spark package with this feature in a new repository. Thanks! ---

[GitHub] spark pull request #19211: [SPARK-18838][core] Add separate listener queues ...

2017-09-18 Thread tgravescs

Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/19211#discussion_r139524259 --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala --- @@ -39,20 +41,13 @@ import org.apache.spark.util.Utils * has

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81889/ Test FAILed. ---

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #81889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81889/testReport)** for PR 19222 at commit

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19211 **[Test build #81895 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81895/testReport)** for PR 19211 at commit

[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19270 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-09-18 Thread pgandhi999

Github user pgandhi999 commented on the issue: https://github.com/apache/spark/pull/19270 @ajbozarth Thank you for your comment on the previous PR. I have closed that one. Apologies for the confusion caused in the previous PR! ---

[GitHub] spark pull request #19210: [SPARK-22030][CORE] GraphiteSink fails to re-conn...

2017-09-18 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19210 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #19243: [SPARK-21780][R] Simpler Dataset.sample API in R

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19243 **[Test build #81903 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81903/testReport)** for PR 19243 at commit

[GitHub] spark issue #19074: [SPARK-21714][CORE][BACKPORT-2.2] Avoiding re-uploading ...

2017-09-18 Thread jiangxb1987

Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19074 ping @jerryshao --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #18754: [WIP][SPARK-21552][SQL] Add DecimalType support t...

2017-09-18 Thread BryanCutler

Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18754#discussion_r139562172 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala --- @@ -224,6 +226,25 @@ private[arrow] class

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread BryanCutler

Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139562519 --- Diff: python/pyspark/sql/tests.py --- @@ -3122,6 +3122,185 @@ def test_filtered_frame(self): self.assertTrue(pdf.empty)

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-18 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19194 **[Test build #81894 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81894/testReport)** for PR 19194 at commit

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread BryanCutler

Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139562988 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,46 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark issue #16774: [SPARK-19357][ML] Adding parallel model evaluation in ML...

2017-09-18 Thread BryanCutler

Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/16774 @WeichenXu123 , it would be great if you could separate out the bugfix. I looked in #19208 but couldn't find what you were referring to. ---

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya

Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 @WeichenXu123 Thanks for verifying that. Do you mean using ApproxQuantiles to compute mean and median? But I think this change is not intended to improve this part. ---

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-18 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19211 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

< 1 2 3 4 5 >

301 - 400 of 433 matches

Mail list logo