[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-13 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r209846015 --- Diff: python/pyspark/taskcontext.py --- @@ -95,3 +95,92 @@ def getLocalProperty(self, key): Get a local property set upstream in the driver,

[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22009 **[Test build #94732 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94732/testReport)** for PR 22009 at commit [`f4f85a8`](https://github.com/apache/spark/commit/f4

[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2169/

[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-13 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22009 retest this please

[GitHub] spark issue #22098: [SPARK-24886][INFRA] Fix the testing script to increase ...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22098 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94722/ Test PASSed. ---

[GitHub] spark issue #22098: [SPARK-24886][INFRA] Fix the testing script to increase ...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22098 Merged build finished. Test PASSed.

[GitHub] spark issue #22098: [SPARK-24886][INFRA] Fix the testing script to increase ...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22098 **[Test build #94722 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94722/testReport)** for PR 22098 at commit [`bffce5e`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21889 **[Test build #94731 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94731/testReport)** for PR 21889 at commit [`51f0dc5`](https://github.com/apache/spark/commit/51

[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21889 ok to test

[GitHub] spark pull request #21889: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21889#discussion_r209839642 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -0,0 +1,200 @@ +/* + * L

[GitHub] spark issue #22082: [SPARK-24420][Build][FOLLOW-UP] Upgrade ASM6 APIs

2018-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22082 @dbtsai Nope. I did not hit any issue. :)

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler @shaneknapp Thanks for your work!

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22096 **[Test build #94730 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94730/testReport)** for PR 22096 at commit [`7de2184`](https://github.com/apache/spark/commit/7d

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Merged build finished. Test PASSed.

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2168/

[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...

2018-08-13 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22001 Just curious. It is very interesting to me, since the three most recent tries consistently caused a timeout failure in the same test. https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBu

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22096 retest this please

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Merged build finished. Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21939 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2167/

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Merged build finished. Test FAILed.

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94727/ Test FAILed. ---

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21939 > great! looking forward to seeing arrow 0.10.0 come out. @cloud-fan Arrow has already been released and the artifacts are available - sorry I should have made a post to indicate that. T

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21939 LGTM

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22096 **[Test build #94727 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94727/testReport)** for PR 22096 at commit [`7de2184`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/pro...

2018-08-13 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/22099 @srowen @budde @ajfabbri

[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/pro...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22099 **[Test build #94728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94728/testReport)** for PR 22099 at commit [`e79e5b9`](https://github.com/apache/spark/commit/e7

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21939 **[Test build #94729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94729/testReport)** for PR 21939 at commit [`ae8a6aa`](https://github.com/apache/spark/commit/ae

[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/pro...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22099 Merged build finished. Test PASSed.

[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/pro...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22099 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2166/

[GitHub] spark issue #22099: [SPARK-25111][BUILD] increment kinesis client/pro...

2018-08-13 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/22099 As noted in #22146; stripping off bouncy castle and upgrading the SDK worked. But a local test run of just this patch brought up the same error seen in #22081 ``` WithoutAggregat

[GitHub] spark pull request #22099: [SPARK-25111][BUILD] increment kinesis client/pro...

2018-08-13 Thread steveloughran
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/22099 [SPARK-25111][BUILD] increment kinesis client/producer & aws-sdk versions ## What changes were proposed in this pull request? Increment the kinesis client, producer and transient AWS

[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22001 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94716/ Test FAILed. ---

[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22001 **[Test build #94716 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94716/testReport)** for PR 22001 at commit [`9d4e232`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22001 Merged build finished. Test FAILed.

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r209835246 --- Diff: python/pyspark/taskcontext.py --- @@ -95,3 +95,92 @@ def getLocalProperty(self, key): Get a local property set upstream in the dri

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r209834646 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -180,7 +183,42 @@ private[spark] abstract class BasePythonRunner[IN,

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r209833972 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -180,7 +183,42 @@ private[spark] abstract class BasePythonRunner[IN,

[GitHub] spark issue #21146: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

2018-08-13 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/21146 And if you bump up the kinesis client and AWS SDK version to 1.11.271, those failures go away. ``` Run completed in 15 minutes, 28 seconds. Total number of tests run: 59 S

[GitHub] spark issue #22071: [SPARK-25088][CORE][MESOS][DOCS] Update Rest Server docs...

2018-08-13 Thread tnachen
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/22071 LGTM as well

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21698 I took a quick look at the shuffle writer and feel it will be hard to insert a sort there. I have a simpler proposal for the fix. To trigger this bug, there must be a shuffle before the `

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/22096 LGTM

[GitHub] spark pull request #22085: [SPARK-25095][PySpark] Python support for Barrier...

2018-08-13 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/22085#discussion_r209830941 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -180,7 +183,42 @@ private[spark] abstract class BasePythonRunner[IN, OUT

[GitHub] spark issue #21146: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

2018-08-13 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/21146 FYI, I just did a kinesis test run with this PR on a JVM with the unlimited JCE installed (explicitly verified by shasum of the relevant JARs); failure with cert errors. ``` succe

[GitHub] spark pull request #21889: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-08-13 Thread ajacques
Github user ajacques commented on a diff in the pull request: https://github.com/apache/spark/pull/21889#discussion_r209830673 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -0,0 +1,200 @@ +/* + * Lice

[GitHub] spark issue #21221: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21221 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94715/ Test PASSed. ---

[GitHub] spark issue #21221: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21221 Build finished. Test PASSed.

[GitHub] spark issue #21221: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21221 **[Test build #94715 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94715/testReport)** for PR 21221 at commit [`03cd5bc`](https://github.com/apache/spark/commit/0

[GitHub] spark pull request #20313: [SPARK-22974][ML] Attach attributes to output col...

2018-08-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20313

[GitHub] spark issue #20313: [SPARK-22974][ML] Attach attributes to output column of ...

2018-08-13 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/20313 LGTM. I think that for the transformer framework to work properly, attributes are required in `CountVector`. That being said, we should deal with the issue of allocating big attributes for sparse

[GitHub] spark issue #19633: [SPARK-22411][SQL] Disable the heuristic to calculate ma...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19633 Can one of the admins verify this patch?

[GitHub] spark pull request #22094: [SPARK-25104][SQL]Avro: Validate user specified o...

2018-08-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22094

[GitHub] spark issue #22094: [SPARK-25104][SQL]Avro: Validate user specified output s...

2018-08-13 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/22094 Merged into master. Thanks.

[GitHub] spark pull request #22094: [SPARK-25104][SQL]Avro: Validate user specified o...

2018-08-13 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/22094#discussion_r209826949 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -72,62 +73,70 @@ class AvroSerializer(rootCatalystType: DataType

[GitHub] spark issue #21819: [SPARK-24863][SS] Report Kafka offset lag as a custom me...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21819 **[Test build #94726 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94726/testReport)** for PR 21819 at commit [`ffa20ba`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #21819: [SPARK-24863][SS] Report Kafka offset lag as a custom me...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94726/ Test PASSed. ---

[GitHub] spark issue #21819: [SPARK-24863][SS] Report Kafka offset lag as a custom me...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21819 Merged build finished. Test PASSed.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21939 great! looking forward to seeing arrow 0.10.0 come out.

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22096 **[Test build #94727 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94727/testReport)** for PR 22096 at commit [`7de2184`](https://github.com/apache/spark/commit/7d

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Merged build finished. Test PASSed.

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2165/

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22096 retest this please

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Merged build finished. Test FAILed.

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94724/ Test FAILed. ---

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22096 **[Test build #94724 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94724/testReport)** for PR 22096 at commit [`7de2184`](https://github.com/apache/spark/commit/7

[GitHub] spark pull request #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22009#discussion_r209823382 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala --- @@ -76,41 +76,43 @@ object DataSourceV2Str

[GitHub] spark pull request #21977: SPARK-25004: Add spark.executor.pyspark.memory li...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21977#discussion_r209823163 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -60,14 +61,20 @@ private[spark] object PythonEvalType { */

[GitHub] spark issue #21819: [SPARK-24863][SS] Report Kafka offset lag as a custom me...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21819 **[Test build #94726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94726/testReport)** for PR 21819 at commit [`ffa20ba`](https://github.com/apache/spark/commit/ff

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-13 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/21698 @squito @tgravescs I am probably missing something about why the hash partitioner helps; can you please clarify? IIRC the partitioner for CoalescedRDD when shuffle is enabled is HashPartitioner ...

[GitHub] spark issue #21819: [SPARK-24863][SS] Report Kafka offset lag as a custom me...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21819 cc @koeninger as well

[GitHub] spark issue #21819: [SPARK-24863][SS] Report Kafka offset lag as a custom me...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21819 retest this please

[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20637#discussion_r209822349 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala --- @@ -43,25 +43,29 @@ object G

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shuffle+Re...

2018-08-13 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22079 @jiangxb1987 Here are some of the differences from the original PR - I also ported the follow up PR #20426 - I ported #20088 (for SPARK-22905) to get the tests to pass. I also p

[GitHub] spark pull request #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shu...

2018-08-13 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/22079#discussion_r209822194 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java --- @@ -0,0 +1,70 @@ +/* + * Licensed to the Ap

[GitHub] spark issue #22065: [SPARK-23992][CORE] ShuffleDependency does not need to b...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22065 Merged build finished. Test PASSed.

[GitHub] spark issue #22065: [SPARK-23992][CORE] ShuffleDependency does not need to b...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22065 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94714/ Test PASSed. ---

[GitHub] spark issue #22065: [SPARK-23992][CORE] ShuffleDependency does not need to b...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22065 **[Test build #94714 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94714/testReport)** for PR 22065 at commit [`a99769d`](https://github.com/apache/spark/commit/a

[GitHub] spark pull request #22017: [SPARK-23938][SQL] Add map_zip_with function

2018-08-13 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22017#discussion_r209820737 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala --- @@ -2238,6 +2238,70 @@ class DataFrameFunctionsSuite extends QueryTe

[GitHub] spark pull request #22017: [SPARK-23938][SQL] Add map_zip_with function

2018-08-13 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22017#discussion_r209816384 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala --- @@ -496,3 +496,195 @@ case class ArrayAggrega

[GitHub] spark pull request #22017: [SPARK-23938][SQL] Add map_zip_with function

2018-08-13 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22017#discussion_r209820348 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala --- @@ -496,3 +496,195 @@ case class ArrayAggrega

[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-08-13 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/22087 @felixcheung Test suites have been added. Thanks for reviewing!

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-13 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/21698 @tgravescs I vaguely remember someone at y! labs telling me (more than a decade back) about MR always doing a sort as part of its shuffle to avoid a variant of this problem by design. Essentiall

[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94723/ Test PASSed. ---

[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Merged build finished. Test PASSed.

[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22087 **[Test build #94723 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94723/testReport)** for PR 22087 at commit [`5fe7ed3`](https://github.com/apache/spark/commit/5

[GitHub] spark pull request #22075: [SPARK-23908][SQL][FOLLOW-UP] Rename inputs to ar...

2018-08-13 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22075#discussion_r209816692 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala --- @@ -422,45 +425,49 @@ case class ArrayExists(

[GitHub] spark issue #22038: [SPARK-25056][SQL] Unify the InConversion and BinaryComp...

2018-08-13 Thread wangyum
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/22038 **Teradata**: ![image](https://user-images.githubusercontent.com/5399861/44069251-778312cc-9fb0-11e8-8cf1-aa2e5f6b79d3.png) **Oracle**: ![image](https://user-images.githubuserconten

[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21561 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94718/ Test PASSed.

[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21561 Merged build finished. Test PASSed.

[GitHub] spark issue #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/GMM/AFT/...

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21561 **[Test build #94718 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94718/testReport)** for PR 21561 at commit [`5f403fa`](https://github.com/apache/spark/commit/5

[GitHub] spark pull request #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shu...

2018-08-13 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/22079#discussion_r209815627 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java --- @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shuffle+Re...

2018-08-13 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/22079 overall I'm in favor of backporting this, and it looks like the only changes to the original were very small, so I'm in favor of this.

[GitHub] spark issue #21698: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-13 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21698 I also think @tgravescs solution of using the HashPartitioner is an acceptable one, though as you've noted it doesn't deal w/ skew (which may be a lot of the existing use of `repartition()`). I thin

[GitHub] spark issue #21451: [SPARK-24296][CORE] Replicate large blocks as a stream.

2018-08-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21451 **[Test build #94725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94725/testReport)** for PR 21451 at commit [`c45e702`](https://github.com/apache/spark/commit/c4

[GitHub] spark issue #21451: [SPARK-24296][CORE] Replicate large blocks as a stream.

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21451 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2164/

[GitHub] spark issue #21451: [SPARK-24296][CORE] Replicate large blocks as a stream.

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21451 Merged build finished. Test PASSed.

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Merged build finished. Test PASSed.

[GitHub] spark issue #22096: [MINOR][SQL][DOC] Fix `to_json` example in function desc...

2018-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22096 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2163/

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21320 That's more work for @ajacques though on the other hand. Either way works fine.

[GitHub] spark issue #22048: [SPARK-25108][SQL] Fix the show method to display the wi...

2018-08-13 Thread xuejianbest
Github user xuejianbest commented on the issue: https://github.com/apache/spark/pull/22048 ![df.show](https://issues.apache.org/jira/secure/attachment/12935462/show.bmp)
