[GitHub] spark pull request #22313: [SPARK-25306][SQL] Avoid skewed filter trees to s...

2018-09-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22313 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22192 **[Test build #95681 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95681/testReport)** for PR 22192 at commit [`5a2852f`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22313 @dongjoon-hyun please also update the title of the JIRA ticket, thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22313 thanks, merging to master! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #22319: [SPARK-25044][SQL][followup] add back UserDefinedFunctio...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22319 **[Test build #95691 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95691/testReport)** for PR 22319 at commit [`9e060a4`](https://github.com/apache/spark/commit/9e

[GitHub] spark issue #22319: [SPARK-25044][SQL][followup] add back UserDefinedFunctio...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22319 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2852/

[GitHub] spark issue #22319: [SPARK-25044][SQL][followup] add back UserDefinedFunctio...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22319 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95685/ Test PASSed. ---

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22313 **[Test build #95685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95685/testReport)** for PR 22313 at commit [`3cd4443`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-09-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/21721 BTW I think this is probably SPIP-worthy. At the very least we should write a design doc on this, similar to the other docs for dsv2 sub-components. We should really think about whether it'd be possibl

[GitHub] spark issue #22328: [SPARK-22666][ML][SQL] Spark datasource for image format

2018-09-04 Thread mhamilton723
Github user mhamilton723 commented on the issue: https://github.com/apache/spark/pull/22328 @WeichenXu123. Awesome work! I have not had a chance to go through this in depth but I did this in the originating project, [MMLSpark](www.aka.ms/spark), a while back and have been meaning to s

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22329 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2851/

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22329 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22329 **[Test build #95690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95690/testReport)** for PR 22329 at commit [`2ad350c`](https://github.com/apache/spark/commit/2a

[GitHub] spark issue #22306: [SPARK-25300][CORE]Unified the configuration parameter `...

2018-09-04 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22306 cc @gatorsmile @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: revi

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-09-04 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r215115685 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3238,28 @@ class Dataset[T] private[sql]( files.toSet.toArray

[GitHub] spark pull request #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it'...

2018-09-04 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22333#discussion_r215115678 --- Diff: build/mvn --- @@ -91,15 +92,23 @@ install_mvn() { # Install zinc under the build/ folder install_zinc() { - local zinc_path="zi

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2850/

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95683/ Test PASSed. ---

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95689/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3d

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22334 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22333 **[Test build #95683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95683/testReport)** for PR 22333 at commit [`ca99634`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95682/ Test PASSed. ---

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 **[Test build #95682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95682/testReport)** for PR 21669 at commit [`aa3779c`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22112 @tgravescs yes you are right about the problem here. Instead of asking executors to remove old committed shuffle data, I prefer #6648 , which just write new shuffle data with a different file name

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95687/ Test FAILed. ---

[GitHub] spark pull request #22324: [SPARK-25237][SQL] Remove updateBytesReadWithFile...

2018-09-04 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22324#discussion_r215111327 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -473,6 +476,27 @@ class FileBasedDataSourceSuite extends QueryTe

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95687/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #17174: [SPARK-19145][SQL] Timestamp to String casting is slowin...

2018-09-04 Thread hindog
Github user hindog commented on the issue: https://github.com/apache/spark/pull/17174 I believe another performance impact related to this may be attributed to the `cast` operator failing to match during filter-pushdown, meaning that the filter on the timestamp will NOT get pushed dow

[GitHub] spark issue #21310: [SPARK-24256][SQL] SPARK-24256: ExpressionEncoder should...

2018-09-04 Thread fangshil
Github user fangshil commented on the issue: https://github.com/apache/spark/pull/21310 To summarize our discussion in this pr: Spark-avro is now merged into Spark as a built-in data source. Upstream community is not merging the AvroEncoder to support Avro types in Dataset, inste

[GitHub] spark pull request #21310: [SPARK-24256][SQL] SPARK-24256: ExpressionEncoder...

2018-09-04 Thread fangshil
Github user fangshil closed the pull request at: https://github.com/apache/spark/pull/21310 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22324: [SPARK-25237][SQL] Remove updateBytesReadWithFileSize in...

2018-09-04 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22324 ping @srowen @HyukjinKwon --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: rev

[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...

2018-09-04 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/21638 Ideally the last test should have 50 partitions? is it because we really need the test data to be at least 50 bytes? ideally a multiple of 50, I guess. --- -

[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...

2018-09-04 Thread bomeng
Github user bomeng commented on the issue: https://github.com/apache/spark/pull/21638 Here is the test code, not sure it is right or not --- ``` test("Number of partitions") { sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local") .set(

[GitHub] spark pull request #22320: [SPARK-25313][SQL]Fix regression in FileFormatWri...

2018-09-04 Thread wangyum
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22320#discussion_r215106921 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala --- @@ -56,7 +56,7 @@ case class Inser

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22138 **[Test build #95688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95688/testReport)** for PR 22138 at commit [`9685cc5`](https://github.com/apache/spark/commit/96

[GitHub] spark issue #21306: [SPARK-24252][SQL] Add catalog registration and table ca...

2018-09-04 Thread tigerquoll
Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21306 So Kudu range partitions support arbitrary sized partition intervals, like the example below, where the first and last range partition are six months in size, but the middle partition is one year

[GitHub] spark issue #21306: [SPARK-24252][SQL] Add catalog registration and table ca...

2018-09-04 Thread tigerquoll
Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21306 Sure, I am looking at the point of view of supporting Kudu. Check out https://kudu.apache.org/docs/schema_design.html#partitioning for some of the details. In particular https://kudu.apach

[GitHub] spark issue #22332: [SPARK-25333][SQL] Ability add new columns in Dataset in...

2018-09-04 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22332 I also can't find a strong reason to append a new API in `Dataset`... btw, to add a new API there, you'd be better to discuss in jira before making a pr, I think. cc: @rxin @cloud-fan @HyukjinKwon

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95687/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3d

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2849/

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22334 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95684/ Test FAILed. ---

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95684/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #21308: [SPARK-24253][SQL] Add DeleteSupport mix-in for DataSour...

2018-09-04 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/21308 @tigerquoll, what we come up with needs to work across a variety of data sources, including those like JDBC that can delete at a lower granularity than partition. For Hive tables, the partit

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22298 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2848/ --- --

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22298 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2848/

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22298 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r215092933 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala --- @@ -88,7 +92,30 @@ private[kafka010] abstract c

[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22234 Did we introduce any behavior change in https://github.com/apache/spark/pull/21273? Does this PR resolve it? --- - To unsubsc

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22298 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2848/ ---

[GitHub] spark issue #21308: [SPARK-24253][SQL] Add DeleteSupport mix-in for DataSour...

2018-09-04 Thread tigerquoll
Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21308 I am assuming this API was intended to support the "drop partition" use-case. I'm arguing that adding and deleting partitions deal with a concept that is a slightly higher concept than just a bu

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22298 **[Test build #95686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95686/testReport)** for PR 22298 at commit [`7dc26ce`](https://github.com/apache/spark/commit/7d

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread ifilonenko
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/22298 @felixcheung @holdenk I have moved the PySpark example files to a more appropriate location. Any other comments before merge? --- --

[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22171 @vinodkc Could you answer the question from @cloud-fan ? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org Fo

[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22218 **[Test build #4331 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4331/testReport)** for PR 22218 at commit [`e72966e`](https://github.com/apache/spark/commit/

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22313 **[Test build #95685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95685/testReport)** for PR 22313 at commit [`3cd4443`](https://github.com/apache/spark/commit/3c

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2847/

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22313 Retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: rev

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22313 At this time, R failure. ``` DONE === Had test warnings or failures; see logs. ``` --- ---

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95680/ Test FAILed. ---

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22313 **[Test build #95680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95680/testReport)** for PR 22313 at commit [`3cd4443`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @zsxwing If it means code freeze for 2.4 is just around the corner then sure! We can focus on blockers for releasing 2.4, and revisit this again. Let me reflect @gaborgsomogyi review commen

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22333 Oh, I assumed that it's already dockerized. Sorry, never mind about that @shaneknapp . And, thanks! --- - To unsubscribe,

[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-09-04 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/21756 add @jerryshao for more feedback. Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comman

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2846/

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK 24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95684/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3d

[GitHub] spark pull request #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK 247...

2018-09-04 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22334 [SPARK-25336][SS]Revert SPARK-24863 and SPARK 24748 ## What changes were proposed in this pull request? Revert SPARK-24863 and SPARK 24748 as per discussion in #21721. We will revisit them

[GitHub] spark issue #20442: [SPARK-23265][ML]Update multi-column error handling logi...

2018-09-04 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/20442 Any more comments? @MLnick @jkbradley --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comman

[GitHub] spark issue #22332: [SPARK-25333][SQL] Ability add new columns in Dataset in...

2018-09-04 Thread wmellouli
Github user wmellouli commented on the issue: https://github.com/apache/spark/pull/22332 @mgaido91 Thank you for your suggestion, I updated the PR name, description and sources with a new version using a parameter `atPosition` instead of a flag `atTheEnd`. Let me know what you think a

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/22333 moving any parts of the spark build infrastructure to use docker is a big project and not happening in the next few months. --- ---

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-04 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/22112 yeah you would have to be able to handle network partitioning somehow. I don't know how difficult it is but its definitely work we may not want to do here. I was trying to clarify and make sure

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r215070653 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1513,37 +1513,34 @@ private[spark] class DAGScheduler(

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22333 **[Test build #95683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95683/testReport)** for PR 22333 at commit [`ca99634`](https://github.com/apache/spark/commit/ca

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22333 Hi, @shaneknapp and @srowen . Can we build and use the zinc-installed docker images in our build system? - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/jo

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #22145: [SPARK-25152][K8S] Enable SparkR Integration Tests for K...

2018-09-04 Thread ifilonenko
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/22145 this PR is waiting on @shaneknapp to migrate to ubuntu and have R setup in the node responsible for distribution building. This was planning on being done right after the 2.4 cut. --- ---

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2844/

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2845/

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2844/ --- --

[GitHub] spark pull request #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it'...

2018-09-04 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/22333 [SPARK-25335][BUILD] Skip Zinc downloading if it's installed in the system ## What changes were proposed in this pull request? Zinc is 23.5MB. ``` $ curl -LO https://downloads

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2844/ ---

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22112 > So in order to fix that we would need a way to tell the executors to remove that older committed shuffle data @tgravescs It is also hard to implement such a robust solution for removing

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread ifilonenko
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/21669 This PR has been tested and passed on a local cluster with an integration test that will be merged in a follow-up PR. It passes all three configuration options. It is now in a state that is ready

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 **[Test build #95682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95682/testReport)** for PR 21669 at commit [`aa3779c`](https://github.com/apache/spark/commit/aa

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22192 **[Test build #95681 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95681/testReport)** for PR 22192 at commit [`5a2852f`](https://github.com/apache/spark/commit/5a

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-04 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22192 retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: revi

[GitHub] spark pull request #21669: [SPARK-23257][K8S][WIP] Kerberos Support for Spar...

2018-09-04 Thread ifilonenko
Github user ifilonenko commented on a diff in the pull request: https://github.com/apache/spark/pull/21669#discussion_r215057079 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -164,7 +164,15 @@ private[spark] class SparkSubmit extends Logging {

[GitHub] spark pull request #22209: [SPARK-24415][Core] Fixed the aggregated stage me...

2018-09-04 Thread ankuriitg
Github user ankuriitg commented on a diff in the pull request: https://github.com/apache/spark/pull/22209#discussion_r215056633 --- Diff: core/src/test/scala/org/apache/spark/status/AppStatusListenerSuite.scala --- @@ -1190,6 +1190,61 @@ class AppStatusListenerSuite extends SparkFu

<    1   2   3   4   5   6   >