date:20180904

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22333 Hi, @shaneknapp and @srowen . Can we build and use the zinc-installed docker images in our build system? - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/jo

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-09-04 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r215070653 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1513,37 +1513,34 @@ private[spark] class DAGScheduler(

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-04 Thread tgravescs

Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/22112 yeah you would have to be able to handle network partitioning somehow. I don't know how difficult it is but it's definitely work we may not want to do here. I was trying to clarify and make sure

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread shaneknapp

Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/22333 moving any parts of the spark build infrastructure to use docker is a big project and not happening in the next few months.

[GitHub] spark issue #22332: [SPARK-25333][SQL] Ability add new columns in Dataset in...

2018-09-04 Thread wmellouli

Github user wmellouli commented on the issue: https://github.com/apache/spark/pull/22332 @mgaido91 Thank you for your suggestion, I updated the PR name, description and sources with a new version using a parameter `atPosition` instead of a flag `atTheEnd`. Let me know what you think a

[GitHub] spark issue #20442: [SPARK-23265][ML]Update multi-column error handling logi...

2018-09-04 Thread huaxingao

Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/20442 Any more comments? @MLnick @jkbradley --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comman

[GitHub] spark pull request #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-247...

2018-09-04 Thread zsxwing

GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22334 [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748 ## What changes were proposed in this pull request? Revert SPARK-24863 and SPARK-24748 as per discussion in #21721. We will revisit them

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95684/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3d

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2846/

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test PASSed.

[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-09-04 Thread dbtsai

Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/21756 add @jerryshao for more feedback. Thanks.

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22333 Oh, I assumed that it's already dockerized. Sorry, never mind about that @shaneknapp . And, thanks!

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-04 Thread HeartSaVioR

Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @zsxwing If it means code freeze for 2.4 is just around the corner then sure! We can focus on blockers for releasing 2.4, and revisit this again. Let me reflect @gaborgsomogyi review commen

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22313 **[Test build #95680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95680/testReport)** for PR 22313 at commit [`3cd4443`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Merged build finished. Test FAILed.

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95680/ Test FAILed.

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22313 At this time, R failure. ``` DONE === Had test warnings or failures; see logs. ```

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22313 Retest this please.

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Merged build finished. Test PASSed.

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2847/

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22313 **[Test build #95685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95685/testReport)** for PR 22313 at commit [`3cd4443`](https://github.com/apache/spark/commit/3c

[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22218 **[Test build #4331 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4331/testReport)** for PR 22218 at commit [`e72966e`](https://github.com/apache/spark/commit/

[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-09-04 Thread gatorsmile

Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22171 @vinodkc Could you answer the question from @cloud-fan ?

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread ifilonenko

Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/22298 @felixcheung @holdenk I have moved the PySpark example files to a more appropriate location. Any other comments before merge?

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22298 **[Test build #95686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95686/testReport)** for PR 22298 at commit [`7dc26ce`](https://github.com/apache/spark/commit/7d

[GitHub] spark issue #21308: [SPARK-24253][SQL] Add DeleteSupport mix-in for DataSour...

2018-09-04 Thread tigerquoll

Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21308 I am assuming this API was intended to support the "drop partition" use-case. I'm arguing that adding and deleting partitions deal with a concept that is a slightly higher concept than just a bu

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22298 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2848/

[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-04 Thread gatorsmile

Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22234 Did we introduce any behavior change in https://github.com/apache/spark/pull/21273? Does this PR resolve it?

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-04 Thread HeartSaVioR

Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r215092933 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala --- @@ -88,7 +92,30 @@ private[kafka010] abstract c

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22298 Merged build finished. Test PASSed.

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22298 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2848/

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22298 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2848/

[GitHub] spark issue #21308: [SPARK-24253][SQL] Add DeleteSupport mix-in for DataSour...

2018-09-04 Thread rdblue

Github user rdblue commented on the issue: https://github.com/apache/spark/pull/21308 @tigerquoll, what we come up with needs to work across a variety of data sources, including those like JDBC that can delete at a lower granularity than partition. For Hive tables, the partit

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95684/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test FAILed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95684/ Test FAILed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread zsxwing

Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22334 retest this please

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test PASSed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2849/

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95687/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3d

[GitHub] spark issue #22332: [SPARK-25333][SQL] Ability add new columns in Dataset in...

2018-09-04 Thread maropu

Github user maropu commented on the issue: https://github.com/apache/spark/pull/22332 I also can't find a strong reason to add a new API to `Dataset`... btw, to add a new API there, it would be better to discuss it in JIRA before opening a PR, I think. cc: @rxin @cloud-fan @HyukjinKwon

[GitHub] spark issue #21306: [SPARK-24252][SQL] Add catalog registration and table ca...

2018-09-04 Thread tigerquoll

Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21306 Sure, I am looking at this from the point of view of supporting Kudu. Check out https://kudu.apache.org/docs/schema_design.html#partitioning for some of the details. In particular https://kudu.apach

[GitHub] spark issue #21306: [SPARK-24252][SQL] Add catalog registration and table ca...

2018-09-04 Thread tigerquoll

Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21306 So Kudu range partitions support arbitrarily sized partition intervals, like the example below, where the first and last range partitions are six months in size, but the middle partition is one year

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22138 **[Test build #95688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95688/testReport)** for PR 22138 at commit [`9685cc5`](https://github.com/apache/spark/commit/96

[GitHub] spark pull request #22320: [SPARK-25313][SQL]Fix regression in FileFormatWri...

2018-09-04 Thread wangyum

Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22320#discussion_r215106921 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala --- @@ -56,7 +56,7 @@ case class Inser

[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...

2018-09-04 Thread bomeng

Github user bomeng commented on the issue: https://github.com/apache/spark/pull/21638 Here is the test code, not sure it is right or not --- ``` test("Number of partitions") { sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local") .set(

[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...

2018-09-04 Thread srowen

Github user srowen commented on the issue: https://github.com/apache/spark/pull/21638 Ideally the last test should have 50 partitions? Is it because we really need the test data to be at least 50 bytes? Ideally a multiple of 50, I guess.

[GitHub] spark issue #22324: [SPARK-25237][SQL] Remove updateBytesReadWithFileSize in...

2018-09-04 Thread maropu

Github user maropu commented on the issue: https://github.com/apache/spark/pull/22324 ping @srowen @HyukjinKwon

[GitHub] spark issue #21310: [SPARK-24256][SQL] SPARK-24256: ExpressionEncoder should...

2018-09-04 Thread fangshil

Github user fangshil commented on the issue: https://github.com/apache/spark/pull/21310 To summarize our discussion in this PR: Spark-avro is now merged into Spark as a built-in data source. The upstream community is not merging the AvroEncoder to support Avro types in Dataset, inste

[GitHub] spark pull request #21310: [SPARK-24256][SQL] SPARK-24256: ExpressionEncoder...

2018-09-04 Thread fangshil

Github user fangshil closed the pull request at: https://github.com/apache/spark/pull/21310

[GitHub] spark issue #17174: [SPARK-19145][SQL] Timestamp to String casting is slowin...

2018-09-04 Thread hindog

Github user hindog commented on the issue: https://github.com/apache/spark/pull/17174 I believe another performance impact related to this may be attributed to the `cast` operator failing to match during filter-pushdown, meaning that the filter on the timestamp will NOT get pushed dow

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95687/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3

[GitHub] spark pull request #22324: [SPARK-25237][SQL] Remove updateBytesReadWithFile...

2018-09-04 Thread srowen

Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22324#discussion_r215111327 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -473,6 +476,27 @@ class FileBasedDataSourceSuite extends QueryTe

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test FAILed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95687/ Test FAILed.

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-04 Thread cloud-fan

Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22112 @tgravescs yes you are right about the problem here. Instead of asking executors to remove old committed shuffle data, I prefer #6648, which just writes new shuffle data with a different file name

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 **[Test build #95682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95682/testReport)** for PR 21669 at commit [`aa3779c`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95682/ Test PASSed.

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Merged build finished. Test PASSed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread cloud-fan

Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22334 retest this please

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22333 **[Test build #95683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95683/testReport)** for PR 22333 at commit [`ca99634`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95689/testReport)** for PR 22334 at commit [`3d59df1`](https://github.com/apache/spark/commit/3d

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Merged build finished. Test PASSed.

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95683/ Test PASSed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2850/

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test PASSed.

[GitHub] spark pull request #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it'...

2018-09-04 Thread srowen

Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22333#discussion_r215115678 --- Diff: build/mvn --- @@ -91,15 +92,23 @@ install_mvn() { # Install zinc under the build/ folder install_zinc() { - local zinc_path="zi

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-09-04 Thread maropu

Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r215115685 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3238,28 @@ class Dataset[T] private[sql]( files.toSet.toArray

[GitHub] spark issue #22306: [SPARK-25300][CORE]Unified the configuration parameter `...

2018-09-04 Thread kiszk

Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22306 cc @gatorsmile @cloud-fan

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22329 **[Test build #95690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95690/testReport)** for PR 22329 at commit [`2ad350c`](https://github.com/apache/spark/commit/2a

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22329 Merged build finished. Test PASSed.

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22329 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2851/

[GitHub] spark issue #22328: [SPARK-22666][ML][SQL] Spark datasource for image format

2018-09-04 Thread mhamilton723

Github user mhamilton723 commented on the issue: https://github.com/apache/spark/pull/22328 @WeichenXu123. Awesome work! I have not had a chance to go through this in depth but I did this in the originating project, [MMLSpark](www.aka.ms/spark), a while back and have been meaning to s

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-09-04 Thread rxin

Github user rxin commented on the issue: https://github.com/apache/spark/pull/21721 BTW I think this is probably SPIP-worthy. At the very least we should write a design doc on this, similar to the other docs for dsv2 sub-components. We should really think about whether it'd be possibl

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22313 **[Test build #95685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95685/testReport)** for PR 22313 at commit [`3cd4443`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Merged build finished. Test PASSed.

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95685/ Test PASSed.

[GitHub] spark issue #22319: [SPARK-25044][SQL][followup] add back UserDefinedFunctio...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22319 Merged build finished. Test PASSed.

[GitHub] spark issue #22319: [SPARK-25044][SQL][followup] add back UserDefinedFunctio...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22319 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2852/

[GitHub] spark issue #22319: [SPARK-25044][SQL][followup] add back UserDefinedFunctio...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22319 **[Test build #95691 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95691/testReport)** for PR 22319 at commit [`9e060a4`](https://github.com/apache/spark/commit/9e

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread cloud-fan

Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22313 thanks, merging to master!

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread cloud-fan

Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22313 @dongjoon-hyun please also update the title of the JIRA ticket, thanks!

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22192 **[Test build #95681 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95681/testReport)** for PR 22192 at commit [`5a2852f`](https://github.com/apache/spark/commit/5

[GitHub] spark pull request #22313: [SPARK-25306][SQL] Avoid skewed filter trees to s...

2018-09-04 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22313

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22192 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95681/ Test FAILed.

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22192 Merged build finished. Test FAILed.

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22329 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95690/ Test PASSed.

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22329 **[Test build #95690 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95690/testReport)** for PR 22329 at commit [`2ad350c`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22329 Merged build finished. Test PASSed.

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22313 Thank you, @cloud-fan . Sure. I'll update them.

[GitHub] spark pull request #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive...

2018-09-04 Thread seancxmao

Github user seancxmao closed the pull request at: https://github.com/apache/spark/pull/22183

[GitHub] spark issue #22306: [SPARK-25300][CORE]Unified the configuration parameter `...

2018-09-04 Thread cloud-fan

Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22306 thanks, merging to master!

[GitHub] spark pull request #22306: [SPARK-25300][CORE]Unified the configuration para...

2018-09-04 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22306

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-09-04 Thread Dooyoung-Hwang

Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r215122865 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3238,28 @@ class Dataset[T] private[sql]( files.toSet.to

[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-04 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22234 From my understanding, yea. The problem here sounds like an ambiguity in empty strings, since they can be interpreted as empty strings and also as `null`. To me, this is actually rather a bug since

[GitHub] spark pull request #22227: [SPARK-25202] [SQL] Implements split with limit s...

2018-09-04 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/7#discussion_r215123978 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java --- @@ -394,12 +394,14 @@ public void substringSQL() {

[GitHub] spark pull request #22227: [SPARK-25202] [SQL] Implements split with limit s...

2018-09-04 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/7#discussion_r215124064 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -2546,15 +2546,39 @@ object functions { def soundex(e: Column): Colu

[GitHub] spark pull request #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it'...

2018-09-04 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22333#discussion_r215124159 --- Diff: build/mvn --- @@ -91,15 +92,23 @@ install_mvn() { # Install zinc under the build/ folder install_zinc() { - local zinc_p

[GitHub] spark issue #22329: [SPARK-25328][PYTHON] Add an example for having two colu...

2018-09-04 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22329 cc @gatorsmile and @BryanCutler

[GitHub] spark issue #22332: [SPARK-25333][SQL] Ability add new columns in Dataset in...

2018-09-04 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22332 Can't we simply `select` after the column is added? I wouldn't add this as well - it can look confusing to be honest IMO.
