[GitHub] [spark] SparkQA commented on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
SparkQA commented on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing URL: https://github.com/apache/spark/pull/25170#issuecomment-511724645 **[Test build #107729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107729/testReport)** for PR 25170 at commit [`9ec8e4b`](https://github.com/apache/spark/commit/9ec8e4b35064b3db36a4637509618775cbfd). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times
AmplabJenkins removed a comment on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times URL: https://github.com/apache/spark/pull/25164#issuecomment-511738235 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107730/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511747773 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107731/ Test FAILed.
[GitHub] [spark] beliefer commented on issue #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax
beliefer commented on issue #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax URL: https://github.com/apache/spark/pull/25001#issuecomment-511748248 > yea, I'll check in a day and just a sec. This answer is very interesting.
[GitHub] [spark] SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511747752 **[Test build #107731 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107731/testReport)** for PR 25135 at commit [`2968c2a`](https://github.com/apache/spark/commit/2968c2ad204f6a045cbc2df89a8aa581fc7cf1cf). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511753204 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107732/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25074: [SPARK-27924]Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins commented on issue #25074: [SPARK-27924]Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-511753372 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25074: [SPARK-27924]Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins commented on issue #25074: [SPARK-27924]Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-511753380 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12858/ Test PASSed.
[GitHub] [spark] gaborgsomogyi commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
gaborgsomogyi commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511757383 Created a quick fix https://github.com/apache/spark/pull/25171
[GitHub] [spark] SparkQA removed a comment on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command
SparkQA removed a comment on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command URL: https://github.com/apache/spark/pull/24759#issuecomment-511693975 **[Test build #107723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107723/testReport)** for PR 24759 at commit [`01c6655`](https://github.com/apache/spark/commit/01c6655f14c77f3d8ff02798dba070e13ed603bf).
[GitHub] [spark] SparkQA commented on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command
SparkQA commented on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command URL: https://github.com/apache/spark/pull/24759#issuecomment-511757320 **[Test build #107723 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107723/testReport)** for PR 24759 at commit [`01c6655`](https://github.com/apache/spark/commit/01c6655f14c77f3d8ff02798dba070e13ed603bf). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql
AmplabJenkins commented on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql URL: https://github.com/apache/spark/pull/24931#issuecomment-511759567 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107725/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix
AmplabJenkins removed a comment on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix URL: https://github.com/apache/spark/pull/25171#issuecomment-511758170 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql
AmplabJenkins commented on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql URL: https://github.com/apache/spark/pull/24931#issuecomment-511759560 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql
SparkQA removed a comment on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql URL: https://github.com/apache/spark/pull/24931#issuecomment-511696164 **[Test build #107725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107725/testReport)** for PR 24931 at commit [`f49cf43`](https://github.com/apache/spark/commit/f49cf432392118137a16fe5619b55dfa0520262c).
[GitHub] [spark] hvanhovell commented on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix
hvanhovell commented on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix URL: https://github.com/apache/spark/pull/25171#issuecomment-511759662 @gaborgsomogyi thanks for the PR! I will merge this as soon as compilation has finished.
[GitHub] [spark] SparkQA commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base
SparkQA commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base URL: https://github.com/apache/spark/pull/25168#issuecomment-511759751 **[Test build #107727 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107727/testReport)** for PR 25168 at commit [`ac20743`](https://github.com/apache/spark/commit/ac20743bf09d6a976f632c586da683220ff8bdf5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql
AmplabJenkins removed a comment on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql URL: https://github.com/apache/spark/pull/24931#issuecomment-511759560 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql
AmplabJenkins removed a comment on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql URL: https://github.com/apache/spark/pull/24931#issuecomment-511759567 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107725/ Test PASSed.
[GitHub] [spark] ozancicek commented on a change in pull request #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions
ozancicek commented on a change in pull request #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions
URL: https://github.com/apache/spark/pull/24939#discussion_r303864360

## File path: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala ##
@@ -362,6 +381,16 @@ class RFormulaModel private[feature](
         case _ => true
       }
       StructType(withFeatures.fields :+ StructField($(labelCol), DoubleType, nullable))
+    } else if (resolvedFormula.evalExprs.contains(resolvedFormula.label)) {
+      val spark = SparkSession.builder().getOrCreate()
+      val dummyRDD = spark.sparkContext.parallelize(Seq(Row.empty))
+      val dummyDF = spark.createDataFrame(dummyRDD, schema)
+        .withColumn(resolvedFormula.label, expr(resolvedFormula.label))
+      val nullable = dummyDF.schema(resolvedFormula.label).dataType match {

Review comment:
In order to create a schema with arbitrary types, I actually borrowed this pattern from here:
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala#L79

However, the datatype check that determines whether the field should be nullable seems unnecessary here, because only `NumericType` and `BooleanType` seem to be supported, so I modified this to return the schema without checking the datatype.
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala#L390
[GitHub] [spark] maropu commented on issue #25071: [SPARK-28292][SQL] Enable inject user-defined Hint
maropu commented on issue #25071: [SPARK-28292][SQL] Enable inject user-defined Hint URL: https://github.com/apache/spark/pull/25071#issuecomment-511816502 retest this please
[GitHub] [spark] srowen closed pull request #25154: [SPARK-28247][SS][BRANCH-2.4] Fix flaky test "query without test harness" on ContinuousSuite
srowen closed pull request #25154: [SPARK-28247][SS][BRANCH-2.4] Fix flaky test "query without test harness" on ContinuousSuite URL: https://github.com/apache/spark/pull/25154
[GitHub] [spark] maropu commented on issue #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
maropu commented on issue #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#issuecomment-511816332 retest this please
[GitHub] [spark] srowen commented on issue #25154: [SPARK-28247][SS][BRANCH-2.4] Fix flaky test "query without test harness" on ContinuousSuite
srowen commented on issue #25154: [SPARK-28247][SS][BRANCH-2.4] Fix flaky test "query without test harness" on ContinuousSuite URL: https://github.com/apache/spark/pull/25154#issuecomment-511816391 Merged to 2.4
[GitHub] [spark] gaborgsomogyi commented on issue #21861: [SPARK-24907][SQL][WIP] Migrate JDBC DataSource to JDBCDataSourceV2 Read using DataSourceV2 API
gaborgsomogyi commented on issue #21861: [SPARK-24907][SQL][WIP] Migrate JDBC DataSource to JDBCDataSourceV2 Read using DataSourceV2 API URL: https://github.com/apache/spark/pull/21861#issuecomment-511722470 @shivsood thanks for considering me as a collaborator. Since I'm mainly in the streaming area, my main intention is to implement a Structured Streaming source/sink based on the SQL implementation. Additionally, you guys have already picked this up, so I'm contributing to this effort with review comments.
[GitHub] [spark] cloud-fan commented on issue #25167: [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
cloud-fan commented on issue #25167: [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully URL: https://github.com/apache/spark/pull/25167#issuecomment-511735270 thanks, merging to master!
[GitHub] [spark] AmplabJenkins commented on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times
AmplabJenkins commented on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times URL: https://github.com/apache/spark/pull/25164#issuecomment-511735714 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12857/ Test PASSed.
[GitHub] [spark] HeartSaVioR commented on a change in pull request #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
HeartSaVioR commented on a change in pull request #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
URL: https://github.com/apache/spark/pull/25135#discussion_r303804331

## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala ##
@@ -419,6 +416,19 @@ private[kafka010] class KafkaOffsetReader(
     stopConsumer()
     _consumer = null  // will automatically get reinitialized again
   }
+
+  private def getPartitions(): ju.Set[TopicPartition] = {
+    var partitions = Set.empty[TopicPartition].asJava
+    val startTimeMs = System.currentTimeMillis()
+    while (partitions.isEmpty && System.currentTimeMillis() - startTimeMs < pollTimeoutMs) {
+      // Poll to get the latest assigned partitions
+      consumer.poll(jt.Duration.ofMillis(100))

Review comment:
Please correct me if I'm missing something here. (Not an expert on Kafka, so I may miss some details.)

It may return only after 100ms if there's no record to consume, even if metadata is ready. Here we only need metadata, but once we call poll, the request is bound to the records. To be clear, we would like to call `poll(0)` with an explicit sleep (to avoid coupling with records), but it would also be OK to let `consumer.poll` wait instead. An explicit sleep may behave better if there is a case where a record is ready to poll but metadata is not (I'm not sure this is possible).
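The alternative raised in this review comment (a non-blocking check followed by an explicit sleep, bounded by an overall timeout) can be sketched without a Kafka dependency. This is a minimal sketch, not the code in the PR: `PollUntil`, `pollUntilNonEmpty`, `sleepMs` and the caller-supplied `fetch` function are hypothetical names introduced for illustration, and `fetch` merely stands in for whatever non-blocking lookup would return the assigned partitions.

```scala
import scala.annotation.tailrec

object PollUntil {
  /**
   * Calls `fetch` repeatedly until it returns a non-empty set or `timeoutMs`
   * has elapsed. The wait between attempts is an explicit sleep, so it does
   * not depend on how long `fetch` itself blocks.
   * (Hypothetical helper; not part of Spark or Kafka.)
   */
  def pollUntilNonEmpty[A](timeoutMs: Long, sleepMs: Long)(fetch: () => Set[A]): Set[A] = {
    val deadline = System.currentTimeMillis() + timeoutMs

    @tailrec
    def loop(): Set[A] = {
      val result = fetch()
      if (result.nonEmpty || System.currentTimeMillis() >= deadline) {
        // Either we have data, or we give up and return the empty result.
        result
      } else {
        Thread.sleep(sleepMs)
        loop()
      }
    }

    loop()
  }
}
```

With this shape the caller decides how long each attempt may block and how long to back off, which is the decoupling the comment asks about; letting the poll call itself wait, as in the diff above, folds both into one duration.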
[GitHub] [spark] HeartSaVioR commented on a change in pull request #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
HeartSaVioR commented on a change in pull request #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
URL: https://github.com/apache/spark/pull/25135#discussion_r303804331

## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala ##
@@ -419,6 +416,19 @@ private[kafka010] class KafkaOffsetReader(
     stopConsumer()
     _consumer = null  // will automatically get reinitialized again
   }
+
+  private def getPartitions(): ju.Set[TopicPartition] = {
+    var partitions = Set.empty[TopicPartition].asJava
+    val startTimeMs = System.currentTimeMillis()
+    while (partitions.isEmpty && System.currentTimeMillis() - startTimeMs < pollTimeoutMs) {
+      // Poll to get the latest assigned partitions
+      consumer.poll(jt.Duration.ofMillis(100))

Review comment:
Please correct me if I'm missing something here. (Not an expert on Kafka, so I may miss some details.)

It may return only after 100ms if there's no record to consume, even if metadata is ready. Here we only need metadata, but once we call poll, the request is bound to the records. To be clear, we would like to call `poll(0)` with an explicit sleep (to avoid coupling with records), but it would also be OK to let `consumer.poll` wait instead.
[GitHub] [spark] HeartSaVioR commented on a change in pull request #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
HeartSaVioR commented on a change in pull request #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
URL: https://github.com/apache/spark/pull/25135#discussion_r303804331

## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala ##
@@ -419,6 +416,19 @@ private[kafka010] class KafkaOffsetReader(
     stopConsumer()
     _consumer = null  // will automatically get reinitialized again
   }
+
+  private def getPartitions(): ju.Set[TopicPartition] = {
+    var partitions = Set.empty[TopicPartition].asJava
+    val startTimeMs = System.currentTimeMillis()
+    while (partitions.isEmpty && System.currentTimeMillis() - startTimeMs < pollTimeoutMs) {
+      // Poll to get the latest assigned partitions
+      consumer.poll(jt.Duration.ofMillis(100))

Review comment:
Please correct me if I'm missing something here. (Not an expert on Kafka, so I may miss some details.)

It may return only after 100ms if there's no record to consume, even if metadata is ready. Here we only need metadata, but once we call poll, the request is bound to the records. To be clear, we would also like to call `poll(0)` with an explicit sleep (to avoid coupling with records), but it would also be OK to let `consumer.poll` wait instead.
[GitHub] [spark] priyankagargnitk commented on issue #25022: [SPARK-24695][SQL]: To expose Calendar interval type, so that same can be returned from the UDF.
priyankagargnitk commented on issue #25022: [SPARK-24695][SQL]: To expose Calendar interval type, so that same can be returned from the UDF. URL: https://github.com/apache/spark/pull/25022#issuecomment-511735997 @dongjoon-hyun @HyukjinKwon @rxin: All the review comments are incorporated. Please have a look now.
[GitHub] [spark] HeartSaVioR commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
HeartSaVioR commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511744146 LGTM again.
[GitHub] [spark] maropu commented on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type
maropu commented on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type
URL: https://github.com/apache/spark/pull/25137#issuecomment-511744139

Which one follows the SQL standard? IIUC the current Spark behaviour depends on the Hive one. On the other hand, PostgreSQL officially says [they follow the standard for "Implicit casting among the numeric data types"](https://www.postgresql.org/docs/11/features-sql-standard.html), and the result is:
```
postgres=# select cast(c1 * cast(-34338492.215397047 as decimal(38, 18)) as decimal(38, 18)) as c1 from spark_28348;
                 c1
-------------------------------------
 1179132047626883.596862135856320209
(1 row)

postgres=# explain verbose select cast(c1 * cast(-34338492.215397047 as decimal(38, 18)) as decimal(38, 18)) as c1 from spark_28348;
                                QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on public.spark_28348  (cost=0.00..31.00 rows=1400 width=30)
   Output: ((c1 * '-34338492.2153970470'::numeric(38,18)))::numeric(38,18)
(2 rows)
```

MySQL gives the same result:
```
mysql> select cast(c1 * cast(-34338492.215397047 as decimal(38, 18)) as decimal(38, 18)) as c1 from spark_28348;
+-------------------------------------+
| c1                                  |
+-------------------------------------+
| 1179132047626883.596862135856320209 |
+-------------------------------------+
```
[GitHub] [spark] SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511745312 **[Test build #107731 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107731/testReport)** for PR 25135 at commit [`2968c2a`](https://github.com/apache/spark/commit/2968c2ad204f6a045cbc2df89a8aa581fc7cf1cf).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511747773 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107731/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #22054: [SPARK-24703][SQL]: To add support to multiply CalendarInterval with Integral Type.
AmplabJenkins removed a comment on issue #22054: [SPARK-24703][SQL]: To add support to multiply CalendarInterval with Integral Type. URL: https://github.com/apache/spark/pull/22054#issuecomment-511750701 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107728/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql
SparkQA commented on issue #24931: [SPARK-28129][SQL][TEST] Port float8.sql URL: https://github.com/apache/spark/pull/24931#issuecomment-511758992 **[Test build #107725 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107725/testReport)** for PR 24931 at commit [`f49cf43`](https://github.com/apache/spark/commit/f49cf432392118137a16fe5619b55dfa0520262c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] gaborgsomogyi commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
gaborgsomogyi commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511771601 retest this please
[GitHub] [spark] AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511784040 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511773381 **[Test build #107735 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107735/testReport)** for PR 25135 at commit [`2968c2a`](https://github.com/apache/spark/commit/2968c2ad204f6a045cbc2df89a8aa581fc7cf1cf).
[GitHub] [spark] AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511784046 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107735/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511783995 **[Test build #107735 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107735/testReport)** for PR 25135 at commit [`2968c2a`](https://github.com/apache/spark/commit/2968c2ad204f6a045cbc2df89a8aa581fc7cf1cf). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511784040 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions
SparkQA commented on issue #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions URL: https://github.com/apache/spark/pull/24939#issuecomment-511798279 **[Test build #107736 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107736/testReport)** for PR 24939 at commit [`9b1d1e4`](https://github.com/apache/spark/commit/9b1d1e4f1ded80c18e65eaa422acbf1a356364f6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] gengliangwang edited a comment on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type
gengliangwang edited a comment on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type URL: https://github.com/apache/spark/pull/25137#issuecomment-511815571
@maropu I just checked and found that PostgreSQL and MySQL support a larger maximum precision than Spark (whose maximum precision is 38):
> The maximum allowed precision when explicitly specified in the type declaration is 1000;
https://www.postgresql.org/docs/10/datatype-numeric.html
> M is the maximum number of digits (the precision). It has a range of 1 to 65.
https://dev.mysql.com/doc/refman/5.7/en/precision-math-decimal-characteristics.html
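As a rough illustration of why these limits matter, the following Python sketch compares the digits a product can need against the three caps quoted above. The helper names are hypothetical; the only arithmetic fact used is that the product of two numbers with at most p1 and p2 digits has at most p1 + p2 digits.

```python
from decimal import Decimal

# Maximum decimal precision per system, as quoted in the comment above.
MAX_PRECISION = {"Spark": 38, "MySQL": 65, "PostgreSQL": 1000}


def significant_digits(value):
    """Number of digits stored in a Decimal value."""
    return len(value.as_tuple().digits)


def systems_that_fit(digits):
    """Names of the systems whose precision cap can hold `digits` digits."""
    return sorted(name for name, cap in MAX_PRECISION.items() if digits <= cap)


# Value reported by PostgreSQL and MySQL earlier in the thread.
result = Decimal("1179132047626883.596862135856320209")
# Worst case for the exact product of two 38-digit operands.
worst_case = 38 + 38

print(significant_digits(result), systems_that_fit(significant_digits(result)))
print(worst_case, systems_that_fit(worst_case))
```

The reported result itself needs only 34 digits, which fits within a precision of 38, while the worst-case exact product of two 38-digit operands needs up to 76 digits and fits only under the PostgreSQL limit.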
[GitHub] [spark] SparkQA commented on issue #24989: [SPARK-27959][YARN][test-hadoop3.2] Change YARN resource configs to use .amount
SparkQA commented on issue #24989: [SPARK-27959][YARN][test-hadoop3.2] Change YARN resource configs to use .amount URL: https://github.com/apache/spark/pull/24989#issuecomment-511815544 **[Test build #107738 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107738/testReport)** for PR 24989 at commit [`6e86ab9`](https://github.com/apache/spark/commit/6e86ab9b9db12a88998e54d6358d3ac6143e957a).
[GitHub] [spark] gengliangwang commented on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type
gengliangwang commented on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type URL: https://github.com/apache/spark/pull/25137#issuecomment-511815571
@maropu I just checked and found that PostgreSQL and MySQL support a larger maximum precision than Spark (whose maximum precision is 38):
> The maximum allowed precision when explicitly specified in the type declaration is 1000;
https://www.postgresql.org/docs/10/datatype-numeric.html
> M is the maximum number of digits (the precision). It has a range of 1 to 65.
https://dev.mysql.com/doc/refman/5.7/en/precision-math-decimal-characteristics.html
[GitHub] [spark] SparkQA removed a comment on issue #24989: [SPARK-27959][YARN][test-hadoop3.2] Change YARN resource configs to use .amount
SparkQA removed a comment on issue #24989: [SPARK-27959][YARN][test-hadoop3.2] Change YARN resource configs to use .amount URL: https://github.com/apache/spark/pull/24989#issuecomment-511815544 **[Test build #107738 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107738/testReport)** for PR 24989 at commit [`6e86ab9`](https://github.com/apache/spark/commit/6e86ab9b9db12a88998e54d6358d3ac6143e957a).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24989: [SPARK-27959][YARN][test-hadoop3.2] Change YARN resource configs to use .amount
AmplabJenkins removed a comment on issue #24989: [SPARK-27959][YARN][test-hadoop3.2] Change YARN resource configs to use .amount URL: https://github.com/apache/spark/pull/24989#issuecomment-511823339 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107738/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
AmplabJenkins removed a comment on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing URL: https://github.com/apache/spark/pull/25170#issuecomment-511726821 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12856/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
AmplabJenkins commented on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing URL: https://github.com/apache/spark/pull/25170#issuecomment-511726821 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12856/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
AmplabJenkins commented on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing URL: https://github.com/apache/spark/pull/25170#issuecomment-511726809 Merged build finished. Test PASSed.
[GitHub] [spark] HyukjinKwon commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base
HyukjinKwon commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base URL: https://github.com/apache/spark/pull/25168#issuecomment-511726635 Looks good, but let's get #25130 in first, just in case it causes some changes in those PRs.
[GitHub] [spark] HyukjinKwon closed pull request #24931: [SPARK-28129][SQL][TEST] Port float8.sql
HyukjinKwon closed pull request #24931: [SPARK-28129][SQL][TEST] Port float8.sql URL: https://github.com/apache/spark/pull/24931
[GitHub] [spark] srowen commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM
srowen commented on issue #25133: [SPARK-28365][ML] Fallback locale to en_US in StopWordsRemover if system default locale isn't in available locales in JVM URL: https://github.com/apache/spark/pull/25133#issuecomment-511785224 Yeah, I'm just wondering - wouldn't this cause a hundred other problems with JVM-based apps?
[GitHub] [spark] SparkQA commented on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
SparkQA commented on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing URL: https://github.com/apache/spark/pull/25170#issuecomment-511785259 **[Test build #107729 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107729/testReport)** for PR 25170 at commit [`9ec8e4b`](https://github.com/apache/spark/commit/9ec8e4b35064b3db36a4637509618775cbfd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
SparkQA removed a comment on issue #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing URL: https://github.com/apache/spark/pull/25170#issuecomment-511724645 **[Test build #107729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107729/testReport)** for PR 25170 at commit [`9ec8e4b`](https://github.com/apache/spark/commit/9ec8e4b35064b3db36a4637509618775cbfd).
[GitHub] [spark] cloud-fan commented on issue #25171: [SPARK-27485][FOLLOWUP] Do not reduce the number of partitions for repartition in adaptive execution - fix compilation
cloud-fan commented on issue #25171: [SPARK-27485][FOLLOWUP] Do not reduce the number of partitions for repartition in adaptive execution - fix compilation URL: https://github.com/apache/spark/pull/25171#issuecomment-511791231 cool, thanks for the fix!
[GitHub] [spark] maropu commented on issue #25118: [SPARK-27878][SQL] Support ARRAY(subquery) expressions
maropu commented on issue #25118: [SPARK-27878][SQL] Support ARRAY(subquery) expressions URL: https://github.com/apache/spark/pull/25118#issuecomment-511807478 Thanks. This is just my opinion, and the other developers might not think so. It's OK to keep it open for a while.
[GitHub] [spark] SparkQA commented on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array
SparkQA commented on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array URL: https://github.com/apache/spark/pull/25172#issuecomment-511809713 **[Test build #107737 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107737/testReport)** for PR 25172 at commit [`d60ff2f`](https://github.com/apache/spark/commit/d60ff2f499cecd921b5bf36d3704c83265e66ee9). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] HeartSaVioR commented on a change in pull request #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
HeartSaVioR commented on a change in pull request #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#discussion_r303792433
## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala ##
@@ -419,6 +416,19 @@ private[kafka010] class KafkaOffsetReader(
     stopConsumer()
     _consumer = null  // will automatically get reinitialized again
   }
+
+  private def getPartitions(): ju.Set[TopicPartition] = {
+    var partitions = Set.empty[TopicPartition].asJava
+    val startTimeMs = System.currentTimeMillis()
+    while (partitions.isEmpty && System.currentTimeMillis() - startTimeMs < pollTimeoutMs) {
+      // Poll to get the latest assigned partitions
+      consumer.poll(jt.Duration.ofMillis(100))

Review comment: No, you just need to start the method with the following:
```
consumer.poll(jt.Duration.ZERO)
var partitions = consumer.assignment()
```
instead of initializing `partitions` as an empty set. If the zero-timeout `poll` works and `partitions` is filled with the assignment, the loop will not be executed. Does it work for you?
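The control flow the reviewer suggests can be sketched in Python with a stand-in consumer. This is not the Kafka client API: `FakeConsumer`, `get_partitions`, and the assumption that the loop body polls and then re-reads the assignment are all hypothetical, since the quoted diff is cut off after the `poll` call.

```python
import time


class FakeConsumer:
    """Hypothetical stand-in for a consumer: partitions appear after N polls."""

    def __init__(self, assign_after_polls, partitions):
        self._assign_after = assign_after_polls
        self._partitions = set(partitions)
        self.polls = 0

    def poll(self, timeout_ms):
        # A real consumer would block for up to timeout_ms; the fake only counts calls.
        self.polls += 1

    def assignment(self):
        return set(self._partitions) if self.polls >= self._assign_after else set()


def get_partitions(consumer, poll_timeout_ms):
    """Poll once with zero timeout, then retry only while nothing is assigned."""
    consumer.poll(0)
    partitions = consumer.assignment()
    start = time.monotonic()
    while not partitions and (time.monotonic() - start) * 1000 < poll_timeout_ms:
        consumer.poll(100)  # assumed loop body: poll again, then re-read the assignment
        partitions = consumer.assignment()
    return partitions


fast = FakeConsumer(assign_after_polls=1, partitions={"topic-0"})
print(get_partitions(fast, poll_timeout_ms=1000), fast.polls)
```

When the first zero-timeout poll already yields an assignment, the loop body never runs, which is the point of seeding `partitions` from `assignment()` rather than from an empty set.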
[GitHub] [spark] cloud-fan closed pull request #25167: [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
cloud-fan closed pull request #25167: [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully URL: https://github.com/apache/spark/pull/25167
[GitHub] [spark] SparkQA commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base
SparkQA commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base URL: https://github.com/apache/spark/pull/25168#issuecomment-511740298 **[Test build #107724 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107724/testReport)** for PR 25168 at commit [`af80ac7`](https://github.com/apache/spark/commit/af80ac7c2f468d00fb2ffd040abe3f4fa0bed762). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511753199 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25074: [SPARK-27924]Support ANSI SQL Boolean-Predicate syntax
AmplabJenkins removed a comment on issue #25074: [SPARK-27924]Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#issuecomment-511753372 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511751256 **[Test build #107732 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107732/testReport)** for PR 25135 at commit [`2968c2a`](https://github.com/apache/spark/commit/2968c2ad204f6a045cbc2df89a8aa581fc7cf1cf).
[GitHub] [spark] AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511753199 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-511753177 **[Test build #107732 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107732/testReport)** for PR 25135 at commit [`2968c2a`](https://github.com/apache/spark/commit/2968c2ad204f6a045cbc2df89a8aa581fc7cf1cf).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix
AmplabJenkins commented on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix URL: https://github.com/apache/spark/pull/25171#issuecomment-511758170 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix
AmplabJenkins removed a comment on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix URL: https://github.com/apache/spark/pull/25171#issuecomment-511757995 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base
AmplabJenkins commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base URL: https://github.com/apache/spark/pull/25168#issuecomment-511760337 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base
AmplabJenkins commented on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base URL: https://github.com/apache/spark/pull/25168#issuecomment-511760343 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107727/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base
SparkQA removed a comment on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base URL: https://github.com/apache/spark/pull/25168#issuecomment-511700709 **[Test build #107727 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107727/testReport)** for PR 25168 at commit [`ac20743`](https://github.com/apache/spark/commit/ac20743bf09d6a976f632c586da683220ff8bdf5).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base
AmplabJenkins removed a comment on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base URL: https://github.com/apache/spark/pull/25168#issuecomment-511760337 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base
AmplabJenkins removed a comment on issue #25168: [SPARK-28276][SQL][PYTHON][TEST] Convert and port 'cross-join.sql' into UDF test base URL: https://github.com/apache/spark/pull/25168#issuecomment-511760343 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107727/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array
AmplabJenkins commented on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array URL: https://github.com/apache/spark/pull/25172#issuecomment-511781742 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12860/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array
AmplabJenkins commented on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array URL: https://github.com/apache/spark/pull/25172#issuecomment-511781738 Merged build finished. Test PASSed.
[GitHub] [spark] maropu edited a comment on issue #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax
maropu edited a comment on issue #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax
URL: https://github.com/apache/spark/pull/25001#issuecomment-511803526

Also, could you check the query below? Then, plz add some tests in `SQLQueryTestSuite`.
```
scala> sql("""select * from t8 where a like b escape '"' """).show
org.apache.spark.sql.catalyst.parser.ParseException:
Invalid escape string.Escape string must be empty or one character.(line 1, pos 25)

== SQL ==
select * from t8 where a like b escape '"'
-------------------------^^^
```
```
postgres=# select * from t8 where a like b escape '"';
 a | b
---+---
(0 rows)
```
[GitHub] [spark] AmplabJenkins removed a comment on issue #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
AmplabJenkins removed a comment on issue #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#issuecomment-511818135 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12863/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25071: [SPARK-28292][SQL] Enable inject user-defined Hint
AmplabJenkins removed a comment on issue #25071: [SPARK-28292][SQL] Enable inject user-defined Hint URL: https://github.com/apache/spark/pull/25071#issuecomment-511818025 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12862/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25071: [SPARK-28292][SQL] Enable inject user-defined Hint
AmplabJenkins removed a comment on issue #25071: [SPARK-28292][SQL] Enable inject user-defined Hint URL: https://github.com/apache/spark/pull/25071#issuecomment-511818016 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command
AmplabJenkins removed a comment on issue #24903: [SPARK-28084][SQL] Resolving the partition column name based on the resolver in sql load command URL: https://github.com/apache/spark/pull/24903#issuecomment-511818127 Merged build finished. Test PASSed.
[GitHub] [spark] gengliangwang commented on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type
gengliangwang commented on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type
URL: https://github.com/apache/spark/pull/25137#issuecomment-511822926

I am actually slightly -1 on this proposal.
If this is OK, then in the following case
```
scala> Seq(2147483647).toDF("c").createOrReplaceTempView("foobar")
scala> spark.sql("select cast(c*c as long) from foobar").show()
+-----------------------+
|CAST((c * c) AS BIGINT)|
+-----------------------+
|                      1|
+-----------------------+
```
we might need a similar new rule to convert the SQL statement into
```
spark.sql("select cast(c as long)*cast(c as Long) from foobar").show()
+---------------------------------------+
|(CAST(c AS BIGINT) * CAST(c AS BIGINT))|
+---------------------------------------+
|                    4611686014132420609|
+---------------------------------------+
```
I also tried PostgreSQL
```
create table t(i int);
insert into t values(2147483647);
select cast(i*i as bigint) from t;
```
And it shows the error `integer out of range`. So, for now, I don't think there is such an optimization in PostgreSQL.
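The arithmetic in the comment above can be checked without a Spark session. The following is a minimal sketch in plain Scala, not Spark's implementation; the object and method names (`IntOverflowSketch`, `castAfterMultiply`, `castBeforeMultiply`, `multiplyStrict`) are hypothetical and chosen only for illustration. It assumes that multiplying two 32-bit values and casting afterwards wraps in the same way as the `cast(c*c as long)` query shown above.

```scala
import scala.util.Try

object IntOverflowSketch {
  // Multiply in 32-bit arithmetic first, then widen: the product has already wrapped.
  def castAfterMultiply(c: Int): Long = (c * c).toLong

  // Widen each operand first, then multiply in 64-bit arithmetic.
  def castBeforeMultiply(c: Int): Long = c.toLong * c.toLong

  // java.lang.Math.multiplyExact throws ArithmeticException on Int overflow,
  // which is similar in spirit to the PostgreSQL error quoted above.
  def multiplyStrict(c: Int): Try[Int] = Try(Math.multiplyExact(c, c))

  def main(args: Array[String]): Unit = {
    val c = Int.MaxValue // 2147483647
    println(castAfterMultiply(c))
    println(castBeforeMultiply(c))
    println(multiplyStrict(c).isFailure)
  }
}
```

With `c = 2147483647`, the first method yields `1` and the second `4611686014132420609`, matching the two result tables in the comment.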
[GitHub] [spark] wangyum opened a new pull request #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
wangyum opened a new pull request #25170: [SPARK-28343][FOLLOW-UP][SQL][TEST] Enable spark.sql.function.preferIntegralDivision for PostgreSQL testing
URL: https://github.com/apache/spark/pull/25170

## What changes were proposed in this pull request?

This PR enables `spark.sql.function.preferIntegralDivision` for PostgreSQL testing.

## How was this patch tested?

N/A
[GitHub] [spark] cloud-fan commented on issue #25167: [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully
cloud-fan commented on issue #25167: [SPARK-27485] EnsureRequirements.reorder should handle duplicate expressions gracefully URL: https://github.com/apache/spark/pull/25167#issuecomment-511736945 ah we probably need to fix it in 2.4 as well, @dongjoon-hyun can you help to backport? thanks!
[GitHub] [spark] AmplabJenkins removed a comment on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times
AmplabJenkins removed a comment on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times URL: https://github.com/apache/spark/pull/25164#issuecomment-511735709 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times
AmplabJenkins removed a comment on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times URL: https://github.com/apache/spark/pull/25164#issuecomment-511735714 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12857/ Test PASSed.
[GitHub] [spark] priyankagargnitk commented on issue #22054: [SPARK-24703][SQL]: To add support to multiply CalendarInterval with Integral Type.
priyankagargnitk commented on issue #22054: [SPARK-24703][SQL]: To add support to multiply CalendarInterval with Integral Type. URL: https://github.com/apache/spark/pull/22054#issuecomment-511736337 @hvanhovell @dongjoon-hyun: All the review comments are incorporated. Please have a look now.
[GitHub] [spark] SparkQA commented on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times
SparkQA commented on issue #25164: [SPARK-28375] Prevent the PullupCorrelatedPredicates optimizer rule from removing predicates if run multiple times URL: https://github.com/apache/spark/pull/25164#issuecomment-511736419 **[Test build #107730 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107730/testReport)** for PR 25164 at commit [`c898430`](https://github.com/apache/spark/commit/c8984302ab123fb0df2a8579dd9a16e1b318547d).
[GitHub] [spark] maropu commented on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type
maropu commented on issue #25137: [SPARK-28348][SQL] Decimal precision promotion for binary arithmetic with casted decimal type URL: https://github.com/apache/spark/pull/25137#issuecomment-511744926 @cloud-fan The current behaviour comes from [DecimalType.adjustPrecisionScale](https://github.com/apache/spark/blob/d1a137602954a9c48fad8df929699eb4fd0f57ce/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala#L157) and it seems to follow the Hive result.
[GitHub] [spark] maropu commented on issue #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax
maropu commented on issue #25001: [SPARK-28083][SQL] Support LIKE ... ESCAPE syntax URL: https://github.com/apache/spark/pull/25001#issuecomment-511745351 yea, I'll check in a day and just a sec.
[GitHub] [spark] gaborgsomogyi commented on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix
gaborgsomogyi commented on issue #25171: [SPARK-27485][FOLLOWUP] Quick fix URL: https://github.com/apache/spark/pull/25171#issuecomment-511757137 cc @cloud-fan @hvanhovell
[GitHub] [spark] SparkQA commented on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command
SparkQA commented on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command URL: https://github.com/apache/spark/pull/24759#issuecomment-511763648 **[Test build #107726 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107726/testReport)** for PR 24759 at commit [`01c6655`](https://github.com/apache/spark/commit/01c6655f14c77f3d8ff02798dba070e13ed603bf).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command
SparkQA removed a comment on issue #24759: [SPARK-27395][SQL][WIP] Improve EXPLAIN command URL: https://github.com/apache/spark/pull/24759#issuecomment-511696191 **[Test build #107726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107726/testReport)** for PR 24759 at commit [`01c6655`](https://github.com/apache/spark/commit/01c6655f14c77f3d8ff02798dba070e13ed603bf).
[GitHub] [spark] AmplabJenkins commented on issue #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions
AmplabJenkins commented on issue #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions URL: https://github.com/apache/spark/pull/24939#issuecomment-511775069 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions
AmplabJenkins commented on issue #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions URL: https://github.com/apache/spark/pull/24939#issuecomment-511775076 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12859/ Test PASSed.
[GitHub] [spark] ozancicek commented on a change in pull request #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions
ozancicek commented on a change in pull request #24939: [SPARK-18569][ML][R] Support RFormula arithmetic, I() and spark functions
URL: https://github.com/apache/spark/pull/24939#discussion_r303856796

## File path: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala ##
@@ -614,3 +652,80 @@ private object VectorAttributeRewriter extends MLReadable[VectorAttributeRewrite
 }
 }
 }
+
+/**
+ * Utility transformer for adding expressions to dataframe using `expr` spark function
+ *
+ * @param exprsToSelect set of string expressions to be added as a column to the dataframe.
+ * The name of the columns will be identical to the expression
+ */
+private class ExprSelector(

Review comment:
As you suspect, having an extra hidden stage isn't really essential here. I only added it to have less coupling between the RFormula and RFormulaModel classes. This can very well be done without it. Roughly, this is what the RFormula and RFormulaModel classes are doing:

```scala
class RFormula(..., formula) {
  def fit(df) = {
    val parsedFormula = parse(formula)
    var stages = ArrayBuffer()
    val featureColumns = parsedFormula.terms.map {
      ...
      stages += OneHotEncoder()
      ...
    }
    stages += VectorAssembler(featureColumns)
    val pipeline = Pipeline(stages)
    RFormulaModel(parsedFormula, pipeline)
  }
}

class RFormulaModel(parsedFormula, pipeline) {
  def transform(df) = {
    val withFeatures = pipeline.transform(df)
    transformLabel(withFeatures)
  }
}
```

In order to assemble arithmetic expressions in a feature column with `VectorAssembler`, the dataframe that is transformed by `RFormulaModel` needs to have these columns. One way would be to simply add these transformations inside the `RFormulaModel.transform` method; another way would be to add a pipelined stage inside the `RFormula.fit` method. Seeing that all feature-column-related transformations are done in the `RFormula.fit` method, I chose to add a pipelined stage inside `RFormula.fit`.

But indeed, having a transformer for just executing a couple of `expr` functions could be too much. If you think it's unnecessary to add an extra stage, let me know and I'll move its transformations to the `RFormulaModel` class.
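The two placements discussed in the review comment above can be compared in a dependency-free toy model. This is only a sketch under stated assumptions, not the code in the pull request: a row is modelled as a `Map[String, Double]`, and `Expr`, `exprStage`, `assemble`, `viaPipeline`, and `viaModelTransform` are hypothetical names standing in for the `expr` columns, the extra hidden stage, `VectorAssembler`, and the two candidate locations for the transformation.

```scala
object RFormulaStageSketch {
  type Row = Map[String, Double]

  // Hypothetical stand-in for a string expression such as "a + b":
  // the resulting column name plus a function that computes its value.
  final case class Expr(name: String, eval: Row => Double)

  // Stand-in for the extra stage: adds one column per expression.
  def exprStage(exprs: Seq[Expr]): Row => Row =
    row => exprs.foldLeft(row)((acc, e) => acc + (e.name -> e.eval(row)))

  // Stand-in for VectorAssembler: reads the feature columns in order.
  def assemble(cols: Seq[String]): Row => Seq[Double] =
    row => cols.map(c => row(c))

  // Placement A: the expression stage is part of the fitted pipeline.
  def viaPipeline(exprs: Seq[Expr], cols: Seq[String]): Row => Seq[Double] =
    exprStage(exprs).andThen(assemble(cols))

  // Placement B: the model's transform adds the columns itself,
  // then hands the row to a pipeline that only assembles.
  def viaModelTransform(exprs: Seq[Expr], cols: Seq[String]): Row => Seq[Double] = { row =>
    val withExprs = exprs.foldLeft(row)((acc, e) => acc + (e.name -> e.eval(row)))
    assemble(cols)(withExprs)
  }
}
```

In this toy model both placements produce the same features for the same input; the difference is only where the coupling sits, which is the trade-off the comment describes.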
[GitHub] [spark] AmplabJenkins removed a comment on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array
AmplabJenkins removed a comment on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array URL: https://github.com/apache/spark/pull/25172#issuecomment-511781742 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12860/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array
SparkQA commented on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array URL: https://github.com/apache/spark/pull/25172#issuecomment-511782308 **[Test build #107737 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107737/testReport)** for PR 25172 at commit [`d60ff2f`](https://github.com/apache/spark/commit/d60ff2f499cecd921b5bf36d3704c83265e66ee9).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array
AmplabJenkins removed a comment on issue #25172: [SPARK-28412][SQL]ANSI SQL: OVERLAY function support byte array URL: https://github.com/apache/spark/pull/25172#issuecomment-511781429 Can one of the admins verify this patch?