[GitHub] [spark] imback82 commented on a change in pull request #32854: [SPARK-34320][SQL] Migrate ALTER TABLE DROP COLUMNS commands to use UnresolvedTable to resolve the identifier
imback82 commented on a change in pull request #32854:
URL: https://github.com/apache/spark/pull/32854#discussion_r648877170

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ##
@@ -298,6 +298,7 @@ class Analyzer(override val catalogManager: CatalogManager)
     Batch("Post-Hoc Resolution", Once, Seq(ResolveCommandsWithIfExists) ++ postHocResolutionRules: _*),
+    Batch("Normalize Alter Table Commands", Once, ResolveAlterTableCommands),
     Batch("Normalize Alter Table", Once, ResolveAlterTableChanges),

Review comment:
We can remove `ResolveAlterTableChanges` once all the alter table commands are migrated to `ResolveAlterTableCommands`.

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ##
@@ -3540,6 +3541,32 @@ class Analyzer(override val catalogManager: CatalogManager)
     }
   }
+  /**
+   * Rule to mostly resolve, normalize and rewrite column names based on case sensitivity
+   * for alter table commands.
+   */
+  object ResolveAlterTableCommands extends Rule[LogicalPlan] {
+    def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
+      case a @ AlterTableDropColumns(r: ResolvedTable, colsToDrop) =>
+        val resolvedColsToDrop = colsToDrop.flatMap { col =>
+          resolveFieldNames(r.schema, col).orElse(Some(col))
+        }
+        a.copy(columnsToDrop = resolvedColsToDrop)
+    }
+
+    /**
+     * Returns the resolved field name if the field can be resolved, returns None if the column is
+     * not found. An error will be thrown in CheckAnalysis for columns that can't be resolved.
+     */
+    private def resolveFieldNames(

Review comment:
This is a copied/modified version of https://github.com/apache/spark/blob/cadd3a0588eeed42c6742ae1b7a2eaa85bd8a3af/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L3687

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ##
@@ -694,6 +697,24 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog {
     plan.setAnalyzed()
   }
+  /**
+   * Find the given field name in the resolved table's schema for alter table commands.
+   */
+  private def findField(

Review comment:
This is a copied/modified version of https://github.com/apache/spark/blob/cadd3a0588eeed42c6742ae1b7a2eaa85bd8a3af/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L445

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
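The column-name normalization that `ResolveAlterTableCommands` performs in the review of #32854 above can be illustrated without Spark. The following is only a sketch under stated assumptions: `Field`, `FieldResolution`, `resolveFieldName` and `normalize` are hypothetical names invented for this example, not the Analyzer's API, and the schema is modelled as a plain tree of named fields. It shows the two behaviours the diff describes: a name is rewritten to the spelling used in the schema when it resolves, and it is kept unchanged when it does not (so that a later check can report the error).

```scala
// Hypothetical, Spark-free model of a schema: a field has a name and optional nested fields.
final case class Field(name: String, children: Seq[Field] = Nil)

object FieldResolution {
  // A resolver decides whether a schema name matches a user-supplied name.
  type Resolver = (String, String) => Boolean
  val caseSensitive: Resolver = _ == _
  val caseInsensitive: Resolver = _.equalsIgnoreCase(_)

  /** Walks the name parts through nested fields and returns the schema's own spelling,
   *  or None if any part cannot be found. */
  def resolveFieldName(
      schema: Seq[Field],
      nameParts: Seq[String],
      resolver: Resolver): Option[Seq[String]] = nameParts match {
    case Seq() => Some(Nil)
    case head +: tail =>
      schema.find(f => resolver(f.name, head)).flatMap { f =>
        resolveFieldName(f.children, tail, resolver).map(f.name +: _)
      }
  }

  /** Mirrors the fallback in the diff: keep the original name when resolution fails. */
  def normalize(
      schema: Seq[Field],
      cols: Seq[Seq[String]],
      resolver: Resolver): Seq[Seq[String]] =
    cols.map(col => resolveFieldName(schema, col, resolver).getOrElse(col))
}
```

With a schema containing `id` and a nested `Point.X`, a case-insensitive `normalize` would turn `ID` into `id` and leave an unknown column such as `missing` untouched.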
[GitHub] [spark] imback82 commented on pull request #32854: [SPARK-34320][SQL] Migrate ALTER TABLE DROP COLUMNS commands to use UnresolvedTable to resolve the identifier
imback82 commented on pull request #32854: URL: https://github.com/apache/spark/pull/32854#issuecomment-858337156 @cloud-fan this is an alternative approach suggested in https://github.com/apache/spark/pull/32542#issuecomment-843759217. We can migrate the alter table commands one by one via this approach. Let me know if you want to see more examples of this for other alter table commands.
[GitHub] [spark] imback82 opened a new pull request #32855: [SPARK-34524][SQL][FOLLOWUP] Remove unused checkAlterTablePartition in CheckAnalysis.scala
imback82 opened a new pull request #32855:
URL: https://github.com/apache/spark/pull/32855

### What changes were proposed in this pull request?
#31637 removed the usage of `CheckAnalysis.checkAlterTablePartition` but didn't remove the function.

### Why are the changes needed?
To remove an unused function.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing tests.
[GitHub] [spark] SparkQA commented on pull request #32815: [SPARK-35675][SQL] EnsureRequirements remove shuffle should respect PartitioningCollection
SparkQA commented on pull request #32815: URL: https://github.com/apache/spark/pull/32815#issuecomment-858353315 **[Test build #139605 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139605/testReport)** for PR 32815 at commit [`7caddac`](https://github.com/apache/spark/commit/7caddac6702822b12778a958b87c0d65bbb00c96). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] Ngone51 commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark
Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-858375487

> The mapping could be between specific resources (e.g. PVC) and task (i.e. state store).

Your rephrase looks good except for one point here. "task (i.e. state store)"? Do you mean a task is a kind of state store? Is it a typo? I actually expect that it's a mapping between PVC and task Id.

> Actually we cannot schedule a task to specific location of statestore.

I don't understand this. I assume each statestore must be bound to a specific location. Why can't we schedule the task?

> Maybe `ResourceLocation`? It means the task prefers a location with specific resource (e.g. PVC).

`ResourceLocation` sounds too general. Maybe `RequiredResourceLocation`?
[GitHub] [spark] SparkQA commented on pull request #32857: [SPARK-35707][ML] optimize sparse GEMM by skipping bound checking
SparkQA commented on pull request #32857: URL: https://github.com/apache/spark/pull/32857#issuecomment-858386706 **[Test build #139622 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139622/testReport)** for PR 32857 at commit [`7e8f195`](https://github.com/apache/spark/commit/7e8f195fc056ce39b5032fb85706615ada1f66c4).
[GitHub] [spark] SparkQA commented on pull request #32852: [SPARK-35283][SQL] Support query some DDL with CTES
SparkQA commented on pull request #32852: URL: https://github.com/apache/spark/pull/32852#issuecomment-858386662 **[Test build #139623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139623/testReport)** for PR 32852 at commit [`dc47094`](https://github.com/apache/spark/commit/dc470948199ea55dc2db94c0108d8212e28a4e88).
[GitHub] [spark] SparkQA commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
SparkQA commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-858386822 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44148/
[GitHub] [spark] eejbyfeldt commented on pull request #32783: [SPARK-35653][SQL] Fix CatalystToExternalMap interpreted path fails for Map with case classes as keys or values
eejbyfeldt commented on pull request #32783:
URL: https://github.com/apache/spark/pull/32783#issuecomment-858394340

> `StructConverter` converts from Catalyst struct to a `Row`. Will it be a behavior change? Although it is Catalyst expression.

From my understanding this is exactly the bug being fixed. So there will be a behavior change, but the behavior is changed such that the interpreted path and the codegen path behave the same. The thinking is that the old behavior was undesired and incorrect.

My understanding of the test case failure (if the tests are added without the patch):
```
[info] - encode/decode for map with case class as value: Map(1 -> IntAndString(1,a)) (interpreted path) *** FAILED *** (64 milliseconds)
[info] Encoded/Decoded data does not match input data
[info]
[info] in: Map(1 -> IntAndString(1,a))
[info] out: Map(1 -> [1,a])
```
is that the value of the `Map` was converted to `[1,a]`, which I believe is an `InternalRow`, instead of the expected `IntAndString`. Is this the change you are referring to?

The reason I believe my change is correct is that using the `key/valueLambdaFunction` is how it is done inside `MapObjects`: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala#L820-L826

Please correct me if I am wrong about anything or have misunderstood the question, as I am new to the code base.
[GitHub] [spark] SparkQA commented on pull request #32828: [SPARK-35689][SS] Add log warn when keyWithIndexToValue returns null value
SparkQA commented on pull request #32828: URL: https://github.com/apache/spark/pull/32828#issuecomment-858394064 **[Test build #139625 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139625/testReport)** for PR 32828 at commit [`d15b60b`](https://github.com/apache/spark/commit/d15b60ba241703a3dec349b84d2f9218006e804b).
[GitHub] [spark] SparkQA commented on pull request #32470: [WIP] Simplify ResolveAggregateFunctions
SparkQA commented on pull request #32470: URL: https://github.com/apache/spark/pull/32470#issuecomment-858471292 **[Test build #139629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139629/testReport)** for PR 32470 at commit [`ae4d411`](https://github.com/apache/spark/commit/ae4d411bd68fcd9daf4db23a7a33b4fdca478528). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #32470: [WIP] Simplify ResolveAggregateFunctions
AmplabJenkins commented on pull request #32470: URL: https://github.com/apache/spark/pull/32470#issuecomment-858471325 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139629/
[GitHub] [spark] beliefer commented on pull request #32852: [SPARK-35283][SQL] Support query some DDL with CTES
beliefer commented on pull request #32852: URL: https://github.com/apache/spark/pull/32852#issuecomment-858500113 > https://dev.mysql.com/doc/refman/8.0/en/extended-show.html @wangyum There is a discussion below. https://github.com/apache/spark/pull/31548#issuecomment-814170427
[GitHub] [spark] zhouyejoe commented on a change in pull request #32007: [SPARK-33350][SHUFFLE] Add support to DiskBlockManager to create merge directory and to get the local shuffle merged data
zhouyejoe commented on a change in pull request #32007:
URL: https://github.com/apache/spark/pull/32007#discussion_r648874848

## File path: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ##
@@ -153,6 +185,82 @@ private[spark] class DiskBlockManager(conf: SparkConf, deleteFilesOnStop: Boolea
     }
   }
+  /**
+   * Get the list of configured local dirs storing merged shuffle blocks created by executors
+   * if push based shuffle is enabled. Note that the files in this directory will be created
+   * by the external shuffle services. We only create the merge_manager directories and
+   * subdirectories here because currently the external shuffle service doesn't have
+   * permission to create directories under application local directories.
+   */
+  private def createLocalDirsForMergedShuffleBlocks(): Unit = {
+    if (Utils.isPushBasedShuffleEnabled(conf)) {
+      // Will create the merge_manager directory only if it doesn't exist under the local dir.
+      Utils.getConfiguredLocalDirs(conf).foreach { rootDir =>
+        try {
+          val mergeDir = new File(rootDir, MERGE_MANAGER_DIR)
+          if (!mergeDir.exists()) {
+            // This executor does not find merge_manager directory, it will try to create
+            // the merge_manager directory and the sub directories.
+            logDebug(s"Try to create $mergeDir and its sub dirs since the " +
+              s"$MERGE_MANAGER_DIR dir does not exist")
+            for (dirNum <- 0 until subDirsPerLocalDir) {
+              val subDir = new File(mergeDir, "%02x".format(dirNum))
+              if (!subDir.exists()) {
+                // Only one container will create this directory. The filesystem will handle
+                // any race conditions.
+                createDirWithCustomizedPermission(subDir, "770")
+              }
+            }
+          }
+          logInfo(s"Merge directory and its sub dirs get created at $mergeDir")
+        } catch {
+          case e: IOException =>
+            logError(
+              s"Failed to create $MERGE_MANAGER_DIR dir in $rootDir. Ignoring this directory.", e)
+        }
+      }
+    }
+  }
+
+  /**
+   * Create a directory that is writable by the group.
+   * Grant the customized permission so the shuffle server can
+   * create subdirs/files within the merge folder.
+   * TODO: Find out why can't we create a dir using java api with permission 770
+   *  Files.createDirectories(mergeDir.toPath, PosixFilePermissions.asFileAttribute(
+   *  PosixFilePermissions.fromString("rwxrwx---")))
+   */
+  def createDirWithCustomizedPermission(dirToCreate: File, permission: String): Unit = {

Review comment:
Updated the method name and reverted to using only mkdir with permission 770.
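The TODO in the diff above asks why creating the directory through the Java API with `rwxrwx---` did not give the expected mode. One plausible explanation, stated here as an assumption and not something the thread confirms, is that permissions passed at creation time are filtered by the process umask, so a typical umask of 022 would strip the group-write bit. The sketch below works around that by creating the directory first and then setting the mode explicitly with `Files.setPosixFilePermissions`. `MergeDirs`, `createDirWithPermission` and `createSubDirs` are hypothetical names for this example only and are not the method in the PR; the code also assumes a POSIX file system.

```scala
import java.nio.file.{Files, Path}
import java.nio.file.attribute.PosixFilePermissions

object MergeDirs {
  /** Creates `dir` (and any missing parents), then sets its mode explicitly.
   *  An explicit chmod-style call is not subject to the umask. */
  def createDirWithPermission(dir: Path, perms: String): Path = {
    Files.createDirectories(dir) // no-op if the directory already exists
    Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString(perms))
    dir
  }

  /** Creates `count` sub directories named 00, 01, ... (two hex digits) under `mergeDir`,
   *  each group-writable, following the naming used in the diff. */
  def createSubDirs(mergeDir: Path, count: Int): Seq[Path] =
    (0 until count).map { n =>
      createDirWithPermission(mergeDir.resolve("%02x".format(n)), "rwxrwx---")
    }
}
```

Note that only the leaf directories get the explicit mode here; a parent created implicitly by `Files.createDirectories` keeps whatever the umask allows.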
[GitHub] [spark] imback82 opened a new pull request #32854: [SPARK-34320][SQL] Migrate ALTER TABLE DROP COLUMNS commands to use UnresolvedTable to resolve the identifier
imback82 opened a new pull request #32854:
URL: https://github.com/apache/spark/pull/32854

### What changes were proposed in this pull request?
This PR proposes to migrate the `ALTER TABLE ... DROP COLUMNS` command to use `UnresolvedTable` as a `child` to resolve the table identifier. This allows consistent resolution rules (temp view first, etc.) to be applied for both v1/v2 commands. More info about the consistent resolution rule proposal can be found in [JIRA](https://issues.apache.org/jira/browse/SPARK-29900) or [proposal doc](https://docs.google.com/document/d/1hvLjGA8y_W_hhilpngXVub1Ebv8RsMap986nENCFnrg/edit?usp=sharing).

### Why are the changes needed?
This is part of the effort to make the relation lookup behavior consistent: [SPARK-29900](https://issues.apache.org/jira/browse/SPARK-29900).

### Does this PR introduce _any_ user-facing change?
After this PR, the `ALTER TABLE ... DROP COLUMNS` command will have a consistent resolution behavior.

### How was this patch tested?
Updated existing tests.
[GitHub] [spark] zhouyejoe commented on a change in pull request #32007: [SPARK-33350][SHUFFLE] Add support to DiskBlockManager to create merge directory and to get the local shuffle merged data
zhouyejoe commented on a change in pull request #32007:
URL: https://github.com/apache/spark/pull/32007#discussion_r648874572

## File path: core/src/main/scala/org/apache/spark/util/Utils.scala ##
@@ -2566,11 +2601,28 @@ private[spark] object Utils extends Logging {
   }
   /**
-   * Push based shuffle can only be enabled when external shuffle service is enabled.
+   * Push based shuffle can only be enabled when the application is submitted
+   * to run in YARN mode, with external shuffle service enabled and
+   * spark.yarn.maxAttempts or the yarn cluster default max attempts is set to 1.
+   * TODO: SPARK-35546 Support push based shuffle with multiple app attempts
    */
   def isPushBasedShuffleEnabled(conf: SparkConf): Boolean = {
     conf.get(PUSH_BASED_SHUFFLE_ENABLED) &&
-      (conf.get(IS_TESTING).getOrElse(false) || conf.get(SHUFFLE_SERVICE_ENABLED))
+      (conf.get(IS_TESTING).getOrElse(false) ||
+        (conf.get(SHUFFLE_SERVICE_ENABLED) &&
+          conf.get(SparkLauncher.SPARK_MASTER, null) == "yarn") &&
+        getYarnMaxAttempts(conf) == 1)
+  }
+
+  /** Returns the maximum number of attempts to register the AM in YARN mode. */
+  def getYarnMaxAttempts(conf: SparkConf): Int = {
+    val sparkMaxAttempts = conf.getOption("spark.yarn.maxAttempts").map(_.toInt)
+    val yarnMaxAttempts = getSparkOrYarnConfig(conf, YarnConfiguration.RM_AM_MAX_ATTEMPTS,
+      YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS.toString).toInt
+    sparkMaxAttempts match {
+      case Some(x) => if (x <= yarnMaxAttempts) x else yarnMaxAttempts
+      case None => yarnMaxAttempts
+    }

Review comment:
Added comment that this method will be removed after SPARK-35546
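The enablement condition in the diff above can be restated with plain values, which makes the precedence easier to see: because `&&` binds tighter than `||` in Scala, the check reads as "testing, or (shuffle service enabled and master is yarn and max attempts is 1)". The sketch below is a Spark-free illustration under that reading; `PushBasedShuffleCheck`, `effectiveMaxAttempts` and the parameter names are hypothetical and stand in for the `SparkConf` lookups in the PR.

```scala
object PushBasedShuffleCheck {
  /** The Spark setting capped by the YARN limit, or the YARN limit if Spark's is unset.
   *  Same result as the `match` in the diff, written with `fold` and `min`. */
  def effectiveMaxAttempts(sparkMaxAttempts: Option[Int], yarnMaxAttempts: Int): Int =
    sparkMaxAttempts.fold(yarnMaxAttempts)(math.min(_, yarnMaxAttempts))

  /** Plain-value restatement of the condition; each parameter stands in for a config lookup. */
  def isPushBasedShuffleEnabled(
      pushEnabled: Boolean,
      isTesting: Boolean,
      shuffleServiceEnabled: Boolean,
      master: Option[String],
      maxAttempts: Int): Boolean =
    pushEnabled &&
      (isTesting || (shuffleServiceEnabled && master.contains("yarn") && maxAttempts == 1))
}
```

Modelling the master as an `Option[String]` avoids the `null` default used in the diff; that is a design choice of this sketch, not of the PR.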
[GitHub] [spark] SparkQA commented on pull request #32841: [SPARK-35673][SQL] Fix user-defined hint and unrecognized hint in subquery.
SparkQA commented on pull request #32841: URL: https://github.com/apache/spark/pull/32841#issuecomment-858343681 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44142/
[GitHub] [spark] jerqi closed pull request #32856: [SPARK-35706][SQL]Consider making the ':' in STRUCT data type definition optional
jerqi closed pull request #32856: URL: https://github.com/apache/spark/pull/32856
[GitHub] [spark] SparkQA commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858350174 **[Test build #139620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139620/testReport)** for PR 32821 at commit [`a07b94f`](https://github.com/apache/spark/commit/a07b94ff01ebade9e79709d4438093eac4711c5e).
[GitHub] [spark] SparkQA commented on pull request #32855: [SPARK-34524][SQL][FOLLOWUP] Remove unused checkAlterTablePartition in CheckAnalysis.scala
SparkQA commented on pull request #32855: URL: https://github.com/apache/spark/pull/32855#issuecomment-858350181 **[Test build #139618 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139618/testReport)** for PR 32855 at commit [`470ff35`](https://github.com/apache/spark/commit/470ff350f42707f3061d9bfb23e0649fe78f6506).
[GitHub] [spark] SparkQA commented on pull request #32854: [SPARK-34320][SQL] Migrate ALTER TABLE DROP COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA commented on pull request #32854: URL: https://github.com/apache/spark/pull/32854#issuecomment-858350117 **[Test build #139619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139619/testReport)** for PR 32854 at commit [`0295845`](https://github.com/apache/spark/commit/0295845c22f29c67c41e2aff50835b7478ca461e).
[GitHub] [spark] SparkQA commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
SparkQA commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-858350225 **[Test build #139621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139621/testReport)** for PR 32816 at commit [`094ee0c`](https://github.com/apache/spark/commit/094ee0cf7d3895c97b85c1c6935a6d93e3175acd).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32815: [SPARK-35675][SQL] EnsureRequirements remove shuffle should respect PartitioningCollection
AmplabJenkins removed a comment on pull request #32815: URL: https://github.com/apache/spark/pull/32815#issuecomment-858296344 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44132/
[GitHub] [spark] SparkQA removed a comment on pull request #32815: [SPARK-35675][SQL] EnsureRequirements remove shuffle should respect PartitioningCollection
SparkQA removed a comment on pull request #32815: URL: https://github.com/apache/spark/pull/32815#issuecomment-858231569 **[Test build #139605 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139605/testReport)** for PR 32815 at commit [`7caddac`](https://github.com/apache/spark/commit/7caddac6702822b12778a958b87c0d65bbb00c96).
[GitHub] [spark] zhengruifeng commented on pull request #32822: [SPARK-35678][ML] add a common softmax function
zhengruifeng commented on pull request #32822: URL: https://github.com/apache/spark/pull/32822#issuecomment-858370600 ping @srowen
[GitHub] [spark] cloud-fan commented on pull request #32841: [SPARK-35673][SQL] Fix user-defined hint and unrecognized hint in subquery.
cloud-fan commented on pull request #32841: URL: https://github.com/apache/spark/pull/32841#issuecomment-858386217 thanks, merging to master/3.1/3.0!
[GitHub] [spark] cloud-fan closed pull request #32841: [SPARK-35673][SQL] Fix user-defined hint and unrecognized hint in subquery.
cloud-fan closed pull request #32841: URL: https://github.com/apache/spark/pull/32841
[GitHub] [spark] SparkQA commented on pull request #32470: [WIP] Simplify ResolveAggregateFunctions
SparkQA commented on pull request #32470: URL: https://github.com/apache/spark/pull/32470#issuecomment-858396906 **[Test build #139612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139612/testReport)** for PR 32470 at commit [`d08d8e4`](https://github.com/apache/spark/commit/d08d8e4cb6acfaff34e3d81cc8b41c65aa34f4f6). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32842: [MINOR][SQL] No need to normolize name for built-in functions
SparkQA commented on pull request #32842: URL: https://github.com/apache/spark/pull/32842#issuecomment-858396774 **[Test build #139607 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139607/testReport)** for PR 32842 at commit [`60dcf66`](https://github.com/apache/spark/commit/60dcf66c8b9b2bb3e1785bc24983b927411c78dc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32857: [SPARK-35707][ML] optimize sparse GEMM by skipping bound checking
SparkQA commented on pull request #32857: URL: https://github.com/apache/spark/pull/32857#issuecomment-858415585 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44149/
[GitHub] [spark] SparkQA commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858427341 **[Test build #139628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139628/testReport)** for PR 32821 at commit [`4afb09c`](https://github.com/apache/spark/commit/4afb09c5d970696352d0eda72dcf26900be725f1).
[GitHub] [spark] yos1p edited a comment on pull request #32795: [SPARK-35588][PYTHON][DOCS] Update quickstart.ipynb to use pyspark.pandas
yos1p edited a comment on pull request #32795: URL: https://github.com/apache/spark/pull/32795#issuecomment-856936411

> You might need to push some changes at https://github.com/apache/spark/blob/master/binder/postBuild to build Spark, install extra dependencies, etc.

It seems mybinder is not able to build Spark. The RAM for mybinder is at least 1 gigabyte. This causes the Spark build to fail because there is not enough heap memory space.
[GitHub] [spark] SparkQA commented on pull request #32859: [SPARK-35708][PYTHON][TEST] Add BaseTest for DataTypeOps
SparkQA commented on pull request #32859: URL: https://github.com/apache/spark/pull/32859#issuecomment-858427215 **[Test build #139626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139626/testReport)** for PR 32859 at commit [`2745bd6`](https://github.com/apache/spark/commit/2745bd6158e672e2ae98d79d1cbfb4878110695d).
[GitHub] [spark] SparkQA commented on pull request #32828: [SPARK-35689][SS] Add log warn when keyWithIndexToValue returns null value
SparkQA commented on pull request #32828: URL: https://github.com/apache/spark/pull/32828#issuecomment-858427294 **[Test build #139627 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139627/testReport)** for PR 32828 at commit [`6d3fcef`](https://github.com/apache/spark/commit/6d3fcefa88ed43f38df8f3893fa73f67b1584005).
[GitHub] [spark] AmplabJenkins commented on pull request #32857: [SPARK-35707][ML] optimize sparse GEMM by skipping bound checking
AmplabJenkins commented on pull request #32857: URL: https://github.com/apache/spark/pull/32857#issuecomment-858432871 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139622/
[GitHub] [spark] cloud-fan commented on a change in pull request #32849: [WIP][SPARK-35704][SQL] Support fields by the day-time interval type
cloud-fan commented on a change in pull request #32849: URL: https://github.com/apache/spark/pull/32849#discussion_r648978557

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ##

@@ -158,7 +158,10 @@ object IntervalUtils {
   private val daySecondLiteralRegex = (s"(?i)^INTERVAL\\s+([+|-])?\\'$daySecondPatternString\\'\\s+DAY\\s+TO\\s+SECOND$$").r

-  def castStringToDTInterval(input: UTF8String): Long = {
+  def castStringToDTInterval(
+      input: UTF8String,
+      // TODO(SPARK-X): Take into account day-time interval fields in cast
+      it: DayTimeIntervalType): Long = {

Review comment: we can just pass 2 bytes as the input
[GitHub] [spark] AmplabJenkins commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
AmplabJenkins commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858466240
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32859: [SPARK-35708][PYTHON][TEST] Add BaseTest for DataTypeOps
AmplabJenkins removed a comment on pull request #32859: URL: https://github.com/apache/spark/pull/32859#issuecomment-858466246 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139626/
[GitHub] [spark] SparkQA commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858464559 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44155/
[GitHub] [spark] SparkQA commented on pull request #32854: [SPARK-34320][SQL] Migrate ALTER TABLE DROP COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA commented on pull request #32854: URL: https://github.com/apache/spark/pull/32854#issuecomment-858486749 **[Test build #139619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139619/testReport)** for PR 32854 at commit [`0295845`](https://github.com/apache/spark/commit/0295845c22f29c67c41e2aff50835b7478ca461e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class AlterTableDropColumns(`
[GitHub] [spark] SparkQA removed a comment on pull request #32854: [SPARK-34320][SQL] Migrate ALTER TABLE DROP COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA removed a comment on pull request #32854: URL: https://github.com/apache/spark/pull/32854#issuecomment-858350117 **[Test build #139619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139619/testReport)** for PR 32854 at commit [`0295845`](https://github.com/apache/spark/commit/0295845c22f29c67c41e2aff50835b7478ca461e).
[GitHub] [spark] AmplabJenkins commented on pull request #32815: [SPARK-35675][SQL] EnsureRequirements remove shuffle should respect PartitioningCollection
AmplabJenkins commented on pull request #32815: URL: https://github.com/apache/spark/pull/32815#issuecomment-858354919 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139605/
[GitHub] [spark] mridulm commented on pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage
mridulm commented on pull request #30691: URL: https://github.com/apache/spark/pull/30691#issuecomment-858362869 jenkins, test this please
[GitHub] [spark] SparkQA commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858376468 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44143/
[GitHub] [spark] SparkQA commented on pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage
SparkQA commented on pull request #30691: URL: https://github.com/apache/spark/pull/30691#issuecomment-858387633 **[Test build #139624 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139624/testReport)** for PR 30691 at commit [`e570818`](https://github.com/apache/spark/commit/e570818c65c0f230ec5d1a9afde3fd213c9a40ad).
[GitHub] [spark] SparkQA commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858402837 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44147/
[GitHub] [spark] jerqi opened a new pull request #32858: [SPARK-35706][SQL]Consider making the ':' in STRUCT data type definit…
jerqi opened a new pull request #32858: URL: https://github.com/apache/spark/pull/32858

### What changes were proposed in this pull request?
The STRUCT type syntax is defined like this: STRUCT(fieldName: fieldType [NOT NULL][COMMENT stringLiteral][,.]) So the field list is nearly the same as a column list. If we could make the ':' optional, it would be much cleaner and less proprietary.

### Why are the changes needed?
Ease of use.

### Does this PR introduce _any_ user-facing change?
Yes, a STRUCT field list can be written in nearly the same way as a column list.

### How was this patch tested?
Unit tests.
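As a rough illustration of the proposal above, here is a minimal, self-contained Scala sketch that accepts a struct field written either as `name: type` or as `name type`. It is a toy regex matcher and not Spark's parser; `StructFieldSketch`, `parseField` and the regular expression are hypothetical names introduced only for this example, and modifiers such as NOT NULL or COMMENT are deliberately ignored.

```scala
object StructFieldSketch {
  // Hypothetical pattern: a field name, then either a colon (with optional
  // surrounding whitespace) or at least one whitespace character, then a type name.
  private val Field = """^\s*(\w+)(?:\s*:\s*|\s+)(\w+)\s*$""".r

  // Returns Some((name, type)) when the text looks like a single struct field,
  // with or without the ':' separator, and None otherwise.
  def parseField(text: String): Option[(String, String)] = text match {
    case Field(name, tpe) => Some((name, tpe))
    case _                => None
  }
}
```

Under these assumptions, `StructFieldSketch.parseField("a: INT")` and `StructFieldSketch.parseField("a INT")` yield the same pair, while a single token with no separator is rejected. Requiring either a colon or whitespace between the two names keeps one identifier from being split into a name and a type.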
[GitHub] [spark] Yikun opened a new pull request #32859: [SPARK-35708][PYTHON][TEST] Add BaseTest for DataTypeOps
Yikun opened a new pull request #32859: URL: https://github.com/apache/spark/pull/32859

### What changes were proposed in this pull request?
This patch adds a DataTypeOps test to check that the ops are loaded as expected.

### Why are the changes needed?
While completing https://github.com/apache/spark/pull/32821, I found that there is no test for DataTypeOps. A lot of logic runs when DataTypeOps is loaded, so it is better to add a test to make sure the interface stays stable.

### Does this PR introduce _any_ user-facing change?
No, test only.

### How was this patch tested?
Tests passed.
[GitHub] [spark] SparkQA commented on pull request #32828: [SPARK-35689][SS] Add log warn when keyWithIndexToValue returns null value
SparkQA commented on pull request #32828: URL: https://github.com/apache/spark/pull/32828#issuecomment-858438256 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44152/
[GitHub] [spark] cloud-fan commented on a change in pull request #32849: [WIP][SPARK-35704][SQL] Support fields by the day-time interval type
cloud-fan commented on a change in pull request #32849: URL: https://github.com/apache/spark/pull/32849#discussion_r648984100

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala ##

@@ -57,16 +62,61 @@ class DayTimeIntervalType private() extends AtomicType {
   private[spark] override def asNullable: DayTimeIntervalType = this

-  override def typeName: String = "interval day to second"
+  override val typeName: String = {
+    val startFieldName = fieldToString(startField)
+    val endFieldName = fieldToString(endField)
+    if (startFieldName == endFieldName) {
+      s"interval $startFieldName"
+    } else if (startField < endField) {
+      s"interval $startFieldName to $endFieldName"
+    } else {
+      throw QueryCompilationErrors.invalidDayTimeIntervalType(startFieldName, endFieldName)
+    }
+  }
 }

 /**
- * The companion case object and its class is separated so the companion object also subclasses
- * the DayTimeIntervalType class. Otherwise, the companion object would be of type
- * "DayTimeIntervalType$" in byte code. Defined with a private constructor so the companion object
- * is the only possible instantiation.
+ * Extra factory methods and pattern matchers for DayTimeIntervalType.
  *
  * @since 3.2.0
  */
 @Unstable
-case object DayTimeIntervalType extends DayTimeIntervalType
+case object DayTimeIntervalType extends AbstractDataType {
+  val DAY: Byte = 0
+  val HOUR: Byte = 1
+  val MINUTE: Byte = 2
+  val SECOND: Byte = 3
+  val dayTimeFields = Seq(DAY, HOUR, MINUTE, SECOND)
+
+  def fieldToString(field: Byte): String = field match {
+    case DAY => "day"
+    case HOUR => "hour"
+    case MINUTE => "minute"
+    case SECOND => "second"
+    case invalid => throw QueryCompilationErrors.invalidDayTimeField(invalid)
+  }
+
+  val DEFAULT = DayTimeIntervalType(DAY, SECOND)
+
+  def apply(): DayTimeIntervalType = DEFAULT
+
+  override private[sql] def defaultConcreteType: DataType = DEFAULT
+
+  override private[sql] def acceptsType(other: DataType): Boolean = {
+    other.isInstanceOf[DayTimeIntervalType]
+  }
+
+  override private[sql] def simpleString: String = defaultConcreteType.simpleString
+
+  def dayTimeIntervalTypes(): Seq[DayTimeIntervalType] = Seq(

Review comment: if it's test only, let's move it to the test code
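The field-to-name mapping and range check in the diff above can be sketched outside of Spark as follows. This is a simplified standalone model that assumes only the `DAY` to `SECOND` byte constants and the `typeName` branching shown in the diff. `DayTimeFieldsSketch` is a hypothetical name, and `IllegalArgumentException` stands in for the `QueryCompilationErrors` helpers, whose behavior is not shown in this thread.

```scala
object DayTimeFieldsSketch {
  // Byte codes for the interval fields, copied from the diff.
  val DAY: Byte = 0
  val HOUR: Byte = 1
  val MINUTE: Byte = 2
  val SECOND: Byte = 3

  // Maps a field code to its lowercase name; unknown codes are rejected.
  def fieldToString(field: Byte): String = field match {
    case DAY     => "day"
    case HOUR    => "hour"
    case MINUTE  => "minute"
    case SECOND  => "second"
    case invalid =>
      // Assumption: a plain exception replaces the Spark error helper.
      throw new IllegalArgumentException(s"Invalid day-time field: $invalid")
  }

  // Builds the type name from a start and an end field, mirroring the three
  // branches of the diff: single field, valid range, invalid (reversed) range.
  def typeName(startField: Byte, endField: Byte): String = {
    val startName = fieldToString(startField)
    val endName = fieldToString(endField)
    if (startField == endField) {
      s"interval $startName"
    } else if (startField < endField) {
      s"interval $startName to $endName"
    } else {
      throw new IllegalArgumentException(s"Invalid interval range: $startName to $endName")
    }
  }
}
```

In this sketch, `typeName(DAY, SECOND)` produces the previous fixed name `interval day to second`, a single field such as `typeName(HOUR, HOUR)` produces `interval hour`, and a reversed range such as `typeName(SECOND, DAY)` throws.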
[GitHub] [spark] SparkQA commented on pull request #32852: [SPARK-35283][SQL] Support query some DDL with CTES
SparkQA commented on pull request #32852: URL: https://github.com/apache/spark/pull/32852#issuecomment-858451052 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44150/
[GitHub] [spark] SparkQA removed a comment on pull request #32470: [WIP] Simplify ResolveAggregateFunctions
SparkQA removed a comment on pull request #32470: URL: https://github.com/apache/spark/pull/32470#issuecomment-858467500 **[Test build #139629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139629/testReport)** for PR 32470 at commit [`ae4d411`](https://github.com/apache/spark/commit/ae4d411bd68fcd9daf4db23a7a33b4fdca478528).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32470: [WIP] Simplify ResolveAggregateFunctions
AmplabJenkins removed a comment on pull request #32470: URL: https://github.com/apache/spark/pull/32470#issuecomment-858471325 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139629/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE
AmplabJenkins removed a comment on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-858347806 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44140/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32841: [SPARK-35673][SQL] Fix user-defined hint and unrecognized hint in subquery.
AmplabJenkins removed a comment on pull request #32841: URL: https://github.com/apache/spark/pull/32841#issuecomment-858347803 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44142/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32852: [SPARK-35283][SQL] Support query some DDL with CTES
AmplabJenkins removed a comment on pull request #32852: URL: https://github.com/apache/spark/pull/32852#issuecomment-858347804
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32853: [SPARK-35683][PYTHON] Fix Index.difference to avoid collect 'other' to driver side
AmplabJenkins removed a comment on pull request #32853: URL: https://github.com/apache/spark/pull/32853#issuecomment-858347805 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139614/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32470: [WIP] Simplify ResolveAggregateFunctions
AmplabJenkins removed a comment on pull request #32470: URL: https://github.com/apache/spark/pull/32470#issuecomment-858347812 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44137/
[GitHub] [spark] AmplabJenkins commented on pull request #32470: [WIP] Simplify ResolveAggregateFunctions
AmplabJenkins commented on pull request #32470: URL: https://github.com/apache/spark/pull/32470#issuecomment-858347812 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44137/
[GitHub] [spark] SparkQA removed a comment on pull request #32470: [WIP] Simplify ResolveAggregateFunctions
SparkQA removed a comment on pull request #32470: URL: https://github.com/apache/spark/pull/32470#issuecomment-858290812 **[Test build #139612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139612/testReport)** for PR 32470 at commit [`d08d8e4`](https://github.com/apache/spark/commit/d08d8e4cb6acfaff34e3d81cc8b41c65aa34f4f6).
[GitHub] [spark] SparkQA removed a comment on pull request #32842: [MINOR][SQL] No need to normolize name for built-in functions
SparkQA removed a comment on pull request #32842: URL: https://github.com/apache/spark/pull/32842#issuecomment-858252089 **[Test build #139607 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139607/testReport)** for PR 32842 at commit [`60dcf66`](https://github.com/apache/spark/commit/60dcf66c8b9b2bb3e1785bc24983b927411c78dc).
[GitHub] [spark] SparkQA removed a comment on pull request #32828: [SPARK-35689][SS] Add log warn when keyWithIndexToValue returns null value
SparkQA removed a comment on pull request #32828: URL: https://github.com/apache/spark/pull/32828#issuecomment-858394064 **[Test build #139625 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139625/testReport)** for PR 32828 at commit [`d15b60b`](https://github.com/apache/spark/commit/d15b60ba241703a3dec349b84d2f9218006e804b).
[GitHub] [spark] cloud-fan commented on a change in pull request #32849: [WIP][SPARK-35704][SQL] Support fields by the day-time interval type
cloud-fan commented on a change in pull request #32849: URL: https://github.com/apache/spark/pull/32849#discussion_r648970057

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala ##

@@ -95,7 +95,7 @@ case class Average(child: Expression) extends DeclarativeAggregate with Implicit
         Literal(null, YearMonthIntervalType), DivideYMInterval(sum, count))
     case _: DayTimeIntervalType => If(EqualTo(count, Literal(0L)),
-      Literal(null, DayTimeIntervalType), DivideDTInterval(sum, count))
+      Literal(null, DayTimeIntervalType.defaultConcreteType), DivideDTInterval(sum, count))

Review comment: shall we just use `child.dataType`?
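One possible reading of the suggestion above is that the null literal returned for an empty group should carry the input column's own interval type rather than a fixed default. The sketch below models that idea with toy stand-ins. `AverageNullTypeSketch`, `IntervalType`, `NullLiteral` and both helper methods are hypothetical and are not Catalyst classes; the field codes 0 to 3 are assumed to mean day, hour, minute and second as in the same pull request.

```scala
object AverageNullTypeSketch {
  // Toy stand-in for a day-time interval type with start and end field codes.
  final case class IntervalType(startField: Byte, endField: Byte)

  // Toy stand-in for a typed null literal.
  final case class NullLiteral(dataType: IntervalType)

  // Assumed default: day (0) to second (3).
  val Default: IntervalType = IntervalType(0, 3)

  // Variant in the diff: the null result always has the default type.
  def nullWithDefault(childType: IntervalType): NullLiteral = NullLiteral(Default)

  // Variant suggested in the review: the null result reuses the child's type.
  def nullWithChildType(childType: IntervalType): NullLiteral = NullLiteral(childType)
}
```

For a child typed as hour to minute, `nullWithChildType` keeps that type on the null result, whereas `nullWithDefault` reports day to second. Whether this mismatch matters in Spark depends on how the surrounding expressions are typed, which this thread does not show.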
[GitHub] [spark] SparkQA removed a comment on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA removed a comment on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858427341 **[Test build #139628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139628/testReport)** for PR 32821 at commit [`4afb09c`](https://github.com/apache/spark/commit/4afb09c5d970696352d0eda72dcf26900be725f1).
[GitHub] [spark] SparkQA commented on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE
SparkQA commented on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-858463477 **[Test build #139613 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139613/testReport)** for PR 32776 at commit [`aee1392`](https://github.com/apache/spark/commit/aee1392720815f332e8fb993b4672bb03fe4ccb1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE
SparkQA removed a comment on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-858294837 **[Test build #139613 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139613/testReport)** for PR 32776 at commit [`aee1392`](https://github.com/apache/spark/commit/aee1392720815f332e8fb993b4672bb03fe4ccb1).
[GitHub] [spark] cloud-fan commented on a change in pull request #32470: [WIP] Simplify ResolveAggregateFunctions
cloud-fan commented on a change in pull request #32470:
URL: https://github.com/apache/spark/pull/32470#discussion_r649008968

## File path: sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q35.sf100/explain.txt ##
@@ -306,11 +306,11 @@ Input [19]: [ca_state#25, cd_gender#29, cd_marital_status#30, cd_dep_count#31, c
 Keys [6]: [ca_state#25, cd_gender#29, cd_marital_status#30, cd_dep_count#31, cd_dep_employed_count#32, cd_dep_college_count#33]
 Functions [10]: [count(1), min(cd_dep_count#31), max(cd_dep_count#31), avg(cd_dep_count#31), min(cd_dep_employed_count#32), max(cd_dep_employed_count#32), avg(cd_dep_employed_count#32), min(cd_dep_college_count#33), max(cd_dep_college_count#33), avg(cd_dep_college_count#33)]
 Aggregate Attributes [10]: [count(1)#62, min(cd_dep_count#31)#63, max(cd_dep_count#31)#64, avg(cd_dep_count#31)#65, min(cd_dep_employed_count#32)#66, max(cd_dep_employed_count#32)#67, avg(cd_dep_employed_count#32)#68, min(cd_dep_college_count#33)#69, max(cd_dep_college_count#33)#70, avg(cd_dep_college_count#33)#71]
-Results [18]: [ca_state#25, cd_gender#29, cd_marital_status#30, count(1)#62 AS cnt1#72, min(cd_dep_count#31)#63 AS min(cd_dep_count)#73, max(cd_dep_count#31)#64 AS max(cd_dep_count)#74, avg(cd_dep_count#31)#65 AS avg(cd_dep_count)#75, cd_dep_employed_count#32, count(1)#62 AS cnt2#76, min(cd_dep_employed_count#32)#66 AS min(cd_dep_employed_count)#77, max(cd_dep_employed_count#32)#67 AS max(cd_dep_employed_count)#78, avg(cd_dep_employed_count#32)#68 AS avg(cd_dep_employed_count)#79, cd_dep_college_count#33, count(1)#62 AS cnt3#80, min(cd_dep_college_count#33)#69 AS min(cd_dep_college_count)#81, max(cd_dep_college_count#33)#70 AS max(cd_dep_college_count)#82, avg(cd_dep_college_count#33)#71 AS avg(cd_dep_college_count)#83, cd_dep_count#31 AS aggOrder#84]

Review comment: the only change is the removed alias
[GitHub] [spark] tanelk commented on pull request #32862: [SPARK-35695][SQL] Collect observed metrics from cached sub-tree
tanelk commented on pull request #32862: URL: https://github.com/apache/spark/pull/32862#issuecomment-858494892 Pinging @cloud-fan, @sarutak
[GitHub] [spark] SparkQA commented on pull request #32853: [SPARK-35683][PYTHON] Fix Index.difference to avoid collect 'other' to driver side
SparkQA commented on pull request #32853: URL: https://github.com/apache/spark/pull/32853#issuecomment-858335817 **[Test build #139614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139614/testReport)** for PR 32853 at commit [`61a5c92`](https://github.com/apache/spark/commit/61a5c92ab303dd6bb6aa65e24abb644255e4fa15).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE
SparkQA commented on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-858335819 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44140/
[GitHub] [spark] SparkQA removed a comment on pull request #32853: [SPARK-35683][PYTHON] Fix Index.difference to avoid collect 'other' to driver side
SparkQA removed a comment on pull request #32853: URL: https://github.com/apache/spark/pull/32853#issuecomment-858316127 **[Test build #139614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139614/testReport)** for PR 32853 at commit [`61a5c92`](https://github.com/apache/spark/commit/61a5c92ab303dd6bb6aa65e24abb644255e4fa15).
[GitHub] [spark] SparkQA commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858351005 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44143/
[GitHub] [spark] SparkQA commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858350628 **[Test build #139620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139620/testReport)** for PR 32821 at commit [`a07b94f`](https://github.com/apache/spark/commit/a07b94ff01ebade9e79709d4438093eac4711c5e).
* This patch **fails RAT tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
AmplabJenkins removed a comment on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858350651 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139620/
[GitHub] [spark] AmplabJenkins commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
AmplabJenkins commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858350651 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139620/
[GitHub] [spark] SparkQA removed a comment on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA removed a comment on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858350174 **[Test build #139620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139620/testReport)** for PR 32821 at commit [`a07b94f`](https://github.com/apache/spark/commit/a07b94ff01ebade9e79709d4438093eac4711c5e).
[GitHub] [spark] dgd-contributor commented on pull request #32839: [SPARK-35679][SQL] instantToMicros overflow
dgd-contributor commented on pull request #32839: URL: https://github.com/apache/spark/pull/32839#issuecomment-858366444 Thanks @MaxGekk @cloud-fan @gengliangwang for supporting
[GitHub] [spark] zhengruifeng commented on pull request #32857: [SPARK-35707][ML] optimize sparse GEMM by skipping bound checking
zhengruifeng commented on pull request #32857: URL: https://github.com/apache/spark/pull/32857#issuecomment-858366739 ![image](https://user-images.githubusercontent.com/7322292/121479250-6a762000-c9fc-11eb-9ca2-1331ec8fffb8.png)
[GitHub] [spark] SparkQA commented on pull request #32855: [SPARK-34524][SQL][FOLLOWUP] Remove unused checkAlterTablePartition in CheckAnalysis.scala
SparkQA commented on pull request #32855: URL: https://github.com/apache/spark/pull/32855#issuecomment-858388065 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44145/
[GitHub] [spark] SparkQA commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle
SparkQA commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-858409096 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44148/
[GitHub] [spark] wangyum commented on pull request #32852: [SPARK-35283][SQL] Support query some DDL with CTES
wangyum commented on pull request #32852:
URL: https://github.com/apache/spark/pull/32852#issuecomment-858417308

This is MySQL syntax:
```
mysql> SHOW CHARACTER SET WHERE `Default collation` LIKE '%japanese%';
+---------+---------------------------+---------------------+--------+
| Charset | Description               | Default collation   | Maxlen |
+---------+---------------------------+---------------------+--------+
| ujis    | EUC-JP Japanese           | ujis_japanese_ci    |      3 |
| sjis    | Shift-JIS Japanese        | sjis_japanese_ci    |      2 |
| cp932   | SJIS for Windows Japanese | cp932_japanese_ci   |      2 |
| eucjpms | UJIS for Windows Japanese | eucjpms_japanese_ci |      3 |
+---------+---------------------------+---------------------+--------+
```
https://dev.mysql.com/doc/refman/8.0/en/extended-show.html
[GitHub] [spark] SparkQA commented on pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage
SparkQA commented on pull request #30691: URL: https://github.com/apache/spark/pull/30691#issuecomment-858426166 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44151/
[GitHub] [spark] AmplabJenkins commented on pull request #32858: [SPARK-35706][SQL]Consider making the ':' in STRUCT data type definit…
AmplabJenkins commented on pull request #32858: URL: https://github.com/apache/spark/pull/32858#issuecomment-858425281 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on pull request #32852: [SPARK-35283][SQL] Support query some DDL with CTES
SparkQA commented on pull request #32852: URL: https://github.com/apache/spark/pull/32852#issuecomment-858425912 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44150/
[GitHub] [spark] SparkQA commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
SparkQA commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858454242 **[Test build #139628 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139628/testReport)** for PR 32821 at commit [`4afb09c`](https://github.com/apache/spark/commit/4afb09c5d970696352d0eda72dcf26900be725f1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32859: [SPARK-35708][PYTHON][TEST] Add BaseTest for DataTypeOps
SparkQA commented on pull request #32859: URL: https://github.com/apache/spark/pull/32859#issuecomment-858490830 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44153/
[GitHub] [spark] jerqi opened a new pull request #32856: [SPARK-35706][SQL]Consider making the ':' in STRUCT data type definition optional
jerqi opened a new pull request #32856:
URL: https://github.com/apache/spark/pull/32856

### What changes were proposed in this pull request?
The STRUCT type syntax is defined like this: STRUCT(fieldName: fieldType [NOT NULL][COMMENT stringLiteral][,.]) So the field list is NEARLY the same as a column list. If we could make ':' optional, it would be so much cleaner and less proprietary.

### Why are the changes needed?
Ease of use.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Unit tests.
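To make the proposal concrete, here is a minimal Python sketch of a field-list reader that accepts both `name: type` and `name type`. It is not Spark's parser and makes no claim about how the PR implements the change. The function name `parse_struct_fields` and the simplified grammar (flat fields only, no nested types, no NOT NULL or COMMENT clauses) are assumptions made for illustration.

```python
import re

# Hypothetical, simplified grammar (an assumption, not Spark's grammar):
# each field is "name: type" or "name type"; the separator is either a
# colon with optional surrounding whitespace, or whitespace alone.
_FIELD = re.compile(r"^\s*(\w+)(?:\s*:\s*|\s+)(\w+)\s*$")


def parse_struct_fields(field_list):
    """Return [(name, TYPE), ...] for a flat, comma-separated field list."""
    fields = []
    for part in field_list.split(","):
        match = _FIELD.match(part)
        if match is None:
            raise ValueError(f"cannot parse field: {part!r}")
        # Type names are upper-cased so both spellings compare equal.
        fields.append((match.group(1), match.group(2).upper()))
    return fields
```

Under these assumptions, `parse_struct_fields("a: INT, b STRING")` and `parse_struct_fields("a INT, b: STRING")` produce the same field list, which is the ease-of-use point of the proposal: the field list reads like an ordinary column list.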
[GitHub] [spark] AmplabJenkins commented on pull request #32856: [SPARK-35706][SQL]Consider making the ':' in STRUCT data type definition optional
AmplabJenkins commented on pull request #32856: URL: https://github.com/apache/spark/pull/32856#issuecomment-858349898 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on pull request #32854: [SPARK-34320][SQL] Migrate ALTER TABLE DROP COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA commented on pull request #32854: URL: https://github.com/apache/spark/pull/32854#issuecomment-858378455 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44146/
[GitHub] [spark] AmplabJenkins commented on pull request #32853: [SPARK-35683][PYTHON] Fix Index.difference to avoid collect 'other' to driver side
AmplabJenkins commented on pull request #32853: URL: https://github.com/apache/spark/pull/32853#issuecomment-858384277 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44141/
[GitHub] [spark] AmplabJenkins commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
AmplabJenkins commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858384275 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44143/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
AmplabJenkins removed a comment on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858384275 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44143/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32853: [SPARK-35683][PYTHON] Fix Index.difference to avoid collect 'other' to driver side
AmplabJenkins removed a comment on pull request #32853: URL: https://github.com/apache/spark/pull/32853#issuecomment-858384277 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44141/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage
AmplabJenkins removed a comment on pull request #30691: URL: https://github.com/apache/spark/pull/30691#issuecomment-858391673 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139624/
[GitHub] [spark] SparkQA removed a comment on pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage
SparkQA removed a comment on pull request #30691: URL: https://github.com/apache/spark/pull/30691#issuecomment-858387633 **[Test build #139624 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139624/testReport)** for PR 30691 at commit [`e570818`](https://github.com/apache/spark/commit/e570818c65c0f230ec5d1a9afde3fd213c9a40ad).
[GitHub] [spark] viirya commented on pull request #32828: [SPARK-35689][SS] Add log warn when keyWithIndexToValue returns null value
viirya commented on pull request #32828: URL: https://github.com/apache/spark/pull/32828#issuecomment-858400846 Added a test. Please take a look. Thanks. @HeartSaVioR @dongjoon-hyun
[GitHub] [spark] SparkQA commented on pull request #32855: [SPARK-34524][SQL][FOLLOWUP] Remove unused checkAlterTablePartition in CheckAnalysis.scala
SparkQA commented on pull request #32855: URL: https://github.com/apache/spark/pull/32855#issuecomment-858410582 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44145/
[GitHub] [spark] AmplabJenkins commented on pull request #32855: [SPARK-34524][SQL][FOLLOWUP] Remove unused checkAlterTablePartition in CheckAnalysis.scala
AmplabJenkins commented on pull request #32855: URL: https://github.com/apache/spark/pull/32855#issuecomment-858424084 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44145/
[GitHub] [spark] AmplabJenkins commented on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
AmplabJenkins commented on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858424088 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44147/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32821: [SPARK-35342][PYTHON] Introduce DecimalOps
AmplabJenkins removed a comment on pull request #32821: URL: https://github.com/apache/spark/pull/32821#issuecomment-858424088 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44147/