date:20220403

[GitHub] [spark] itholic opened a new pull request, #36059: [SPARK-38780][FOLLOWUP][PYTHON][DOCS] PySpark docs build should fail when there is warning.

2022-04-03 Thread GitBox

itholic opened a new pull request, #36059: URL: https://github.com/apache/spark/pull/36059 ### What changes were proposed in this pull request? This PR proposes to remove `ForeachBatchFunction` and `StreamingQueryException` from `python/docs/source/reference/pyspark.ss.rst` since the

[GitHub] [spark] dongjoon-hyun commented on pull request #35979: [SPARK-38664][CORE] Support compact EventLog when there are illegal characters in the path

2022-04-03 Thread GitBox

dongjoon-hyun commented on PR #35979: URL: https://github.com/apache/spark/pull/35979#issuecomment-1087173496 Yes, I believe `illegal char` is the root cause of this problem, not Apache Spark, @lw33 . Apache Hadoop provides an abstraction layer for many file systems including local file

[GitHub] [spark] HyukjinKwon closed pull request #36058: [SPARK-38780][PYTHON][DOCS] PySpark docs build should fail when there is warning.

2022-04-03 Thread GitBox

HyukjinKwon closed pull request #36058: [SPARK-38780][PYTHON][DOCS] PySpark docs build should fail when there is warning. URL: https://github.com/apache/spark/pull/36058 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] HyukjinKwon commented on pull request #36058: [SPARK-38780][PYTHON][DOCS] PySpark docs build should fail when there is warning.

2022-04-03 Thread GitBox

HyukjinKwon commented on PR #36058: URL: https://github.com/apache/spark/pull/36058#issuecomment-1087161067 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] HyukjinKwon commented on pull request #36057: [MINOR][DOCS] Remove PySpark doc build warnings

2022-04-03 Thread GitBox

HyukjinKwon commented on PR #36057: URL: https://github.com/apache/spark/pull/36057#issuecomment-1087159865 Merged to master, branch-3.3, and branch-3.2.

[GitHub] [spark] HyukjinKwon closed pull request #36057: [MINOR][DOCS] Remove PySpark doc build warnings

2022-04-03 Thread GitBox

HyukjinKwon closed pull request #36057: [MINOR][DOCS] Remove PySpark doc build warnings URL: https://github.com/apache/spark/pull/36057

[GitHub] [spark] itholic commented on pull request #36058: [SPARK-38780][PYTHON][DOCS] PySpark docs build should fail when there is warning.

2022-04-03 Thread GitBox

itholic commented on PR #36058: URL: https://github.com/apache/spark/pull/36058#issuecomment-1087114454 Let me re-trigger the build with rebasing master after https://github.com/apache/spark/pull/36057 is merged.

[GitHub] [spark] itholic commented on pull request #36057: [MINOR][DOCS] Remove PySpark doc build warnings

2022-04-03 Thread GitBox

itholic commented on PR #36057: URL: https://github.com/apache/spark/pull/36057#issuecomment-1087113553 Just opened a PR at https://github.com/apache/spark/pull/36058 to make the build fail on warnings. So, let me cherry-pick this fix into that PR after merging.

[GitHub] [spark] itholic opened a new pull request, #36058: [SPARK-38780][PYTHON][DOCS] PySpark docs build should fail when there is warning.

2022-04-03 Thread GitBox

itholic opened a new pull request, #36058: URL: https://github.com/apache/spark/pull/36058 ### What changes were proposed in this pull request? This PR proposes to add option "-W" when running PySpark documentation build via Sphinx. ### Why are the changes needed? To

[GitHub] [spark] itholic commented on pull request #34324: [SPARK-37015][PYTHON] Inline type hints for python/pyspark/streaming/dstream.py

2022-04-03 Thread GitBox

itholic commented on PR #34324: URL: https://github.com/apache/spark/pull/34324#issuecomment-1087105177 Also mind taking a last look for this, @zero323 ?? 🙏

[GitHub] [spark] itholic commented on pull request #34293: [SPARK-37014][PYTHON] Inline type hints for python/pyspark/streaming/context.py

2022-04-03 Thread GitBox

itholic commented on PR #34293: URL: https://github.com/apache/spark/pull/34293#issuecomment-1087104988 Seems fine to me. Would you mind taking a last look for this, @zero323 ??

[GitHub] [spark] itholic commented on pull request #36057: [MINOR][DOCS] Remove PySpark doc build warnings

2022-04-03 Thread GitBox

itholic commented on PR #36057: URL: https://github.com/apache/spark/pull/36057#issuecomment-1087089407 @HyukjinKwon sure, let me take a look

[GitHub] [spark] HyukjinKwon commented on pull request #36057: [MINOR][DOCS] Remove PySpark doc build warnings

2022-04-03 Thread GitBox

HyukjinKwon commented on PR #36057: URL: https://github.com/apache/spark/pull/36057#issuecomment-1087087712 cc @xinrong-databricks and @zero323 FYI. @itholic BTW, I remember we talked about warnings in Sphinx build before. I think it should fail for these warnings but not sure why it

[GitHub] [spark] HyukjinKwon opened a new pull request, #36057: [MINOR][DOCS] Remove PySpark doc build warnings

2022-04-03 Thread GitBox

HyukjinKwon opened a new pull request, #36057: URL: https://github.com/apache/spark/pull/36057 ### What changes were proposed in this pull request? This PR fixes various documentation build warnings in PySpark documentation ### Why are the changes needed? To render the

[GitHub] [spark] AngersZhuuuu opened a new pull request, #36056: [WIP][SPARK-36571][SQL] Add an SQLOverwriteHadoopMapReduceCommitProtocol to support all SQL overwrite write data to staging dir

2022-04-03 Thread GitBox

AngersZhuuuu opened a new pull request, #36056: URL: https://github.com/apache/spark/pull/36056 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] huaxingao commented on pull request #36050: [SPARK-38779] [SQL][Tests] Unify the pushed operator checking between FileSource test suite and JDBCV2Suite

2022-04-03 Thread GitBox

huaxingao commented on PR #36050: URL: https://github.com/apache/spark/pull/36050#issuecomment-1087061975 Thanks all!

[GitHub] [spark] wangyum commented on pull request #36047: [SPARK-32268][SQL][FOLLOWUP] Add ColumnPruning in injectBloomFilter

2022-04-03 Thread GitBox

wangyum commented on PR #36047: URL: https://github.com/apache/spark/pull/36047#issuecomment-1087059583 Merged to master and branch-3.3.

[GitHub] [spark] wangyum closed pull request #36047: [SPARK-32268][SQL][FOLLOWUP] Add ColumnPruning in injectBloomFilter

2022-04-03 Thread GitBox

wangyum closed pull request #36047: [SPARK-32268][SQL][FOLLOWUP] Add ColumnPruning in injectBloomFilter URL: https://github.com/apache/spark/pull/36047

[GitHub] [spark] lw33 commented on pull request #35979: [SPARK-38664][CORE] Support compact EventLog when there are illegal characters in the path

2022-04-03 Thread GitBox

lw33 commented on PR #35979: URL: https://github.com/apache/spark/pull/35979#issuecomment-1087033057 Yes, maybe we don't need to make this change. I found this problem when compacting the event log: the event log could be written to the path, but compaction failed, so I thought this might be a bug. O

[GitHub] [spark] dongjoon-hyun closed pull request #36050: [SPARK-38779] [SQL][Tests] Unify the pushed operator checking between FileSource test suite and JDBCV2Suite

2022-04-03 Thread GitBox

dongjoon-hyun closed pull request #36050: [SPARK-38779] [SQL][Tests] Unify the pushed operator checking between FileSource test suite and JDBCV2Suite URL: https://github.com/apache/spark/pull/36050

[GitHub] [spark] viirya closed pull request #36055: [SPARK-34863][SQL][FOLLOWUP] Disable `spark.sql.parquet.enableNestedColumnVectorizedReader` by default

2022-04-03 Thread GitBox

viirya closed pull request #36055: [SPARK-34863][SQL][FOLLOWUP] Disable `spark.sql.parquet.enableNestedColumnVectorizedReader` by default URL: https://github.com/apache/spark/pull/36055

[GitHub] [spark] viirya commented on pull request #36055: [SPARK-34863][SQL][FOLLOWUP] Disable `spark.sql.parquet.enableNestedColumnVectorizedReader` by default

2022-04-03 Thread GitBox

viirya commented on PR #36055: URL: https://github.com/apache/spark/pull/36055#issuecomment-1087021571 Thanks. Merging to master/3.3.

[GitHub] [spark] zhengruifeng commented on pull request #36048: [SPARK-38774][PYTHON] Implement Series.autocorr

2022-04-03 Thread GitBox

zhengruifeng commented on PR #36048: URL: https://github.com/apache/spark/pull/36048#issuecomment-1086995190 @xinrong-databricks Will add the tests and update the PR description, thanks!

[GitHub] [spark] dongjoon-hyun commented on pull request #36049: [SPARK-38775][ML] cleanup validation functions

2022-04-03 Thread GitBox

dongjoon-hyun commented on PR #36049: URL: https://github.com/apache/spark/pull/36049#issuecomment-1086992390 Thank you so much!

[GitHub] [spark] zhengruifeng commented on pull request #36049: [SPARK-38775][ML] cleanup validation functions

2022-04-03 Thread GitBox

zhengruifeng commented on PR #36049: URL: https://github.com/apache/spark/pull/36049#issuecomment-1086990897 @dongjoon-hyun Ok, I will hold on this PR since its target version is 3.4

[GitHub] [spark] github-actions[bot] commented on pull request #33257: [SPARK-36039][K8S] Fix executor pod hadoop conf mount

2022-04-03 Thread GitBox

github-actions[bot] commented on PR #33257: URL: https://github.com/apache/spark/pull/33257#issuecomment-1086983267 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #34629: [SPARK-37355][CORE]Avoid Block Manager registrations when Executor is shutting down

2022-04-03 Thread GitBox

github-actions[bot] closed pull request #34629: [SPARK-37355][CORE]Avoid Block Manager registrations when Executor is shutting down URL: https://github.com/apache/spark/pull/34629

[GitHub] [spark] github-actions[bot] closed pull request #34953: [SPARK-37682][SQL]Apply 'merged column' and 'bit vector' in RewriteDistinctAggregates

2022-04-03 Thread GitBox

github-actions[bot] closed pull request #34953: [SPARK-37682][SQL]Apply 'merged column' and 'bit vector' in RewriteDistinctAggregates URL: https://github.com/apache/spark/pull/34953

[GitHub] [spark] github-actions[bot] commented on pull request #34990: [SPARK-37717][SQL] Improve logging in BroadcastExchangeExec

2022-04-03 Thread GitBox

github-actions[bot] commented on PR #34990: URL: https://github.com/apache/spark/pull/34990#issuecomment-1086983247 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] HyukjinKwon commented on pull request #36038: [WIP][SPARK-38759][PYTHON][SS] Add StreamingQueryListener support in PySpark

2022-04-03 Thread GitBox

HyukjinKwon commented on PR #36038: URL: https://github.com/apache/spark/pull/36038#issuecomment-1086981229 Yeah, actually that's what I was going to point out. Should be better to create a separate PR to improve the documentation for both sides :-).

[GitHub] [spark] sunchao commented on pull request #34659: [SPARK-34863][SQL] Support complex types for Parquet vectorized reader

2022-04-03 Thread GitBox

sunchao commented on PR #34659: URL: https://github.com/apache/spark/pull/34659#issuecomment-1086971364 Thanks all for the review!!! @viirya I just opened #36055 for the follow-up.

[GitHub] [spark] sunchao opened a new pull request, #36055: [SPARK-34863][SQL][FOLLOWUP] Disable `spark.sql.parquet.enableNestedColumnVectorizedReader` by default

2022-04-03 Thread GitBox

sunchao opened a new pull request, #36055: URL: https://github.com/apache/spark/pull/36055 ### What changes were proposed in this pull request? This PR disables `spark.sql.parquet.enableNestedColumnVectorizedReader` by default. ### Why are the changes needed?

[GitHub] [spark] HeartSaVioR commented on pull request #36038: [WIP][SPARK-38759][PYTHON][SS] Add StreamingQueryListener support in PySpark

2022-04-03 Thread GitBox

HeartSaVioR commented on PR #36038: URL: https://github.com/apache/spark/pull/36038#issuecomment-1086968356 I see review comments about the doc which seem to be just copied from Scala/Java API doc. Since this PR focuses mainly to deal with feature parity, how about simply allowing co

[GitHub] [spark] sunchao commented on pull request #35657: [SPARK-37377][SQL] Initial implementation of Storage-Partitioned Join

2022-04-03 Thread GitBox

sunchao commented on PR #35657: URL: https://github.com/apache/spark/pull/35657#issuecomment-1086963569 Thanks @dongjoon-hyun , updated.

[GitHub] [spark] huaxingao commented on pull request #36039: [SPARK-38761][SQL] DS V2 supports push down misc non-aggregate functions

2022-04-03 Thread GitBox

huaxingao commented on PR #36039: URL: https://github.com/apache/spark/pull/36039#issuecomment-1086961617 I have a general question: what are the criteria of the functions that can be pushed down to data source?

[GitHub] [spark] huaxingao commented on pull request #36050: [SPARK-38779] [SQL][Tests] Unify the pushed operator checking between FileSource test suite and JDBCV2Suite

2022-04-03 Thread GitBox

huaxingao commented on PR #36050: URL: https://github.com/apache/spark/pull/36050#issuecomment-1086961002 @dongjoon-hyun I created SPARK-38779 for this. Thanks!

[GitHub] [spark] sigmod commented on pull request #36047: [SPARK-32268][SQL][FOLLOWUP] Add ColumnPruning in injectBloomFilter

2022-04-03 Thread GitBox

sigmod commented on PR #36047: URL: https://github.com/apache/spark/pull/36047#issuecomment-1086954891 LGTM. Can we merge it to branch-3.3 as well?

[GitHub] [spark] dongjoon-hyun closed pull request #36054: [SPARK-38776][MLLIB][TESTS][FOLLOWUP] Disable ANSI_ENABLED more for `Out of Range` failures

2022-04-03 Thread GitBox

dongjoon-hyun closed pull request #36054: [SPARK-38776][MLLIB][TESTS][FOLLOWUP] Disable ANSI_ENABLED more for `Out of Range` failures URL: https://github.com/apache/spark/pull/36054

[GitHub] [spark] dongjoon-hyun commented on pull request #36054: [SPARK-38776][MLLIB][TESTS][FOLLOWUP] Disable ANSI_ENABLED more for `Out of Range` failures

2022-04-03 Thread GitBox

dongjoon-hyun commented on PR #36054: URL: https://github.com/apache/spark/pull/36054#issuecomment-1086943631 Thank you, @srowen . This is a single test suite only change, and I verified in two ways. Merged to master/3.3. ``` SPARK_ANSI_SQL_MODE=true build/sbt "mllib/testOnly *.ALS

[GitHub] [spark] dongjoon-hyun commented on pull request #36054: [SPARK-38776][MLLIB][TESTS][FOLLOWUP] Disable ANSI_ENABLED more for `Out of Range` failures

2022-04-03 Thread GitBox

dongjoon-hyun commented on PR #36054: URL: https://github.com/apache/spark/pull/36054#issuecomment-1086940137 cc @gengliangwang , @srowen , @yaooqinn

[GitHub] [spark] dongjoon-hyun commented on pull request #36051: [SPARK-38776][MLLIB][TESTS] Disable ANSI_ENABLED explicitly in `ALSSuite`

2022-04-03 Thread GitBox

dongjoon-hyun commented on PR #36051: URL: https://github.com/apache/spark/pull/36051#issuecomment-1086940092 Here is the follow-up. - https://github.com/apache/spark/pull/36054

[GitHub] [spark] dongjoon-hyun opened a new pull request, #36054: [SPARK-38776][MLLIB][TESTS][FOLLOWUP] Disable ANSI_ENABLED more for `Out of Range` failures

2022-04-03 Thread GitBox

dongjoon-hyun opened a new pull request, #36054: URL: https://github.com/apache/spark/pull/36054 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] dongjoon-hyun commented on pull request #36051: [SPARK-38776][MLLIB][TESTS] Disable ANSI_ENABLED explicitly in `ALSSuite`

2022-04-03 Thread GitBox

dongjoon-hyun commented on PR #36051: URL: https://github.com/apache/spark/pull/36051#issuecomment-1086939224 Oops. I realized that more `OutOfRange` failures were hidden in the same test case behind the previous `Overflow` failure. I'll make a follow-up soon.

[GitHub] [spark] xinrong-databricks commented on pull request #36048: [SPARK-38774][PYTHON] Implement Series.autocorr

2022-04-03 Thread GitBox

xinrong-databricks commented on pull request #36048: URL: https://github.com/apache/spark/pull/36048#issuecomment-1086927554 Thanks @zhengruifeng! https://github.com/apache/spark/blob/master/python/pyspark/pandas/tests/test_series.py is a good place to add tests. It would be

[GitHub] [spark] xinrong-databricks commented on a change in pull request #36006: [SPARK-38686][PYTHON] Implement `keep` parameter of `(Index/MultiIndex).drop_duplicates`

2022-04-03 Thread GitBox

xinrong-databricks commented on a change in pull request #36006: URL: https://github.com/apache/spark/pull/36006#discussion_r841262157 ## File path: python/pyspark/pandas/indexes/multi.py ## @@ -893,6 +893,70 @@ def drop(self, codes: List[Any], level: Optional[Union[int, Name]

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35991: [SPARK-38675][CORE] Fix race during unlock in BlockInfoManager

2022-04-03 Thread GitBox

dongjoon-hyun commented on a change in pull request #35991: URL: https://github.com/apache/spark/pull/35991#discussion_r841258213 ## File path: core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala ## @@ -360,12 +360,17 @@ private[storage] class BlockInfoManager e

[GitHub] [spark] dongjoon-hyun commented on pull request #35979: [SPARK-38664][CORE] Support compact EventLog when there are illegal characters in the path

2022-04-03 Thread GitBox

dongjoon-hyun commented on pull request #35979: URL: https://github.com/apache/spark/pull/35979#issuecomment-1086919965 Back to the original proposal, why do we need to support `illegal char`, @lw33 ? It's illegal, isn't it?

[GitHub] [spark] dongjoon-hyun commented on pull request #36033: [SPARK-38754][SQL][TEST][3.1] Using EquivalentExpressions getEquivalentExprs function instead of getExprState at SubexpressionEliminati

2022-04-03 Thread GitBox

dongjoon-hyun commented on pull request #36033: URL: https://github.com/apache/spark/pull/36033#issuecomment-1086918872 Originally, `branch-3.1` was broken, but `branch-3.2` wasn't. Given that, the forward-port from 3.1 to 3.2 looks wrong to me. I'm going to revert this from `branch-3.

[GitHub] [spark] dongjoon-hyun commented on pull request #36033: [SPARK-38754][SQL][TEST][3.1] Using EquivalentExpressions getEquivalentExprs function instead of getExprState at SubexpressionEliminati

2022-04-03 Thread GitBox

dongjoon-hyun commented on pull request #36033: URL: https://github.com/apache/spark/pull/36033#issuecomment-1086917353 Hi, @cloud-fan . This seems to break branch-3.2 compilation. ``` [error] /home/runner/work/spark/spark/sql/catalyst/src/test/scala/org/apache/spark/sql/cataly

[GitHub] [spark] dongjoon-hyun commented on pull request #35886: [SPARK-38582][K8S] Introduce buildEnvVars and buildEnvVarsWithFieldRef for KubernetesUtils to eliminate duplicate code pattern

2022-04-03 Thread GitBox

dongjoon-hyun commented on pull request #35886: URL: https://github.com/apache/spark/pull/35886#issuecomment-1086916633 FYI, we had better hold on these kind of PRs during the planned release process. It's the same for the other refactoring PRs. - https://github.com/apache/spark/pull/360

[GitHub] [spark] yaooqinn commented on pull request #35765: [SPARK-38446][Core] Fix deadlock between ExecutorClassLoader and FileDownloadCallback caused by Log4j

2022-04-03 Thread GitBox

yaooqinn commented on pull request #35765: URL: https://github.com/apache/spark/pull/35765#issuecomment-1086914484 thanks @dongjoon-hyun and all

[GitHub] [spark] awdavidson commented on a change in pull request #36048: [SPARK-38774][PYTHON] Implement Series.autocorr

2022-04-03 Thread GitBox

awdavidson commented on a change in pull request #36048: URL: https://github.com/apache/spark/pull/36048#discussion_r841254014 ## File path: python/pyspark/pandas/series.py ## @@ -2937,6 +2937,73 @@ def add_suffix(self, suffix: str) -> "Series": DataFrame(internal.

[GitHub] [spark] dongjoon-hyun closed pull request #35765: [SPARK-38446][Core] Fix deadlock between ExecutorClassLoader and FileDownloadCallback caused by Log4j

2022-04-03 Thread GitBox

dongjoon-hyun closed pull request #35765: URL: https://github.com/apache/spark/pull/35765

[GitHub] [spark] dongjoon-hyun closed pull request #36051: [SPARK-38776][MLLIB][TESTS] Disable ANSI_ENABLED explicitly in `ALSSuite`

2022-04-03 Thread GitBox

dongjoon-hyun closed pull request #36051: URL: https://github.com/apache/spark/pull/36051

[GitHub] [spark] dongjoon-hyun commented on pull request #36051: [SPARK-38776][MLLIB][TESTS] Disable ANSI_ENABLED explicitly in `ALSSuite`

2022-04-03 Thread GitBox

dongjoon-hyun commented on pull request #36051: URL: https://github.com/apache/spark/pull/36051#issuecomment-1086909995 Thank you, @gengliangwang , @srowen , @yaooqinn . Merged to master/3.3.

[GitHub] [spark] yaooqinn opened a new pull request #36053: [SPARK-38778][INFRA][BUILD] Replace http with https for project url in pom

2022-04-03 Thread GitBox

yaooqinn opened a new pull request #36053: URL: https://github.com/apache/spark/pull/36053 ### What changes were proposed in this pull request? change http://spark.apache.org/ to https://spark.apache.org/ in the project URL of all pom files ### Why are the chan

[GitHub] [spark] AmplabJenkins commented on pull request #36047: [SPARK-32268][SQL][FOLLOWUP] Add ColumnPruning in injectBloomFilter

2022-04-03 Thread GitBox

AmplabJenkins commented on pull request #36047: URL: https://github.com/apache/spark/pull/36047#issuecomment-1086901189 Can one of the admins verify this patch?

[GitHub] [spark] yaooqinn opened a new pull request #36052: [SPARK-38777][YARN] Add `bin/spark-submit --kill / --status` support for yarn

2022-04-03 Thread GitBox

yaooqinn opened a new pull request #36052: URL: https://github.com/apache/spark/pull/36052 ### What changes were proposed in this pull request? In this PR, we extend the `bin/spark-submit` to make it support ` --kill / --status` cli options for yarn cluster manager, whic

[GitHub] [spark] srowen commented on a change in pull request #36049: [SPARK-38775][ML] cleanup validation functions

2022-04-03 Thread GitBox

srowen commented on a change in pull request #36049: URL: https://github.com/apache/spark/pull/36049#discussion_r841241230 ## File path: mllib/src/main/scala/org/apache/spark/ml/util/DatasetUtils.scala ## @@ -138,4 +140,61 @@ private[spark] object DatasetUtils { case Row

[GitHub] [spark] gengliangwang commented on pull request #36051: [SPARK-38776][MLLIB][TESTS] Disable ANSI_ENABLED explicitly in `ALSSuite`

2022-04-03 Thread GitBox

gengliangwang commented on pull request #36051: URL: https://github.com/apache/spark/pull/36051#issuecomment-1086883386 @dongjoon-hyun thanks for fixing it!

[GitHub] [spark] dcoliversun commented on pull request #36044: [SPARK-38770][K8S] Remove `renameMainAppResource` from `baseDriverContainer`

2022-04-03 Thread GitBox

dcoliversun commented on pull request #36044: URL: https://github.com/apache/spark/pull/36044#issuecomment-1086847636 😁 Thanks for your help @dongjoon-hyun @martin-g

[GitHub] [spark] lw33 commented on pull request #35979: [SPARK-38664][CORE] Support compact EventLog when there are illegal characters in the path

2022-04-03 Thread GitBox

lw33 commented on pull request #35979: URL: https://github.com/apache/spark/pull/35979#issuecomment-1086831394 > Sorry guys. Supporting `illegal char` by removing `toURI` doesn't look like a safe improvement to me. > > Given the trade-off between benefit and risk, we had better recom

[GitHub] [spark] sarutak commented on pull request #35443: [MINOR][CORE] Change the log level to WARN for the message which is shown in case users attemp to add a JAR twice

2022-04-03 Thread GitBox

sarutak commented on pull request #35443: URL: https://github.com/apache/spark/pull/35443#issuecomment-1086831154 @dongjoon-hyun Thank you!

[GitHub] [spark] AngersZhuuuu commented on pull request #35799: [SPARK-38498][STREAM] Support customized StreamingListener by configuration

2022-04-03 Thread GitBox

AngersZhuuuu commented on pull request #35799: URL: https://github.com/apache/spark/pull/35799#issuecomment-1086820418 @dongjoon-hyun Build failed but seems not related to this pr

[GitHub] [spark] AngersZhuuuu commented on pull request #35799: [SPARK-38498][STREAM] Support customized StreamingListener by configuration

2022-04-03 Thread GitBox

AngersZhuuuu commented on pull request #35799: URL: https://github.com/apache/spark/pull/35799#issuecomment-1086800839 > I meant this, [#35799 (comment)](https://github.com/apache/spark/pull/35799#discussion_r841166511) . :) > > > Didn't got your point about revise the code for Apac

[GitHub] [spark] peter-toth commented on pull request #35382: [SPARK-28090][SQL] Improve `replaceAliasButKeepName` performance

2022-04-03 Thread GitBox

peter-toth commented on pull request #35382: URL: https://github.com/apache/spark/pull/35382#issuecomment-1086800126 Thanks @cloud-fan, @dongjoon-hyun for the review!

[GitHub] [spark] kianelbo closed pull request #35977: [SPARK-38660][PYTHON] PySpark DeprecationWarning: distutils Version classes are deprecated

2022-04-03 Thread GitBox

kianelbo closed pull request #35977: URL: https://github.com/apache/spark/pull/35977

[GitHub] [spark] dongjoon-hyun commented on pull request #35799: [SPARK-38498][STREAM] Support customized StreamingListener by configuration

2022-04-03 Thread GitBox

dongjoon-hyun commented on pull request #35799: URL: https://github.com/apache/spark/pull/35799#issuecomment-1086799205 I meant this, https://github.com/apache/spark/pull/35799#discussion_r841166511 . :) > Didn't got your point about revise the code for Apache Spark 3.4.

[GitHub] [spark] dongjoon-hyun commented on pull request #34970: [DO NOT MERGE] investigate test failures if we test ANSI mode in github actions

2022-04-03 Thread GitBox

dongjoon-hyun commented on pull request #34970: URL: https://github.com/apache/spark/pull/34970#issuecomment-1086798884 Shall we close this if all tests are completed, @gengliangwang ?

[GitHub] [spark] dongjoon-hyun commented on pull request #35290: [SPARK-37865][SQL][3.0]Fix union bug when the first child of union has duplicate columns

2022-04-03 Thread GitBox

dongjoon-hyun commented on pull request #35290: URL: https://github.com/apache/spark/pull/35290#issuecomment-1086798303 Hi, @chasingegg . Thank you for making a PR. However, Apache Spark 3.0.0 was released on June 18, 2020. According to [Apache Spark Versioning Policy](https://spark.apache

[GitHub] [spark] dongjoon-hyun closed pull request #35443: [MINOR][CORE] Change the log level to WARN for the message which is shown in case users attemp to add a JAR twice

2022-04-03 Thread GitBox

dongjoon-hyun closed pull request #35443: URL: https://github.com/apache/spark/pull/35443 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-

[GitHub] [spark] dongjoon-hyun closed pull request #35382: [SPARK-28090][SQL] Improve `replaceAliasButKeepName` performance

2022-04-03 Thread GitBox

dongjoon-hyun closed pull request #35382: URL: https://github.com/apache/spark/pull/35382 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-

[GitHub] [spark] AngersZhuuuu commented on pull request #35799: [SPARK-38498][STREAM] Support customized StreamingListener by configuration

2022-04-03 Thread GitBox

AngersZhuuuu commented on pull request #35799: URL: https://github.com/apache/spark/pull/35799#issuecomment-1086796545 > In general, +1 for the requirement and idea, @AngersZhuuuu . Shall we revise the code for Apache Spark 3.4? Didn't got your point about `revise the code for Apache

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #35799: [SPARK-38498][STREAM] Support customized StreamingListener by configuration

2022-04-03 Thread GitBox

AngersZhuuuu commented on a change in pull request #35799: URL: https://github.com/apache/spark/pull/35799#discussion_r841178986 ## File path: streaming/src/test/scala/org/apache/spark/streaming/StreamingListenerSuite.scala ## @@ -386,3 +400,14 @@ class StreamingContextStoppin

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #36011: [SPARK-38697][SQL] Extend SparkSessionExtensions to inject rules into AQE Optimizer

2022-04-03 Thread GitBox

dongjoon-hyun commented on a change in pull request #36011: URL: https://github.com/apache/spark/pull/36011#discussion_r841176375 ## File path: sql/core/src/main/scala/org/apache/spark/sql/SparkSessionExtensions.scala ## @@ -294,4 +295,23 @@ class SparkSessionExtensions { d

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #36011: [SPARK-38697][SQL] Extend SparkSessionExtensions to inject rules into AQE Optimizer

2022-04-03 Thread GitBox

dongjoon-hyun commented on a change in pull request #36011: URL: https://github.com/apache/spark/pull/36011#discussion_r841174227 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveRulesHolder.scala ## @@ -0,0 +1,30 @@ +/* + * Licensed to the

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #36011: [SPARK-38697][SQL] Extend SparkSessionExtensions to inject rules into AQE Optimizer

2022-04-03 Thread GitBox

dongjoon-hyun commented on a change in pull request #36011: URL: https://github.com/apache/spark/pull/36011#discussion_r841174107 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ## @@ -83,7 +83,7 @@ case class AdaptiveS

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #36011: [SPARK-38697][SQL] Extend SparkSessionExtensions to inject rules into AQE Optimizer

2022-04-03 Thread GitBox

dongjoon-hyun commented on a change in pull request #36011: URL: https://github.com/apache/spark/pull/36011#discussion_r841173889 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala ## @@ -28,7 +29,9 @@ import org.apache.spark.util.

[GitHub] [spark] dongjoon-hyun commented on pull request #36051: [SPARK-38776][MLLIB][TESTS] Disable ANSI_ENABLED explicitly in 'ALS validate input dataset' test case

2022-04-03 Thread GitBox

dongjoon-hyun commented on pull request #36051: URL: https://github.com/apache/spark/pull/36051#issuecomment-1086791099 cc @gengliangwang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe
