[jira] [Commented] (SPARK-41506) Refactor LiteralExpression to support DataType

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646976#comment-17646976 ] Apache Spark commented on SPARK-41506: -- User 'zhengruifeng' has created a pull requ

[jira] [Comment Edited] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646971#comment-17646971 ] wuyi edited comment on SPARK-41497 at 12/14/22 7:31 AM: I'm thin

[jira] [Commented] (SPARK-41510) Support easy way for user defined PYTHONPATH in workers

2022-12-13 Thread Ohad Raviv (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646973#comment-17646973 ] Ohad Raviv commented on SPARK-41510: ok.. after diving into the code I think I found

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread Mridul Muralidharan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646972#comment-17646972 ] Mridul Muralidharan commented on SPARK-41497: - [~Ngone51] Agree, that is wha

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646971#comment-17646971 ] wuyi commented on SPARK-41497: -- I'm thinking if we could improve the improved Option 4 by c

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646969#comment-17646969 ] wuyi commented on SPARK-41497: -- > do we have a way to do that? [~mridulm80] Currently

[jira] [Comment Edited] (SPARK-38719) Test the error class: CANNOT_CAST_DATATYPE

2022-12-13 Thread Jayadeep Jayaraman (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646963#comment-17646963 ] Jayadeep Jayaraman edited comment on SPARK-38719 at 12/14/22 6:56 AM:

[jira] [Comment Edited] (SPARK-38719) Test the error class: CANNOT_CAST_DATATYPE

2022-12-13 Thread Jayadeep Jayaraman (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646963#comment-17646963 ] Jayadeep Jayaraman edited comment on SPARK-38719 at 12/14/22 6:55 AM:

[jira] [Commented] (SPARK-38719) Test the error class: CANNOT_CAST_DATATYPE

2022-12-13 Thread Jayadeep Jayaraman (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646963#comment-17646963 ] Jayadeep Jayaraman commented on SPARK-38719: I tried creating the failure as

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread Mridul Muralidharan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646962#comment-17646962 ] Mridul Muralidharan commented on SPARK-41497: - Agree, if we can determine th

[jira] [Commented] (SPARK-41506) Refactor LiteralExpression to support DataType

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646960#comment-17646960 ] Apache Spark commented on SPARK-41506: -- User 'dengziming' has created a pull reques

[jira] [Updated] (SPARK-41515) PVC-oriented executor pod allocation

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-41515: -- Labels: releasenotes (was: ) > PVC-oriented executor pod allocation > ---

[jira] [Updated] (SPARK-39324) Log ExecutorDecommission as INFO level in TaskSchedulerImpl

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-39324: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Improvement) > Log ExecutorDecommissio

[jira] [Updated] (SPARK-39450) Reuse PVCs by default

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-39450: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Improvement) > Reuse PVCs by default >

[jira] [Updated] (SPARK-39688) getReusablePVCs should handle accounts with no PVC permission

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-39688: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Bug) > getReusablePVCs should handle a

[jira] [Updated] (SPARK-39898) Upgrade kubernetes-client to 5.12.3

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-39898: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Bug) > Upgrade kubernetes-client to 5.

[jira] [Updated] (SPARK-39846) Enable spark.dynamicAllocation.shuffleTracking.enabled by default

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-39846: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Improvement) > Enable spark.dynamicAll

[jira] [Commented] (SPARK-38719) Test the error class: CANNOT_CAST_DATATYPE

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646942#comment-17646942 ] Max Gekk commented on SPARK-38719: -- [~jjayadeep] Sure, go ahead. > Test the error clas

[jira] [Updated] (SPARK-39965) Skip PVC cleanup when driver doesn't own PVCs

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-39965: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Bug) > Skip PVC cleanup when driver do

[jira] [Updated] (SPARK-40198) Enable spark.storage.decommission.(rdd|shuffle)Blocks.enabled by default

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-40198: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Improvement) > Enable spark.storage.de

[jira] [Updated] (SPARK-40304) Add decomTestTag to K8s Integration Test

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-40304: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Test) > Add decomTestTag to K8s Integr

[jira] [Updated] (SPARK-40459) recoverDiskStore should not stop by existing recomputed files

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-40459: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Bug) > recoverDiskStore should not sto

[jira] [Updated] (SPARK-41388) getReusablePVCs should ignore recently created PVCs in the previous batch

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-41388: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Bug) > getReusablePVCs should ignore r

[jira] [Assigned] (SPARK-41514) Add `PVC-oriented executor pod allocation` section and revise config name

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-41514: - Assignee: Dongjoon Hyun > Add `PVC-oriented executor pod allocation` section and revise

[jira] [Updated] (SPARK-41410) Support PVC-oriented executor pod allocation

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-41410: -- Parent: SPARK-41515 Issue Type: Sub-task (was: New Feature) > Support PVC-oriented ex

[jira] [Updated] (SPARK-41514) Add `PVC-oriented executor pod allocation` section and revise config name

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-41514: -- Parent: SPARK-41515 Issue Type: Sub-task (was: Documentation) > Add `PVC-oriented exe

[jira] [Created] (SPARK-41515) PVC-oriented executor pod allocation

2022-12-13 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-41515: - Summary: PVC-oriented executor pod allocation Key: SPARK-41515 URL: https://issues.apache.org/jira/browse/SPARK-41515 Project: Spark Issue Type: New Featur

[jira] [Commented] (SPARK-38719) Test the error class: CANNOT_CAST_DATATYPE

2022-12-13 Thread Jayadeep Jayaraman (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646938#comment-17646938 ] Jayadeep Jayaraman commented on SPARK-38719: Hi [~maxgekk] - I would like to

[jira] [Assigned] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-41248: Assignee: Ivan Sadikov > Add config flag to control before of JSON partial results parsing in >

[jira] [Resolved] (SPARK-41248) Add config flag to control before of JSON partial results parsing in SPARK-40646

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-41248. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38784 [https://github.com

[jira] [Assigned] (SPARK-41514) Add `PVC-oriented executor pod allocation` section and revise config name

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41514: Assignee: Apache Spark > Add `PVC-oriented executor pod allocation` section and revise co

[jira] [Assigned] (SPARK-41514) Add `PVC-oriented executor pod allocation` section and revise config name

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41514: Assignee: (was: Apache Spark) > Add `PVC-oriented executor pod allocation` section an

[jira] [Commented] (SPARK-41514) Add `PVC-oriented executor pod allocation` section and revise config name

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646931#comment-17646931 ] Apache Spark commented on SPARK-41514: -- User 'dongjoon-hyun' has created a pull req

[jira] [Created] (SPARK-41514) Add `PVC-oriented executor pod allocation` section and revise config name

2022-12-13 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-41514: - Summary: Add `PVC-oriented executor pod allocation` section and revise config name Key: SPARK-41514 URL: https://issues.apache.org/jira/browse/SPARK-41514 Project:

[jira] [Assigned] (SPARK-41409) Reuse `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1043`

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-41409: Assignee: Yang Jie > Reuse `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1043` > --

[jira] [Resolved] (SPARK-41409) Reuse `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1043`

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-41409. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38940 [https://github.com

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646919#comment-17646919 ] wuyi commented on SPARK-41497: -- [~mridulm80] For b) and c), shouldn't we allow T2 to use t

[jira] [Updated] (SPARK-41512) Row count based shuffle read to optimize global limit after a single partition shuffle (optionally with input partition sorted)

2022-12-13 Thread Rui Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-41512: - Description: h3. Problem Statement In current Spark optimizer, a single partition shuffle might be crea

[jira] [Assigned] (SPARK-41513) Implement a Accumulator to collect per mapper row count metrics

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41513: Assignee: Rui Wang (was: Apache Spark) > Implement a Accumulator to collect per mapper r

[jira] [Assigned] (SPARK-41513) Implement a Accumulator to collect per mapper row count metrics

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41513: Assignee: Apache Spark (was: Rui Wang) > Implement a Accumulator to collect per mapper r

[jira] [Commented] (SPARK-41513) Implement a Accumulator to collect per mapper row count metrics

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646901#comment-17646901 ] Apache Spark commented on SPARK-41513: -- User 'amaliujia' has created a pull request

[jira] [Updated] (SPARK-41512) Row count based shuffle read to optimize global limit after a single partition shuffle (optionally with input partition sorted)

2022-12-13 Thread Rui Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-41512: - Description: h3. Problem Statement In current Spark optimizer, a single partition shuffle might be crea

[jira] [Updated] (SPARK-41512) Row count based shuffle read to optimize global limit after a single partition shuffle (optionally with input partition sorted)

2022-12-13 Thread Rui Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated SPARK-41512: - Description: h3. Problem Statement In current Spark optimizer, a single partition shuffle might be crea

[jira] [Created] (SPARK-41513) Implement a Accumulator to collect per mapper row count metrics

2022-12-13 Thread Rui Wang (Jira)
Rui Wang created SPARK-41513: Summary: Implement a Accumulator to collect per mapper row count metrics Key: SPARK-41513 URL: https://issues.apache.org/jira/browse/SPARK-41513 Project: Spark Issu

[jira] [Commented] (SPARK-41512) Row count based shuffle read to optimize global limit after a single partition shuffle (optionally with input partition sorted)

2022-12-13 Thread Rui Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646898#comment-17646898 ] Rui Wang commented on SPARK-41512: -- cc [~cloud_fan] > Row count based shuffle read to

[jira] [Created] (SPARK-41512) Row count based shuffle read to optimize global limit after a single partition shuffle (optionally with input partition sorted)

2022-12-13 Thread Rui Wang (Jira)
Rui Wang created SPARK-41512: Summary: Row count based shuffle read to optimize global limit after a single partition shuffle (optionally with input partition sorted) Key: SPARK-41512 URL: https://issues.apache.org/ji

[jira] [Commented] (SPARK-41506) Refactor LiteralExpression to support DataType

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646896#comment-17646896 ] Apache Spark commented on SPARK-41506: -- User 'zhengruifeng' has created a pull requ

[jira] [Commented] (SPARK-41506) Refactor LiteralExpression to support DataType

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646895#comment-17646895 ] Apache Spark commented on SPARK-41506: -- User 'zhengruifeng' has created a pull requ

[jira] [Commented] (SPARK-41506) Refactor LiteralExpression to support DataType

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646890#comment-17646890 ] Apache Spark commented on SPARK-41506: -- User 'HyukjinKwon' has created a pull reque

[jira] [Commented] (SPARK-41506) Refactor LiteralExpression to support DataType

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646889#comment-17646889 ] Apache Spark commented on SPARK-41506: -- User 'HyukjinKwon' has created a pull reque

[jira] [Assigned] (SPARK-41506) Refactor LiteralExpression to support DataType

2022-12-13 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-41506: Assignee: Ruifeng Zheng > Refactor LiteralExpression to support DataType > --

[jira] [Resolved] (SPARK-41506) Refactor LiteralExpression to support DataType

2022-12-13 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-41506. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 39047 [https://gi

[jira] [Comment Edited] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread Mridul Muralidharan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646825#comment-17646825 ] Mridul Muralidharan edited comment on SPARK-41497 at 12/13/22 9:20 PM: ---

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread Mridul Muralidharan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646825#comment-17646825 ] Mridul Muralidharan commented on SPARK-41497: - > For example, a task is cons

[jira] [Commented] (SPARK-27561) Support "lateral column alias references" to allow column aliases to be used within SELECT clauses

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646823#comment-17646823 ] Apache Spark commented on SPARK-27561: -- User 'gengliangwang' has created a pull req

[jira] [Commented] (SPARK-27561) Support "lateral column alias references" to allow column aliases to be used within SELECT clauses

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646824#comment-17646824 ] Apache Spark commented on SPARK-27561: -- User 'gengliangwang' has created a pull req

[jira] [Assigned] (SPARK-41062) Rename UNSUPPORTED_CORRELATED_REFERENCE to CORRELATED_REFERENCE

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-41062: Assignee: Haejoon Lee > Rename UNSUPPORTED_CORRELATED_REFERENCE to CORRELATED_REFERENCE > ---

[jira] [Resolved] (SPARK-41062) Rename UNSUPPORTED_CORRELATED_REFERENCE to CORRELATED_REFERENCE

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-41062. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38576 [https://github.com

[jira] [Resolved] (SPARK-41482) Upgrade dropwizard metrics 4.2.13

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-41482. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 39026 [https://

[jira] [Assigned] (SPARK-41482) Upgrade dropwizard metrics 4.2.13

2022-12-13 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-41482: - Assignee: Yang Jie > Upgrade dropwizard metrics 4.2.13 > --

[jira] [Commented] (SPARK-41510) Support easy way for user defined PYTHONPATH in workers

2022-12-13 Thread Ohad Raviv (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646734#comment-17646734 ] Ohad Raviv commented on SPARK-41510: the conda solution is more for a "static" packa

[jira] [Assigned] (SPARK-27561) Support "lateral column alias references" to allow column aliases to be used within SELECT clauses

2022-12-13 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-27561: --- Assignee: Xinyi Yu > Support "lateral column alias references" to allow column aliases to b

[jira] [Resolved] (SPARK-27561) Support "lateral column alias references" to allow column aliases to be used within SELECT clauses

2022-12-13 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-27561. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38776 [https://gith

[jira] [Commented] (SPARK-39601) AllocationFailure should not be treated as exitCausedByApp when driver is shutting down

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646682#comment-17646682 ] Apache Spark commented on SPARK-39601: -- User 'pan3793' has created a pull request f

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread huangtengfei (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646658#comment-17646658 ] huangtengfei commented on SPARK-41497: -- I also think that option3/4(include the imp

[jira] [Resolved] (SPARK-39601) AllocationFailure should not be treated as exitCausedByApp when driver is shutting down

2022-12-13 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-39601. --- Fix Version/s: 3.4.0 Assignee: Cheng Pan Resolution: Fixed > AllocationFailu

[jira] [Resolved] (SPARK-41478) Assign a name to the error class _LEGACY_ERROR_TEMP_1234

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-41478. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 39018 [https://github.com

[jira] [Assigned] (SPARK-41478) Assign a name to the error class _LEGACY_ERROR_TEMP_1234

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-41478: Assignee: BingKun Pan > Assign a name to the error class _LEGACY_ERROR_TEMP_1234 > --

[jira] [Commented] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646639#comment-17646639 ] wuyi commented on SPARK-41497: -- [~mridulm80] Sounds like a better idea than option 4. But I

[jira] [Updated] (SPARK-41497) Accumulator undercounting in the case of retry task with rdd cache

2022-12-13 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi updated SPARK-41497: - Description: Accumulator could be undercounted when the retried task has rdd cache. See the example below and

[jira] [Commented] (SPARK-41510) Support easy way for user defined PYTHONPATH in workers

2022-12-13 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646620#comment-17646620 ] Hyukjin Kwon commented on SPARK-41510: -- What about using Conda ([https://www.datab

[jira] [Comment Edited] (SPARK-41510) Support easy way for user defined PYTHONPATH in workers

2022-12-13 Thread Ohad Raviv (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646606#comment-17646606 ] Ohad Raviv edited comment on SPARK-41510 at 12/13/22 12:23 PM: ---

[jira] [Commented] (SPARK-41510) Support easy way for user defined PYTHONPATH in workers

2022-12-13 Thread Ohad Raviv (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646606#comment-17646606 ] Ohad Raviv commented on SPARK-41510: [~hvanhovell] - can you please refer that to so

[jira] [Commented] (SPARK-41360) Avoid BlockManager re-registration if the executor has been lost

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646605#comment-17646605 ] Apache Spark commented on SPARK-41360: -- User 'HyukjinKwon' has created a pull reque

[jira] [Commented] (SPARK-41360) Avoid BlockManager re-registration if the executor has been lost

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646603#comment-17646603 ] Apache Spark commented on SPARK-41360: -- User 'HyukjinKwon' has created a pull reque

[jira] [Assigned] (SPARK-41511) LongToUnsafeRowMap support ignoresDuplicatedKey

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41511: Assignee: (was: Apache Spark) > LongToUnsafeRowMap support ignoresDuplicatedKey > ---

[jira] [Updated] (SPARK-41510) Support easy way for user defined PYTHONPATH in workers

2022-12-13 Thread Ohad Raviv (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ohad Raviv updated SPARK-41510: --- Description: When working interactively with Spark through notebooks in various envs - Databricks/Y

[jira] [Commented] (SPARK-41511) LongToUnsafeRowMap support ignoresDuplicatedKey

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646600#comment-17646600 ] Apache Spark commented on SPARK-41511: -- User 'ulysses-you' has created a pull reque

[jira] [Assigned] (SPARK-41511) LongToUnsafeRowMap support ignoresDuplicatedKey

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41511: Assignee: Apache Spark > LongToUnsafeRowMap support ignoresDuplicatedKey > --

[jira] [Created] (SPARK-41510) Support easy way for user defined PYTHONPATH in workers

2022-12-13 Thread Ohad Raviv (Jira)
Ohad Raviv created SPARK-41510: -- Summary: Support easy way for user defined PYTHONPATH in workers Key: SPARK-41510 URL: https://issues.apache.org/jira/browse/SPARK-41510 Project: Spark Issue Typ

[jira] [Created] (SPARK-41511) LongToUnsafeRowMap support ignoresDuplicatedKey

2022-12-13 Thread XiDuo You (Jira)
XiDuo You created SPARK-41511: - Summary: LongToUnsafeRowMap support ignoresDuplicatedKey Key: SPARK-41511 URL: https://issues.apache.org/jira/browse/SPARK-41511 Project: Spark Issue Type: Improve

[jira] [Commented] (SPARK-41412) Implement `Cast`

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646594#comment-17646594 ] Apache Spark commented on SPARK-41412: -- User 'HyukjinKwon' has created a pull reque

[jira] [Commented] (SPARK-41412) Implement `Cast`

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646593#comment-17646593 ] Apache Spark commented on SPARK-41412: -- User 'HyukjinKwon' has created a pull reque

[jira] [Updated] (SPARK-41509) Delay execution hash until after aggregation for semi-join runtime filter.

2022-12-13 Thread jiaan.geng (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiaan.geng updated SPARK-41509: --- Description: Currently, Spark runtime filter supports bloom filter and in subquery filter. The in su

[jira] [Commented] (SPARK-41509) Delay execution hash until after aggregation for semi-join runtime filter.

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646581#comment-17646581 ] Apache Spark commented on SPARK-41509: -- User 'beliefer' has created a pull request

[jira] [Assigned] (SPARK-41509) Delay execution hash until after aggregation for semi-join runtime filter.

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41509: Assignee: (was: Apache Spark) > Delay execution hash until after aggregation for semi

[jira] [Assigned] (SPARK-41509) Delay execution hash until after aggregation for semi-join runtime filter.

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41509: Assignee: Apache Spark > Delay execution hash until after aggregation for semi-join runti

[jira] [Commented] (SPARK-41509) Delay execution hash until after aggregation for semi-join runtime filter.

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646579#comment-17646579 ] Apache Spark commented on SPARK-41509: -- User 'beliefer' has created a pull request

[jira] [Created] (SPARK-41509) Delay execution hash until after aggregation for semi-join runtime filter.

2022-12-13 Thread jiaan.geng (Jira)
jiaan.geng created SPARK-41509: -- Summary: Delay execution hash until after aggregation for semi-join runtime filter. Key: SPARK-41509 URL: https://issues.apache.org/jira/browse/SPARK-41509 Project: Spark

[jira] [Commented] (SPARK-41319) when-otherwise support

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646575#comment-17646575 ] Apache Spark commented on SPARK-41319: -- User 'zhengruifeng' has created a pull requ

[jira] [Commented] (SPARK-41319) when-otherwise support

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646574#comment-17646574 ] Apache Spark commented on SPARK-41319: -- User 'zhengruifeng' has created a pull requ

[jira] [Created] (SPARK-41508) assign name to _LEGACY_ERROR_TEMP_1179 and unwrap the existing SparkThrowable

2022-12-13 Thread Yang Jie (Jira)
Yang Jie created SPARK-41508: Summary: assign name to _LEGACY_ERROR_TEMP_1179 and unwrap the existing SparkThrowable Key: SPARK-41508 URL: https://issues.apache.org/jira/browse/SPARK-41508 Project: Spark

[jira] [Updated] (SPARK-41508) Assign name to _LEGACY_ERROR_TEMP_1179 and unwrap the existing SparkThrowable

2022-12-13 Thread Yang Jie (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-41508: - Summary: Assign name to _LEGACY_ERROR_TEMP_1179 and unwrap the existing SparkThrowable (was: assign n

[jira] [Assigned] (SPARK-41406) Refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-41406: Assignee: BingKun Pan > Refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic

[jira] [Resolved] (SPARK-41406) Refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic

2022-12-13 Thread Max Gekk (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-41406. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 38937 [https://github.com

[jira] [Commented] (SPARK-41507) Correct group of collection_funcs

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646509#comment-17646509 ] Apache Spark commented on SPARK-41507: -- User 'LuciferYang' has created a pull reque

[jira] [Commented] (SPARK-41424) Protobuf serializer for TaskDataWrapper

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646508#comment-17646508 ] Apache Spark commented on SPARK-41424: -- User 'gengliangwang' has created a pull req

[jira] [Assigned] (SPARK-41507) Correct group of collection_funcs

2022-12-13 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-41507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-41507: Assignee: (was: Apache Spark) > Correct group of collection_funcs > -
