[jira] [Created] (SPARK-48150) Fix nullability of try_parse_json

2024-05-06 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-48150: -- Summary: Fix nullability of try_parse_json Key: SPARK-48150 URL: https://issues.apache.org/jira/browse/SPARK-48150 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-48128) BitwiseCount / bit_count generated code for boolean inputs fails to compile

2024-05-03 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-48128: -- Summary: BitwiseCount / bit_count generated code for boolean inputs fails to compile Key: SPARK-48128 URL: https://issues.apache.org/jira/browse/SPARK-48128 Project:

[jira] [Updated] (SPARK-48128) BitwiseCount / bit_count generated code for boolean inputs fails to compile

2024-05-03 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-48128: --- Description: If the `BitwiseCount` / `bit_count` expression is applied to a boolean type column

[jira] [Updated] (SPARK-48081) Fix ClassCastException in NTile.checkInputDataTypes() when argument is non-foldable or of wrong type

2024-05-02 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-48081: --- Summary: Fix ClassCastException in NTile.checkInputDataTypes() when argument is non-foldable or of

[jira] [Updated] (SPARK-48081) Fix ClassCastException in NTile.checkInputDataTypes() when input data type is mismatched or non-foldable

2024-05-02 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-48081: --- Summary: Fix ClassCastException in NTile.checkInputDataTypes() when input data type is mismatched

[jira] [Updated] (SPARK-48081) Fix ClassCastException in NTile.checkInputDataTypes() when data type is mismatched

2024-05-02 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-48081: --- Summary: Fix ClassCastException in NTile.checkInputDataTypes() when data type is mismatched (was:

[jira] [Created] (SPARK-48081) Fix ClassCastException in NTile.checkInputDataTypes()

2024-05-02 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-48081: -- Summary: Fix ClassCastException in NTile.checkInputDataTypes() Key: SPARK-48081 URL: https://issues.apache.org/jira/browse/SPARK-48081 Project: Spark Issue

[jira] [Created] (SPARK-47734) Fix flaky pyspark.sql.dataframe.DataFrame.writeStream doctest by stopping streaming query

2024-04-04 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-47734: -- Summary: Fix flaky pyspark.sql.dataframe.DataFrame.writeStream doctest by stopping streaming query Key: SPARK-47734 URL: https://issues.apache.org/jira/browse/SPARK-47734

[jira] [Resolved] (SPARK-47068) Recover -1 and 0 case for spark.sql.execution.arrow.maxRecordsPerBatch

2024-04-01 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-47068. Resolution: Fixed Marking this issue as fixed. > Recover -1 and 0 case for

[jira] [Comment Edited] (SPARK-46251) Spark 3.3.3 tuple encoders built using Encoders.tuple do not correctly cast null into None for Option values

2024-03-13 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826945#comment-17826945 ] Josh Rosen edited comment on SPARK-46251 at 3/14/24 5:34 AM: - FYI, this

[jira] [Commented] (SPARK-46251) Spark 3.3.3 tuple encoders built using Encoders.tuple do not correctly cast null into None for Option values

2024-03-13 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17826945#comment-17826945 ] Josh Rosen commented on SPARK-46251: FYI, this looks like a duplicate of

[jira] [Created] (SPARK-47121) Avoid noisy RejectedExecutionExceptions during StandaloneSchedulerBackend shutdown

2024-02-21 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-47121: -- Summary: Avoid noisy RejectedExecutionExceptions during StandaloneSchedulerBackend shutdown Key: SPARK-47121 URL: https://issues.apache.org/jira/browse/SPARK-47121

[jira] [Updated] (SPARK-46862) Incorrect count() of a dataframe loaded from CSV datasource

2024-01-31 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-46862: --- Labels: correctness pull-request-available (was: pull-request-available) > Incorrect count() of a

[jira] [Commented] (SPARK-46365) Spark 3.5.0 Regression: Window Function Combination Yields Null Values

2023-12-13 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796054#comment-17796054 ] Josh Rosen commented on SPARK-46365: I think that this is a duplicate of SPARK-45543, which is fixed

[jira] [Commented] (SPARK-46365) Spark 3.5.0 Regression: Window Function Combination Yields Null Values

2023-12-12 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795932#comment-17795932 ] Josh Rosen commented on SPARK-46365: This issue appears to be fixed in

[jira] [Commented] (SPARK-46125) Memory leak when using createDataFrame with persist

2023-11-28 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790751#comment-17790751 ] Josh Rosen commented on SPARK-46125: I think that this issue relates specifically to

[jira] [Updated] (SPARK-46125) Memory leak when using createDataFrame with persist

2023-11-28 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-46125: --- Attachment: image-2023-11-28-12-55-58-461.png > Memory leak when using createDataFrame with persist

[jira] [Commented] (SPARK-46105) df.emptyDataFrame shows 1 if we repartition(1) in Spark 3.3.x and above

2023-11-28 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790742#comment-17790742 ] Josh Rosen commented on SPARK-46105: {quote}The reason for raising this as a bug is I have a

[jira] [Updated] (SPARK-44641) SPJ: Results duplicated when SPJ partial-cluster and pushdown enabled but conditions unmet

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-44641: --- Labels: correctness (was: ) > SPJ: Results duplicated when SPJ partial-cluster and pushdown

[jira] [Updated] (SPARK-42134) Fix getPartitionFiltersAndDataFilters() to handle filters without referenced attributes

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-42134: --- Labels: correctness (was: ) > Fix getPartitionFiltersAndDataFilters() to handle filters without

[jira] [Updated] (SPARK-43760) Incorrect attribute nullability after RewriteCorrelatedScalarSubquery leads to incorrect query results

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-43760: --- Labels: correctness (was: ) > Incorrect attribute nullability after

[jira] [Updated] (SPARK-44448) Wrong results for dense_rank() <= k from InferWindowGroupLimit and DenseRankLimitIterator

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-44448: --- Labels: correctness (was: ) > Wrong results for dense_rank() <= k from InferWindowGroupLimit and

[jira] [Updated] (SPARK-45920) group by ordinal should be idempotent

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45920: --- Labels: correctness pull-request-available (was: pull-request-available) > group by ordinal should

[jira] [Updated] (SPARK-45507) Correctness bug in correlated scalar subqueries with COUNT aggregates

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45507: --- Labels: correctness pull-request-available (was: pull-request-available) > Correctness bug in

[jira] [Updated] (SPARK-46092) Overflow in Parquet row group filter creation causes incorrect results

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-46092: --- Labels: correctness pull-request-available (was: pull-request-available) > Overflow in Parquet row

[jira] [Updated] (SPARK-45386) Correctness issue when persisting using StorageLevel.NONE

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45386: --- Labels: correctness pull-request-available (was: pull-request-available) > Correctness issue when

[jira] [Updated] (SPARK-44871) Fix PERCENTILE_DISC behaviour

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-44871: --- Labels: correctness (was: ) > Fix PERCENTILE_DISC behaviour > - > >

[jira] [Updated] (SPARK-43393) Sequence expression can overflow

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-43393: --- Labels: correctness pull-request-available (was: pull-request-available) > Sequence expression can

[jira] [Updated] (SPARK-43240) df.describe() method may return wrong result if the last RDD is RDD[UnsafeRow]

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-43240: --- Labels: correctness (was: ) > df.describe() method may return wrong result if the last RDD is >

[jira] [Updated] (SPARK-43098) Should not handle the COUNT bug when the GROUP BY clause of a correlated scalar subquery is non-empty

2023-11-24 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-43098: --- Labels: correctness (was: ) > Should not handle the COUNT bug when the GROUP BY clause of a

[jira] [Updated] (SPARK-45568) WholeStageCodegenSparkSubmitSuite flakiness

2023-11-23 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45568: --- Component/s: Tests > WholeStageCodegenSparkSubmitSuite flakiness >

[jira] [Updated] (SPARK-45751) The default value of 'spark.executor.logs.rolling.maxRetainedFiles' on the official website is incorrect

2023-11-23 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45751: --- Component/s: Documentation > The default value of 'spark.executor.logs.rolling.maxRetainedFiles' on

[jira] [Updated] (SPARK-45791) Rename `SparkConnectSessionHodlerSuite.scala` to `SparkConnectSessionHolderSuite.scala`

2023-11-23 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45791: --- Component/s: Tests > Rename `SparkConnectSessionHodlerSuite.scala` to >

[jira] [Updated] (SPARK-46037) When Left Join build Left, ShuffledHashJoinExec may result in incorrect results

2023-11-21 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-46037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-46037: --- Labels: correctness pull-request-available (was: pull-request-available) > When Left Join build

[jira] [Reopened] (SPARK-4836) Web UI should display separate information for all stage attempts

2023-11-02 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reopened SPARK-4836: --- > Web UI should display separate information for all stage attempts >

[jira] [Updated] (SPARK-45508) Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 9+

2023-10-11 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45508: --- Description: We need to add `--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED` to our JVM options

[jira] [Updated] (SPARK-45508) Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 9+

2023-10-11 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45508: --- Summary: Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on

[jira] [Updated] (SPARK-45508) Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 11+

2023-10-11 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45508: --- Summary: Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on

[jira] [Updated] (SPARK-45508) Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 11+

2023-10-11 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45508: --- Description: We need to update the    ``` val f =

[jira] [Updated] (SPARK-45508) Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on Java 9+

2023-10-11 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45508: --- Summary: Add "--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED" so Platform can access cleaner on

[jira] [Updated] (SPARK-45508) org.apache.spark.unsafe.Platform uses wrong cleaner class name in JDK 9.b110+

2023-10-11 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45508: --- Description: In JDK >= 9.b110, the code at

[jira] [Updated] (SPARK-45508) org.apache.spark.unsafe.Platform uses wrong cleaner class name in JDK 9.b110+

2023-10-11 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-45508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-45508: --- Description: In JDK >= 9.b110, the code at

[jira] [Created] (SPARK-45508) org.apache.spark.unsafe.Platform uses wrong cleaner class name in JDK 11+

2023-10-11 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-45508: -- Summary: org.apache.spark.unsafe.Platform uses wrong cleaner class name in JDK 11+ Key: SPARK-45508 URL: https://issues.apache.org/jira/browse/SPARK-45508 Project: Spark

[jira] [Resolved] (SPARK-42205) Remove logging of Accumulables in Task/Stage start events in JsonProtocol

2023-09-29 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-42205. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 39767

[jira] [Assigned] (SPARK-44920) Use await() instead of awaitUninterruptibly() in TransportClientFactory.createClient()

2023-08-22 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-44920: -- Assignee: Josh Rosen > Use await() instead of awaitUninterruptibly() in >

[jira] [Created] (SPARK-44920) Use await() instead of awaitUninterruptibly() in TransportClientFactory.createClient()

2023-08-22 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-44920: -- Summary: Use await() instead of awaitUninterruptibly() in TransportClientFactory.createClient() Key: SPARK-44920 URL: https://issues.apache.org/jira/browse/SPARK-44920

[jira] [Resolved] (SPARK-44818) Fix race for pending interrupt issued before taskThread is initialized

2023-08-21 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-44818. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 42504

[jira] [Assigned] (SPARK-44818) Fix race for pending interrupt issued before taskThread is initialized

2023-08-21 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-44818: -- Assignee: Anish Shrigondekar > Fix race for pending interrupt issued before taskThread is

[jira] [Resolved] (SPARK-43300) Cascade failure in Guava cache due to fate-sharing

2023-05-15 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-43300. Fix Version/s: 3.5.0 Resolution: Fixed > Cascade failure in Guava cache due to

[jira] [Commented] (SPARK-43300) Cascade failure in Guava cache due to fate-sharing

2023-05-15 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17722959#comment-17722959 ] Josh Rosen commented on SPARK-43300: Fixed in https://github.com/apache/spark/pull/40982 > Cascade

[jira] [Assigned] (SPARK-43300) Cascade failure in Guava cache due to fate-sharing

2023-05-15 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-43300: -- Assignee: Ziqi Liu > Cascade failure in Guava cache due to fate-sharing >

[jira] [Created] (SPARK-43414) Fix flakiness in Kafka RDD suites due to port binding configuration issue

2023-05-08 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-43414: -- Summary: Fix flakiness in Kafka RDD suites due to port binding configuration issue Key: SPARK-43414 URL: https://issues.apache.org/jira/browse/SPARK-43414 Project: Spark

[jira] [Updated] (SPARK-42754) Spark 3.4 history server's SQL tab incorrectly groups SQL executions when replaying event logs from Spark 3.3 and earlier

2023-03-10 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-42754: --- Description: In Spark 3.4.0 RC4, the Spark History Server's SQL tab incorrectly groups SQL

[jira] [Created] (SPARK-42754) Spark 3.4 history server's SQL tab incorrectly groups SQL executions when replaying event logs from Spark 3.3 and earlier

2023-03-10 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-42754: -- Summary: Spark 3.4 history server's SQL tab incorrectly groups SQL executions when replaying event logs from Spark 3.3 and earlier Key: SPARK-42754 URL:

[jira] [Updated] (SPARK-42754) Spark 3.4 history server's SQL tab incorrectly groups SQL executions when replaying event logs from Spark 3.3 and earlier

2023-03-10 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-42754: --- Attachment: example.png > Spark 3.4 history server's SQL tab incorrectly groups SQL executions when

[jira] [Assigned] (SPARK-42206) Omit "Task Executor Metrics" field in JsonProtocol output if values are all zero

2023-01-26 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-42206: -- Assignee: Josh Rosen > Omit "Task Executor Metrics" field in JsonProtocol output if values

[jira] [Assigned] (SPARK-42205) Remove logging of Accumulables in Task/Stage start events in JsonProtocol

2023-01-26 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-42205: -- Assignee: Josh Rosen > Remove logging of Accumulables in Task/Stage start events in

[jira] [Assigned] (SPARK-42204) Remove redundant logging of TaskMetrics internal accumulators in JsonProtocol event logs

2023-01-26 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-42204: -- Assignee: Josh Rosen > Remove redundant logging of TaskMetrics internal accumulators in

[jira] [Created] (SPARK-42206) Omit "Task Executor Metrics" field in JsonProtocol output if values are all zero

2023-01-26 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-42206: -- Summary: Omit "Task Executor Metrics" field in JsonProtocol output if values are all zero Key: SPARK-42206 URL: https://issues.apache.org/jira/browse/SPARK-42206

[jira] [Created] (SPARK-42205) Remove logging of Accumulables in Task/Stage start events in JsonProtocol

2023-01-26 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-42205: -- Summary: Remove logging of Accumulables in Task/Stage start events in JsonProtocol Key: SPARK-42205 URL: https://issues.apache.org/jira/browse/SPARK-42205 Project: Spark

[jira] [Created] (SPARK-42204) Remove redundant logging of TaskMetrics internal accumulators in JsonProtocol event logs

2023-01-26 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-42204: -- Summary: Remove redundant logging of TaskMetrics internal accumulators in JsonProtocol event logs Key: SPARK-42204 URL: https://issues.apache.org/jira/browse/SPARK-42204

[jira] [Created] (SPARK-42203) JsonProtocol should skip logging of push-based shuffle read metrics when push-based shuffle is disabled

2023-01-26 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-42203: -- Summary: JsonProtocol should skip logging of push-based shuffle read metrics when push-based shuffle is disabled Key: SPARK-42203 URL:

[jira] [Updated] (SPARK-42203) JsonProtocol should skip logging of push-based shuffle read metrics when push-based shuffle is disabled

2023-01-26 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-42203: --- Description: This is a followup to SPARK-36620: When push-based shuffle is disabled (the default),

[jira] [Created] (SPARK-41541) Fix wrong child call in SQLShuffleWriteMetricsReporter.decRecordsWritten()

2022-12-15 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-41541: -- Summary: Fix wrong child call in SQLShuffleWriteMetricsReporter.decRecordsWritten() Key: SPARK-41541 URL: https://issues.apache.org/jira/browse/SPARK-41541 Project:

[jira] [Updated] (SPARK-38542) UnsafeHashedRelation should serialize numKeys out

2022-09-02 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-38542: --- Labels: correctness (was: ) > UnsafeHashedRelation should serialize numKeys out >

[jira] [Assigned] (SPARK-40261) DirectTaskResult meta should not be counted into result size

2022-08-31 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-40261: -- Assignee: Ziqi Liu > DirectTaskResult meta should not be counted into result size >

[jira] [Resolved] (SPARK-40261) DirectTaskResult meta should not be counted into result size

2022-08-31 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-40261. Fix Version/s: 3.4.0 Resolution: Fixed Fixed by https://github.com/apache/spark/pull/37713

[jira] [Resolved] (SPARK-40235) Use interruptible lock instead of synchronized in Executor.updateDependencies()

2022-08-29 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-40235. Fix Version/s: 3.4.0 Resolution: Fixed Fixed by

[jira] [Created] (SPARK-40263) Use interruptible lock instead of synchronized in TransportClientFactory.createClient()

2022-08-29 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-40263: -- Summary: Use interruptible lock instead of synchronized in TransportClientFactory.createClient() Key: SPARK-40263 URL: https://issues.apache.org/jira/browse/SPARK-40263

[jira] [Resolved] (SPARK-40211) Allow executeTake() / collectLimit's number of starting partitions to be customized

2022-08-26 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-40211. Fix Version/s: 3.4.0 Assignee: Ziqi Liu Resolution: Fixed Resolved by

[jira] [Updated] (SPARK-40235) Use interruptible lock instead of synchronized in Executor.updateDependencies()

2022-08-26 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-40235: --- Description: This patch modifies the synchronization in {{Executor.updateDependencies()}} in order

[jira] [Created] (SPARK-40235) Use interruptible lock instead of synchronized in Executor.updateDependencies()

2022-08-26 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-40235: -- Summary: Use interruptible lock instead of synchronized in Executor.updateDependencies() Key: SPARK-40235 URL: https://issues.apache.org/jira/browse/SPARK-40235 Project:

[jira] [Assigned] (SPARK-40106) Task failure handlers should always run if the task failed

2022-08-18 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-40106: -- Assignee: Ryan Johnson > Task failure handlers should always run if the task failed >

[jira] [Resolved] (SPARK-40106) Task failure handlers should always run if the task failed

2022-08-18 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-40106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-40106. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37531

[jira] [Updated] (SPARK-36176) Expose tableExists in pyspark.sql.catalog

2022-08-11 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-36176: --- Fix Version/s: 3.3.0 (was: 3.2.0) > Expose tableExists in

[jira] [Commented] (SPARK-36176) Expose tableExists in pyspark.sql.catalog

2022-08-11 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578592#comment-17578592 ] Josh Rosen commented on SPARK-36176: Changed the Fix Version on JIRA: this landed in 3.3.0, not

[jira] [Resolved] (SPARK-39983) Should not cache unserialized broadcast relations on the driver

2022-08-10 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-39983. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37413

[jira] [Assigned] (SPARK-39983) Should not cache unserialized broadcast relations on the driver

2022-08-10 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-39983: -- Assignee: Alex Balikov > Should not cache unserialized broadcast relations on the driver >

[jira] [Updated] (SPARK-39973) Avoid noisy warnings logs when spark.scheduler.listenerbus.metrics.maxListenerClassesTimed = 0

2022-08-03 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-39973: --- Description: If {{spark.scheduler.listenerbus.metrics.maxListenerClassesTimed}} has been set to

[jira] [Created] (SPARK-39973) Avoid noisy warnings logs when spark.scheduler.listenerbus.metrics.maxListenerClassesTimed = 0

2022-08-03 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-39973: -- Summary: Avoid noisy warnings logs when spark.scheduler.listenerbus.metrics.maxListenerClassesTimed = 0 Key: SPARK-39973 URL: https://issues.apache.org/jira/browse/SPARK-39973

[jira] [Created] (SPARK-39901) Reconsider design of ignoreCorruptFiles feature

2022-07-27 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-39901: -- Summary: Reconsider design of ignoreCorruptFiles feature Key: SPARK-39901 URL: https://issues.apache.org/jira/browse/SPARK-39901 Project: Spark Issue Type:

[jira] [Created] (SPARK-39864) ExecutionListenerManager's registration of the ExecutionListenerBus should be lazy

2022-07-25 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-39864: -- Summary: ExecutionListenerManager's registration of the ExecutionListenerBus should be lazy Key: SPARK-39864 URL: https://issues.apache.org/jira/browse/SPARK-39864

[jira] [Updated] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-07-25 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-39833: --- Labels: correctness (was: ) > Filtered parquet data frame count() and show() produce inconsistent

[jira] [Created] (SPARK-39847) Race condition related to interruption of task threads while they are in RocksDBLoader.loadLibrary()

2022-07-22 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-39847: -- Summary: Race condition related to interruption of task threads while they are in RocksDBLoader.loadLibrary() Key: SPARK-39847 URL: https://issues.apache.org/jira/browse/SPARK-39847

[jira] [Updated] (SPARK-39771) If spark.default.parallelism is unset, RDD defaultPartitioner may pick a value that is too large to successfully run

2022-07-13 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-39771: --- Description: [According to its

[jira] [Created] (SPARK-39771) If spark.default.parallelism is unset, RDD defaultPartitioner may pick a value that is too large to successfully run

2022-07-13 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-39771: -- Summary: If spark.default.parallelism is unset, RDD defaultPartitioner may pick a value that is too large to successfully run Key: SPARK-39771 URL:

[jira] [Updated] (SPARK-38787) Possible correctness issue on stream-stream join when handling edge case

2022-07-07 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-38787: --- Labels: correctness (was: ) > Possible correctness issue on stream-stream join when handling edge

[jira] [Updated] (SPARK-37643) when charVarcharAsString is true, char datatype partition table query incorrect

2022-07-07 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-37643: --- Labels: correctness (was: ) > when charVarcharAsString is true, char datatype partition table

[jira] [Updated] (SPARK-39702) Reduce memory overhead of TransportCipher$EncryptedMessage's byteRawChannel buffer

2022-07-06 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-39702: --- Component/s: YARN > Reduce memory overhead of TransportCipher$EncryptedMessage's byteRawChannel >

[jira] [Created] (SPARK-39702) Reduce memory overhead of TransportCipher$EncryptedMessage's byteRawChannel buffer

2022-07-06 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-39702: -- Summary: Reduce memory overhead of TransportCipher$EncryptedMessage's byteRawChannel buffer Key: SPARK-39702 URL: https://issues.apache.org/jira/browse/SPARK-39702

[jira] [Updated] (SPARK-37865) Spark should not dedup the groupingExpressions when the first child of Union has duplicate columns

2022-07-06 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-37865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-37865: --- Labels: correctness (was: ) > Spark should not dedup the groupingExpressions when the first child

[jira] [Resolved] (SPARK-39489) Improve EventLoggingListener and ReplayListener performance by replacing Json4S ASTs with Jackson trees

2022-07-03 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-39489. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36885

[jira] [Created] (SPARK-39658) Reconsider exposure of Json4s symbols in public ResourceInformation.toJson API

2022-07-01 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-39658: -- Summary: Reconsider exposure of Json4s symbols in public ResourceInformation.toJson API Key: SPARK-39658 URL: https://issues.apache.org/jira/browse/SPARK-39658 Project:

[jira] [Resolved] (SPARK-39636) Fix multiple small bugs in JsonProtocol, impacting StorageLevel and Task/Executor resource requests

2022-06-30 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-39636. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37027

[jira] [Created] (SPARK-39636) Fix multiple small bugs in JsonProtocol, impacting StorageLevel and Task/Executor resource requests

2022-06-29 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-39636: -- Summary: Fix multiple small bugs in JsonProtocol, impacting StorageLevel and Task/Executor resource requests Key: SPARK-39636 URL: https://issues.apache.org/jira/browse/SPARK-39636

[jira] [Commented] (SPARK-17728) UDFs are run too many times

2022-06-22 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557756#comment-17557756 ] Josh Rosen commented on SPARK-17728: As of SPARK-36718 in Spark 3.3 I think the

[jira] [Created] (SPARK-39489) Improve EventLoggingListener and ReplayListener performance by replacing Json4S ASTs with Jackson trees

2022-06-15 Thread Josh Rosen (Jira)
Josh Rosen created SPARK-39489: -- Summary: Improve EventLoggingListener and ReplayListener performance by replacing Json4S ASTs with Jackson trees Key: SPARK-39489 URL:

[jira] [Resolved] (SPARK-39465) Log4j version upgrade to 2.17.2

2022-06-15 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-39465. Resolution: Done > Log4j version upgrade to 2.17.2 > --- > >

[jira] [Reopened] (SPARK-39465) Log4j version upgrade to 2.17.2

2022-06-15 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reopened SPARK-39465: > Log4j version upgrade to 2.17.2 > --- > > Key:

[jira] [Comment Edited] (SPARK-39465) Log4j version upgrade to 2.17.2

2022-06-15 Thread Josh Rosen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554844#comment-17554844 ] Josh Rosen edited comment on SPARK-39465 at 6/16/22 1:21 AM: - Spark uses
