[
https://issues.apache.org/jira/browse/SPARK-38868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan resolved SPARK-38868.
---------------------------------
Fix Version/s: 3.1.3
3.3.0
3.2.2
Resolution: Fixed
> `assert_true` fails unconditionally after `left_outer` joins
> ------------------------------------------------------------
>
> Key: SPARK-38868
> URL: https://issues.apache.org/jira/browse/SPARK-38868
> Project: Spark
> Issue Type: Bug
> Components: PySpark, SQL
> Affects Versions: 3.1.1, 3.1.2, 3.2.0, 3.2.1, 3.3.0, 3.4.0
> Reporter: Fabien Dubosson
> Priority: Major
> Fix For: 3.1.3, 3.3.0, 3.2.2
>
>
> When `assert_true` is used after a `left_outer` join, the assert exception is
> raised even though all rows meet the condition. Using an `inner` join
> does not expose this issue.
>
> {code:python}
> from pyspark.sql import SparkSession
> from pyspark.sql import functions as sf
>
> session = SparkSession.builder.getOrCreate()
>
> entries = session.createDataFrame(
>     [
>         ("a", 1),
>         ("b", 2),
>         ("c", 3),
>     ],
>     ["id", "outcome_id"],
> )
> outcomes = session.createDataFrame(
>     [
>         (1, 12),
>         (2, 34),
>         (3, 32),
>     ],
>     ["outcome_id", "outcome_value"],
> )
>
> # Inner join works as expected
> (
>     entries.join(outcomes, on="outcome_id", how="inner")
>     .withColumn("valid", sf.assert_true(sf.col("outcome_value") > 10))
>     .filter(sf.col("valid").isNull())
>     .show()
> )
>
> # Left join fails with «'('outcome_value > 10)' is not true!» even though
> # the condition holds for every row
> (
>     entries.join(outcomes, on="outcome_id", how="left_outer")
>     .withColumn("valid", sf.assert_true(sf.col("outcome_value") > 10))
>     .filter(sf.col("valid").isNull())
>     .show()
> ){code}
> Reproduced on `pyspark` versions `3.2.1`, `3.2.0`, `3.1.2`, and `3.1.1`. I am
> not sure whether "native" Spark exposes this issue as well; I don't have
> the knowledge/setup to test that.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)