[
https://issues.apache.org/jira/browse/SPARK-57858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nikolina Vraneš updated SPARK-57858:
------------------------------------
Description:
The BIN BY relation operator proportionally rescales its DISTRIBUTE UNIFORM
columns. The logical BinBy node currently passes those columns through
child.output under the child's own ExprId, even though execution rewrites their
values. This violates Catalyst's invariant that an equal ExprId implies an
equal value; no other operator changes a value under a retained child
attribute.

This sub-task turns the rescaled DISTRIBUTE columns into produced attributes
with fresh ExprIds (same names, types, nullability, and positions) that shadow
the inputs, mirroring Generate.generatorOutput. The input columns remain the
operator's read inputs but no longer appear in its output. ResolveBinBy mints
the produced attributes and DeduplicateRelations renews them across self-joins.
Qualifier and metadata are dropped, matching the computed-value semantics of
expr AS value.
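
The intended output shape can be sketched with a small toy model. This is not
Spark code: Attr, BinBy, resolve, and freshExprId below are hypothetical
stand-ins for Catalyst's attribute and plan classes, and the sketch only
assumes what the description states (fresh ExprId, same name, type,
nullability, and position, with qualifier and metadata dropped).

```scala
import java.util.concurrent.atomic.AtomicLong

// Toy model only: none of these names are Spark classes.
object BinByOutputSketch {
  private val nextId = new AtomicLong(0L)

  // Hypothetical stand-in for minting a new ExprId.
  def freshExprId(): Long = nextId.getAndIncrement()

  // An attribute is identified by exprId; the other fields describe it.
  final case class Attr(
      name: String,
      dataType: String,
      nullable: Boolean,
      exprId: Long,
      qualifier: Seq[String] = Nil,
      metadata: Map[String, String] = Map.empty)

  // Mint a produced attribute: same name, type, and nullability,
  // a fresh exprId, and no qualifier or metadata.
  def produced(input: Attr): Attr =
    Attr(input.name, input.dataType, input.nullable, freshExprId())

  // The node stores its produced attributes so that output stays stable.
  final case class BinBy(
      childOutput: Seq[Attr],
      distributeInputs: Seq[Attr],
      scaledOutput: Seq[Attr]) {

    // Each DISTRIBUTE input is replaced in place by its produced twin;
    // every other child column passes through unchanged.
    def output: Seq[Attr] = {
      val replacement = distributeInputs.map(_.exprId).zip(scaledOutput).toMap
      childOutput.map(a => replacement.getOrElse(a.exprId, a))
    }
  }

  // Resolution step (assumed shape): mint the produced attributes once.
  def resolve(childOutput: Seq[Attr], distributeInputs: Seq[Attr]): BinBy =
    BinBy(childOutput, distributeInputs, distributeInputs.map(produced))

  def main(args: Array[String]): Unit = {
    val ts = Attr("ts", "timestamp", nullable = false, freshExprId(), Seq("t"))
    val amount = Attr("amount", "double", nullable = true, freshExprId(),
      Seq("t"), Map("comment" -> "raw"))
    val node = resolve(Seq(ts, amount), Seq(amount))
    node.output.foreach(println)
  }
}
```

In this sketch the rescaled column keeps its name and position but can no
longer be confused with the child's attribute, because the two differ in
exprId. The input attribute stays reachable through distributeInputs, which is
the role the description assigns to the operator's read inputs.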
> Emit BIN BY scaled DISTRIBUTE columns as produced attributes
> ------------------------------------------------------------
>
> Key: SPARK-57858
> URL: https://issues.apache.org/jira/browse/SPARK-57858
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 5.0.0
> Reporter: Nikolina Vraneš
> Priority: Major