karenfeng opened a new pull request #32773:
URL: https://github.com/apache/spark/pull/32773
### What changes were proposed in this pull request?
Sets `references` for `NamedLambdaVariable` and `LambdaFunction`.
| Expression | NamedLambdaVariable | LambdaFunction |
| --- | --- | --- |
| References before | None | All function references |
| References after | self.toAttribute | Function references minus arguments' references |
In `NestedColumnAliasing`, this means that `ExtractValue(ExtractValue(attr, lv: NamedLambdaVariable), ...)` now references both `attr` and `lv`, rather than just `attr`. As a result, the expression is no longer included in the nested column references.
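The reference rules in the table above can be illustrated with a small toy model. This is only a sketch, not Spark's Catalyst code: the Python classes below borrow the names `NamedLambdaVariable`, `LambdaFunction`, and `ExtractValue` for readability, plain strings stand in for attributes with expression IDs, and the optional `key` field is a simplification introduced here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Attribute:
    """Toy stand-in for a resolved column such as kvs#0 (assumption: name only)."""
    name: str

    @property
    def references(self) -> FrozenSet[str]:
        return frozenset({self.name})


@dataclass(frozen=True)
class NamedLambdaVariable:
    """Toy lambda variable. After the change it references itself."""
    name: str

    @property
    def references(self) -> FrozenSet[str]:
        return frozenset({self.name})


@dataclass(frozen=True)
class ExtractValue:
    """Toy child[key] or child.field; references are the union over children."""
    child: object
    key: Optional[object] = None  # None models a plain field access like .v1

    @property
    def references(self) -> FrozenSet[str]:
        refs = self.child.references
        if self.key is not None:
            refs = refs | self.key.references
        return refs


@dataclass(frozen=True)
class LambdaFunction:
    """Toy lambda: body references minus the references of its own arguments."""
    function: object
    arguments: Tuple[NamedLambdaVariable, ...]

    @property
    def references(self) -> FrozenSet[str]:
        bound = frozenset().union(*(a.references for a in self.arguments))
        return self.function.references - bound


kvs = Attribute("kvs")
key = NamedLambdaVariable("lambda key")
extract = ExtractValue(ExtractValue(kvs, key))  # models kvs[lambda key].v1
fn = LambdaFunction(extract, (key,))

print(sorted(extract.references))  # ['kvs', 'lambda key']
print(sorted(fn.references))       # ['kvs']
```

In this model the inner extraction reports two references, which is the property that keeps it out of the nested column references, while the enclosing lambda function still exposes only `kvs` to the rest of the plan.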
### Why are the changes needed?
Before this change, the lambda key could be referenced outside of its lambda function.
#### Example 1
Before:
```
Project [transform(keys#0, lambdafunction(_extract_v1#0, lambda key#0, false)) AS a#0]
+- 'Join Cross
   :- Project [kvs#0[lambda key#0].v1 AS _extract_v1#0]
   :  +- LocalRelation <empty>, [kvs#0]
   +- LocalRelation <empty>, [keys#0]
```
After:
```
Project [transform(keys#418, lambdafunction(kvs#417[lambda key#420].v1, lambda key#420, false)) AS a#419]
+- Join Cross
   :- LocalRelation <empty>, [kvs#417]
   +- LocalRelation <empty>, [keys#418]
```
#### Example 2
Before:
```
Project [transform(keys#0, lambdafunction(kvs#0[lambda key#0].v1, lambda key#0, false)) AS a#0]
+- GlobalLimit 5
   +- LocalLimit 5
      +- Project [keys#0, _extract_v1#0 AS _extract_v1#0]
         +- GlobalLimit 5
            +- LocalLimit 5
               +- Project [kvs#0[lambda key#0].v1 AS _extract_v1#0, keys#0]
                  +- LocalRelation <empty>, [kvs#0, keys#0]
```
After:
```
Project [transform(keys#428, lambdafunction(kvs#427[lambda key#430].v1, lambda key#430, false)) AS a#429]
+- GlobalLimit 5
   +- LocalLimit 5
      +- Project [keys#428, kvs#427]
         +- GlobalLimit 5
            +- LocalLimit 5
               +- LocalRelation <empty>, [kvs#427, keys#428]
```
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Scala unit tests for the examples above.