dtenedor opened a new pull request, #46918:
URL: https://github.com/apache/spark/pull/46918
### What changes were proposed in this pull request?
This PR fixes a bug that caused an internal error for certain combinations
of the "select" and "partitionBy" options returned by a Python UDTF's "analyze" method.
To reproduce:
```
from pyspark.sql.functions import (
    AnalyzeArgument,
    AnalyzeResult,
    PartitioningColumn,
    SelectedColumn,
    udtf
)
from pyspark.sql.types import (
    DoubleType,
    StringType,
    StructType,
)


@udtf
class TestTvf:
    @staticmethod
    def analyze(observed: AnalyzeArgument) -> AnalyzeResult:
        out_schema = StructType()
        out_schema.add("partition_col", StringType())
        out_schema.add("double_col", DoubleType())

        # Request both partitioning and column selection on the input table.
        return AnalyzeResult(
            schema=out_schema,
            partitionBy=[PartitioningColumn("partition_col")],
            select=[
                SelectedColumn("partition_col"),
                SelectedColumn("double_col"),
            ],
        )

    def eval(self, *args, **kwargs):
        # Input rows are consumed but ignored.
        pass

    def terminate(self):
        for _ in range(10):
            yield {
                "partition_col": None,
                "double_col": 1.0,
            }


spark.udtf.register("serialize_test", TestTvf)

# Fails with an internal error before this fix
(
    spark
    .sql(
        """
        SELECT * FROM serialize_test(
            TABLE(
                SELECT
                    5 AS unused_col,
                    'hi' AS partition_col,
                    1.0 AS double_col
                UNION ALL
                SELECT
                    4 AS unused_col,
                    'hi' AS partition_col,
                    1.0 AS double_col
            )
        )
        """
    )
    .toPandas()
)
```
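As a rough mental model of what the repro asks for, the sketch below mimics the two options in plain Python: "select" keeps only the named input columns, and "partitionBy" groups the rows by the partitioning column. It is an illustration only, not how Spark implements this; `project_and_partition` is a hypothetical helper and the row dictionaries are stand-ins for the TABLE argument in the query above.

```python
from collections import defaultdict

# Stand-in for the rows of the TABLE argument in the repro query.
input_rows = [
    {"unused_col": 5, "partition_col": "hi", "double_col": 1.0},
    {"unused_col": 4, "partition_col": "hi", "double_col": 1.0},
]


def project_and_partition(rows, select, partition_by):
    """Hypothetical helper: keep only `select` columns, grouped by `partition_by`."""
    partitions = defaultdict(list)
    for row in rows:
        # The partition key is built from the partitioning column values.
        key = tuple(row[name] for name in partition_by)
        # Only the selected columns are kept; "unused_col" is dropped.
        partitions[key].append({name: row[name] for name in select})
    return dict(partitions)


partitions = project_and_partition(
    input_rows,
    select=["partition_col", "double_col"],
    partition_by=["partition_col"],
)
print(partitions)
```

In this model both input rows land in a single partition keyed by 'hi', and neither carries "unused_col".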
### Why are the changes needed?
Before this change, the above query failed with an internal error; it now runs successfully.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Additional golden file coverage
### Was this patch authored or co-authored using generative AI tooling?
Some light GitHub Copilot usage
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]