Huang Xingbo created FLINK-25551:
------------------------------------
Summary: Add example and documentation on the usage of Row in Python UDTF
Key: FLINK-25551
URL: https://issues.apache.org/jira/browse/FLINK-25551
Project: Flink
Issue Type: Improvement
Components: API / Python, Documentation
Affects Versions: 1.14.2, 1.13.5, 1.15.0
Reporter: Huang Xingbo
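The following example comes from PyFlink users. It shows two ways to return a Row from a Python UDTF: expanded into separate columns, or kept as a single ROW-typed column: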
{code:python}
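from pyflink.table import DataTypes, EnvironmentSettings, TableEnvironment
from pyflink.table.expressions import col
from pyflink.table.udf import udtf

# Assumption: t_env is not defined in the reported snippet; any TableEnvironment works.
t_env = TableEnvironment.create(
    EnvironmentSettings.new_instance().in_streaming_mode().build())

source_table = """
    CREATE TABLE source_table (
        a ARRAY<ROW<PRODID STRING, ADDMONEY DOUBLE>>,
        b ARRAY<DOUBLE>
    ) WITH (
        'connector' = 'datagen',
        'number-of-rows' = '10'
    )
"""


# Each yielded Row is expanded into the two declared result columns.
@udtf(result_types=[DataTypes.STRING(), DataTypes.DOUBLE()])
def split(x: list):
    for s in x:
        yield s


# The whole Row is emitted as a single ROW-typed column.
@udtf(result_types=[
    DataTypes.ROW([
        DataTypes.FIELD("PRODID", DataTypes.STRING()),
        DataTypes.FIELD("ADDMONEY", DataTypes.DOUBLE())])])
def split2(x: list):
    for s in x:
        yield s,  # NOTE: This ',' is important: it yields a one-element tuple


t_env.execute_sql(source_table)

# If you want to split the Row into two columns
t_env.from_path("source_table") \
    .join_lateral(split(col('a')).alias('x', 'y')) \
    .select(col('x'), col('y'))

# If you want to treat the entire row as a column
t_env.from_path("source_table") \
    .join_lateral(split2(col('a')).alias('x')) \
    .select(col('x'))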
{code}
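The difference between the two functions comes down to what each generator yields. The following is a minimal plain-Python sketch of that difference only. It does not use PyFlink, and the tuples in `rows` are hypothetical stand-ins for the ROW<PRODID STRING, ADDMONEY DOUBLE> elements, not actual `Row` objects:

```python
# Plain-Python illustration; no PyFlink required.
# Assumption: each tuple stands in for one ROW<PRODID STRING, ADDMONEY DOUBLE> element.
rows = [("p1", 1.5), ("p2", 2.5)]


def split(x):
    for s in x:
        yield s   # yields the element itself: two values per result


def split2(x):
    for s in x:
        yield s,  # trailing comma: yields a one-element tuple wrapping the element


flat = list(split(rows))
wrapped = list(split2(rows))

print(len(flat[0]))     # number of values in one result of split
print(len(wrapped[0]))  # number of values in one result of split2
```

With `yield s`, each result carries two values, which matches the two entries in `result_types` of `split`. With `yield s,`, each result carries a single value, which matches the single ROW entry in `result_types` of `split2`.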
--
This message was sent by Atlassian Jira
(v8.20.1#820001)