[ 
https://issues.apache.org/jira/browse/SPARK-31108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-31108.
----------------------------------
    Resolution: Invalid

Please ask questions on the dev mailing list 
(https://spark.apache.org/community.html)

> Parameter cannot be passed to pandas udf of type map_iter
> ---------------------------------------------------------
>
>                 Key: SPARK-31108
>                 URL: https://issues.apache.org/jira/browse/SPARK-31108
>             Project: Spark
>          Issue Type: Question
>          Components: Examples
>    Affects Versions: 3.0.0
>            Reporter: xge
>            Priority: Major
>
> Parameters can only be passed in the following way:
> ********************************************************************
> from pyspark.sql.functions import pandas_udf, PandasUDFType
> def map_iter_pandas_udf_example(spark):
>     strr = "abcd"
>     df = spark.createDataFrame([(1, 21),(2,30)],("id", "age")) 
>     @pandas_udf(df.schema, PandasUDFType.MAP_ITER)
>     def filter_func(batch_iter, x = strr):
>         print(x)
>         for pdf in batch_iter:
>             yield pdf[pdf.id == 1]
>     df.mapInPandas(filter_func).show()
> *******************************************************************
>  
> However, if the code is edited as follows, an error occurs:
> *******************************************************************
> from pyspark.sql.functions import pandas_udf, PandasUDFType
> def map_iter_pandas_udf_example(spark):
>     strr = "abcd"
>     df = spark.createDataFrame([(1, 21),(2,30)],("id", "age")) 
>     @pandas_udf(df.schema, PandasUDFType.MAP_ITER)
>     def filter_func(batch_iter, x = strr):
>         print(x)
>         for pdf in batch_iter:
>             yield pdf[pdf.id == 1]
>     data = "dbca"
>     df.mapInPandas(filter_func(data)).show()
> *******************************************************************
> ValueError: Invalid udf: the udf argument must be a pandas_udf of type 
> MAP_ITER.
> Does anyone know whether parameters can be passed to a pandas UDF of type
> MAP_ITER and, if so, how to write the code? Thanks.
>  
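
One possible way to pass a parameter, offered as a sketch rather than an
official answer: in the failing snippet, filter_func(data) calls the UDF
instead of handing the UDF itself to mapInPandas, which is likely why the
MAP_ITER check fails. A closure avoids this by building the batch function
with the parameter already captured. The names make_filter_func and keep_id
are hypothetical, and the Spark call in the trailing comment assumes the
released PySpark 3.0+ signature mapInPandas(func, schema), which takes a
plain Python function rather than a pandas_udf object. The local check below
uses only pandas.

```python
from typing import Callable, Iterator

import pandas as pd

BatchFunc = Callable[[Iterator[pd.DataFrame]], Iterator[pd.DataFrame]]


def make_filter_func(keep_id: int) -> BatchFunc:
    """Build a batch-iterator function with keep_id captured by closure.

    make_filter_func and keep_id are hypothetical names for illustration.
    """

    def filter_func(batch_iter: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
        # Takes only the iterator of pandas DataFrames; keep_id comes
        # from the enclosing scope instead of an extra argument.
        for pdf in batch_iter:
            yield pdf[pdf.id == keep_id]

    return filter_func


# Local check without Spark: feed one batch through the generated function.
batches = iter([pd.DataFrame({"id": [1, 2], "age": [21, 30]})])
result = pd.concat(make_filter_func(keep_id=2)(batches))

# With Spark (assumed released 3.0+ API), the call would look like:
#   df.mapInPandas(make_filter_func(keep_id=2), schema=df.schema).show()
```

Because the parameter is bound when the function is built, a different value
only requires calling make_filter_func again; the function handed to Spark
still takes the batch iterator as its single argument.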



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
