[GitHub] [spark] zhenlineo opened a new pull request, #41704: [WIP][Connect] Handle Row input for UDFs

via GitHub Thu, 22 Jun 2023 15:31:56 -0700


zhenlineo opened a new pull request, #41704:
URL: https://github.com/apache/spark/pull/41704


   ### What changes were proposed in this pull request?
   If the client passes Rows as inputs to UDFs, the Spark Connect planner 
fails to create the RowEncoder. 
   
   The Row encoder sent by the client contains no field or schema information. 
Instead, the real schema should be obtained from the input data.
   
   This PR fixes the immediate problem by obtaining the input schema from 
plan.output.
   It handles the Row encoder only at the top level, because that is currently 
the only allowed way to use UnboundRowEncoder.
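
   As a rough illustration of the idea, not the actual planner code: the sketch 
below models a Row encoder that arrives without fields and is bound to the 
schema taken from the plan output. `Attribute`, `RowEncoder`, and 
`bind_row_encoder` are hypothetical names invented for this example and are not 
part of the Spark API. Only a top-level encoder is handled, which matches the 
restriction described above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for one column of a plan's output."""
    name: str
    data_type: str
    nullable: bool = True


@dataclass(frozen=True)
class RowEncoder:
    """Hypothetical Row encoder; `fields` is None while it is unbound."""
    fields: Optional[Tuple[Attribute, ...]] = None

    @property
    def is_bound(self) -> bool:
        return self.fields is not None


def bind_row_encoder(encoder: RowEncoder,
                     plan_output: Sequence[Attribute]) -> RowEncoder:
    """Return an encoder with a schema, taking it from the plan output
    when the client sent none. Already-bound encoders are returned as-is."""
    if encoder.is_bound:
        return encoder
    return RowEncoder(fields=tuple(plan_output))


# The client sends an encoder with no schema information.
unbound = RowEncoder()

# Assumed output of the UDF's input plan (made-up columns).
plan_output = [
    Attribute("id", "bigint", nullable=False),
    Attribute("name", "string"),
]

bound = bind_row_encoder(unbound, plan_output)
```

   After binding, `bound.fields` carries the two columns from the plan output, 
while `unbound` itself is left unchanged.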
   
   ### Why are the changes needed?
   Fixes the bug where a Row cannot be used as a UDF input.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   E2E tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

