Hi Jaro, I think this discussion would be more visible to the DataFusion developers if you filed a ticket or started a discussion in the repository [1].
[1]: https://github.com/apache/arrow-datafusion

On Thu, Sep 21, 2023 at 4:47 AM Jaroslaw Nowosad <yare...@gmail.com> wrote:

> Hi,
>
> Looking for comments/your view:
>
> Would it be possible to:
> 1. patch datafusion dataframe to make df.state public
> 2. patch datafusion adding method to dataframe ie:
> df.transform_logical_plan(mut self, new_plan) -> df where some
> original plan could be modified / injected with NewPlanNode
> (UserDefinedPlanNode).
>
> Reason:
> I'm working on "writer to kafka topic", on top of datafusion using
> ballista - to use proper distribution I need to change dataframe
> output to be processed/sent on each executor.
> To do this currently I need to have access to both dataframe and
> context: I need to get a state to change dataframe on-the-fly to
> inject it with my own UserDefinedLogicalNode.
>
> Current code works, but looks little "messy":
> df.write(ballista_ctx, "kafka://topic:port?brokers", Format::JSON);
>
> if I had public access to df.state that would look like:
> df.write_json("kafka://topic:port?brokers");
>
>
> Cheers,
> Jaro
>