Hi,

I'm looking for comments and your views on the following:

Would it be possible to:
1. patch the DataFusion dataframe to make df.state public
2. patch DataFusion to add a method to the dataframe, e.g.:
df.transform_logical_plan(mut self, new_plan) -> df, where the
original plan could be modified / injected with a new plan node
(a UserDefinedLogicalNode).
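
To make point 2 concrete, here is a rough sketch of the shape I have
in mind. It uses simplified stand-in types, not the real DataFusion
ones: LogicalPlan, SessionState, DataFrame and the with_kafka_writer
helper below are all hypothetical and only illustrate the idea, so
please read it as an assumption-laden sketch rather than a patch.

```rust
// Stand-in types only: these are NOT the real DataFusion definitions.
#[derive(Debug, Clone, PartialEq)]
enum LogicalPlan {
    TableScan { table: String },
    // Stand-in for a plan node wrapping a UserDefinedLogicalNode.
    Extension { name: String, input: Box<LogicalPlan> },
}

#[derive(Debug, Clone, PartialEq)]
struct SessionState {
    session_id: String,
}

#[derive(Debug, Clone, PartialEq)]
struct DataFrame {
    // Proposal 1: the state travels with the dataframe and is reachable.
    state: SessionState,
    plan: LogicalPlan,
}

impl DataFrame {
    fn new(state: SessionState, plan: LogicalPlan) -> Self {
        DataFrame { state, plan }
    }

    fn logical_plan(&self) -> &LogicalPlan {
        &self.plan
    }

    // Proposal 2: replace the plan while keeping the same state.
    fn transform_logical_plan(mut self, new_plan: LogicalPlan) -> Self {
        self.plan = new_plan;
        self
    }
}

// Hypothetical helper: wrap the existing plan in a writer node, so the
// write becomes part of the plan that is sent to the executors.
fn with_kafka_writer(df: DataFrame, url: &str) -> DataFrame {
    let wrapped = LogicalPlan::Extension {
        name: format!("KafkaWriter({})", url),
        input: Box::new(df.logical_plan().clone()),
    };
    df.transform_logical_plan(wrapped)
}

fn main() {
    let state = SessionState { session_id: "s1".to_string() };
    let scan = LogicalPlan::TableScan { table: "events".to_string() };
    let df = DataFrame::new(state, scan);

    let df = with_kafka_writer(df, "kafka://topic:port?brokers");
    println!("{:?}", df.logical_plan());
}
```

The point of the sketch is that the caller no longer has to pass a
context around: everything needed to rewrite the plan is carried by
the dataframe itself.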

Reason:
I'm working on a writer to a Kafka topic, built on top of DataFusion
using Ballista. To get proper distribution I need to change the
dataframe output so that it is processed/sent on each executor.
Currently, to do this I need access to both the dataframe and the
context: I need the state so that I can change the dataframe on the
fly and inject my own UserDefinedLogicalNode into it.

The current code works, but looks a little messy:
df.write(ballista_ctx, "kafka://topic:port?brokers", Format::JSON);

If I had public access to df.state, it could look like:
df.write_json("kafka://topic:port?brokers");


Cheers,
Jaro
