I filed an issue for adding writes to the logical plan in DataFusion a
while back. It would be a good addition. Spark does something similar.

https://github.com/apache/arrow-datafusion/issues/5076
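
To make "writes in the logical plan" concrete, here is a rough sketch of
the idea. All of the types and names below (LogicalPlan, Write,
write_target, example_plan, the s3:// path) are hypothetical and for
illustration only; this is not DataFusion's or Ballista's actual API. The
point is only that once the write is a plan node like any other, a
distributed scheduler can assign it to executors along with the rest of
the plan instead of funnelling results through the driver.

```rust
// Hypothetical plan types for illustration only.
// These are NOT DataFusion's real LogicalPlan definitions.
#[derive(Debug, Clone, PartialEq)]
enum LogicalPlan {
    /// Read a table from some source (e.g. files in an object store).
    Scan { table: String },
    /// Keep only rows matching a predicate (kept as plain text here).
    Filter { predicate: String, input: Box<LogicalPlan> },
    /// Write the rows produced by `input` to a target location.
    Write { target: String, input: Box<LogicalPlan> },
}

/// Render the plan as an indented tree, root first.
fn display(plan: &LogicalPlan) -> String {
    fn go(plan: &LogicalPlan, depth: usize, out: &mut String) {
        let pad = "  ".repeat(depth);
        match plan {
            LogicalPlan::Scan { table } => {
                out.push_str(&format!("{}Scan: {}\n", pad, table));
            }
            LogicalPlan::Filter { predicate, input } => {
                out.push_str(&format!("{}Filter: {}\n", pad, predicate));
                go(input, depth + 1, out);
            }
            LogicalPlan::Write { target, input } => {
                out.push_str(&format!("{}Write: {}\n", pad, target));
                go(input, depth + 1, out);
            }
        }
    }
    let mut out = String::new();
    go(plan, 0, &mut out);
    out
}

/// Return the write target if the root of the plan is a write node.
fn write_target(plan: &LogicalPlan) -> Option<&str> {
    match plan {
        LogicalPlan::Write { target, .. } => Some(target.as_str()),
        _ => None,
    }
}

/// Build a small example: scan -> filter -> write.
/// The table name, predicate, and target path are made up.
fn example_plan() -> LogicalPlan {
    LogicalPlan::Write {
        target: "s3://bucket/out/".to_string(),
        input: Box::new(LogicalPlan::Filter {
            predicate: "amount > 100".to_string(),
            input: Box::new(LogicalPlan::Scan {
                table: "orders".to_string(),
            }),
        }),
    }
}
```

Calling display(&example_plan()) renders the write as the root of the
tree, with the filter and scan nested beneath it.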

Thanks,

Andy.



On Sun, Apr 2, 2023 at 5:57 AM Metehan Yıldırım <metehan.yildi...@synnada.ai>
wrote:

> Hi,
>
> What are the differences in requirements? I am not completely familiar with
> the Ballista requirements; for example, I am not sure whether the final data
> may end up on different hosts or not.
>
> About LogicalPlan, is there an example usage in other OLAP systems? Maybe you
> can find a better-fitting solution with LogicalPlans by checking how others
> handle exporting?
>
> Best Regards,
> Mete.
>
> On 2 Apr 2023 Sun at 11:23 Jaroslaw Nowosad <yare...@gmail.com> wrote:
>
> > Hi,
> >
> > Thanks for your response.
> >
> > Sorry - my bad, didn't specify it clearly. However, I will check your
> > solution.
> > What I'm looking for is Ballista - I need a distributed version of
> > export/save. Currently in Ballista you can only read from sources like
> > S3 (MinIO) or HDFS, but after processing I need to save the output,
> > i.e. put it back to S3.
> > At the moment I figure the easiest way will probably be to use
> > object_store.
> > From what I see, this should be done by the executors, not the driver -
> > that's why I started thinking about a logical plan.
> >
> > Best Regards,
> > Jaro
> >
> >
> > On Sat, Apr 1, 2023 at 9:23 PM Metehan Yıldırım
> > <metehan.yildi...@synnada.ai> wrote:
> > >
> > > Hi,
> > >
> > > As far as I know, exporting data from a SQL database to a CSV file or
> > > other external file format is typically not considered part of the
> > > logical plan for executing a SQL query.
> > >
> > > At present, I am developing a table sink feature in DataFusion, where I
> > > have successfully added new APIs (insert_into and copy_to) to the
> > > TableProvider trait. Although I have not yet submitted the PR, the new
> > > APIs are functioning well.
> > >
> > > Listing table INSERT INTO support (WAITING ARROW / OBJECT STORE UPDATE)
> > > by metesynnada · Pull Request #62 · synnada-ai/arrow-datafusion
> > > (github.com)
> > > <https://github.com/synnada-ai/arrow-datafusion/pull/62>
> > >
> > > It should be noted that the object_store crate is primarily responsible
> > > for the main functionality of the table sink feature. It provides ample
> > > support for file sinking related to listing tables. If you are looking
> > > for support beyond this, I'd like to hear your use case so I can help
> > > further.
> > >
> > > Mete.
> > >
> > > On Sat, Apr 1, 2023 at 11:07 PM Jaroslaw Nowosad <yare...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Looking for advice:
> > > > I'm looking into creating a writer part for Ballista.
> > > > There is a data source but not a sink.
> > > > I started looking into object store -> put/put_multipart.
> > > > But it looks like a simple context extension is not enough - do I need
> > > > to extend the logical/physical plan?
> > > >
> > > > If you have any pointers...
> > > >
> > > > Best Regards,
> > > > Jaro
> > > >
> >
>
