I filed an issue for adding writes to the logical plan in DataFusion a while back. It would be a good addition. Spark does something similar.
https://github.com/apache/arrow-datafusion/issues/5076

Thanks,

Andy.

On Sun, Apr 2, 2023 at 5:57 AM Metehan Yıldırım <metehan.yildi...@synnada.ai> wrote:

> Hi,
>
> What are the differences in requirements? I am not completely familiar with
> the Ballista requirements; for example, I am not sure whether the final data
> may end up on different hosts or not.
>
> About LogicalPlan, is there an example usage in other OLAPs? Maybe you can
> find a better fitting solution with LogicalPlans by checking how others
> handle exporting?
>
> Best Regards,
> Mete.
>
> On 2 Apr 2023 Sun at 11:23 Jaroslaw Nowosad <yare...@gmail.com> wrote:
>
> > Hi,
> >
> > Thanks for your response.
> >
> > Sorry - my bad, I didn't specify it clearly. However, I will check your
> > solution.
> > What I'm looking for is Ballista - I need a distributed version of
> > export/save. Currently on Ballista you can only read, e.g. from
> > S3(minio)/HDFS, but after processing I need to save the output ... put
> > it back to S3.
> > At the moment I figure that probably the easiest way will be by using
> > object_store.
> > From what I see it should be done by the executors, not the driver -
> > that's why I started thinking about a logical plan.
> >
> > Best Regards,
> > Jaro
> >
> >
> > On Sat, Apr 1, 2023 at 9:23 PM Metehan Yıldırım
> > <metehan.yildi...@synnada.ai> wrote:
> > >
> > > Hi,
> > >
> > > As far as I know, exporting data from a SQL database to a CSV file or
> > > other external file format is typically not considered part of the
> > > logical plan for executing a SQL query.
> > >
> > > At present, I am developing a table sink feature in DataFusion, where I
> > > have successfully added new APIs (insert_into and copy_to) to the
> > > TableProvider trait. Although I have not yet submitted the PR, the new
> > > APIs are functioning well.
> > >
> > > Listing table INSERT INTO support (WAITING ARROW / OBJECT STORE UPDATE)
> > > by metesynnada · Pull Request #62 · synnada-ai/arrow-datafusion
> > > (github.com)
> > > <https://github.com/synnada-ai/arrow-datafusion/pull/62>
> > >
> > > It should be noted that the object_store crate is primarily responsible
> > > for the main functionality of the table sink feature. It provides ample
> > > support for file sinking related to listing tables. If you are looking
> > > for support beyond this, I'd like to hear the use case so I can help
> > > further.
> > >
> > > Mete.
> > >
> > > On Sat, Apr 1, 2023 at 11:07 PM Jaroslaw Nowosad <yare...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Looking for advice:
> > > > I'm looking into creating a writer part for Ballista.
> > > > There is a data source but not a sink.
> > > > I started looking into object store -> put/put_multipart.
> > > > But it looks like a simple context extension is not enough - do I
> > > > need to extend the logical/physical plan?
> > > >
> > > > If you have any pointers...
> > > >
> > > > Best Regards,
> > > > Jaro
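
For illustration only, the idea discussed in this thread (a write node in the logical plan, with the executors rather than the driver doing the writing) could be sketched roughly as follows. This is a minimal sketch using only the Rust standard library. The names LogicalPlan, WriteTask and plan_write_tasks, and the "part-N.csv" naming scheme, are hypothetical and are not the DataFusion or Ballista API.

```rust
// Hypothetical, simplified types for illustration; these are NOT the real
// DataFusion or Ballista types.

/// A toy logical plan that has an explicit write node.
#[derive(Debug, Clone, PartialEq)]
pub enum LogicalPlan {
    /// Read from a source location, e.g. "s3://bucket/input/".
    Scan { source: String },
    /// Write the output of `input` under `target_prefix`.
    Write {
        input: Box<LogicalPlan>,
        target_prefix: String,
    },
}

/// One unit of work that a single executor could run on its own.
#[derive(Debug, Clone, PartialEq)]
pub struct WriteTask {
    pub partition: usize,
    pub object_path: String,
}

/// Expand a `Write` node into one task per output partition, so that each
/// executor writes its own object instead of sending data back to the driver.
pub fn plan_write_tasks(plan: &LogicalPlan, partitions: usize) -> Vec<WriteTask> {
    match plan {
        LogicalPlan::Write { target_prefix, .. } => {
            // Avoid a double slash if the prefix already ends with '/'.
            let prefix = target_prefix.trim_end_matches('/');
            (0..partitions)
                .map(|partition| WriteTask {
                    partition,
                    object_path: format!("{}/part-{}.csv", prefix, partition),
                })
                .collect()
        }
        // A plan without a write node produces no write tasks.
        LogicalPlan::Scan { .. } => Vec::new(),
    }
}
```

In a real system each task would then be handed to an executor, which would stream its partition to the object path (for example through the object_store crate's put/put_multipart mentioned above). That part is left out here because it depends on the actual execution engine.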