[DISCUSS][DataFusion] make dfschema wrap schemaref

2024-03-20 Thread huaijin hao
Hi, we want to make DFSchema wrap the Arrow schema to improve DataFusion planner performance, as discussed at [1], [2], [3]. We welcome anyone interested in participating in the reviews. Please take a look and add comments/suggestions to the PR [3]. Thanks, huaijin [1]:

Re: [DISCUSS] [DataFusion] Unifying BuiltIn and User Defined Functions

2024-02-12 Thread Andrew Lamb
Update here is that we are making good progress towards the goal of removing the distinction between user / built in functions. If you have any feedback on this project or how you would like the functions structured, please join the conversation on [1]. Thanks Andrew [1]:

[DISCUSS] [DATAFUSION] Minimum Supported Rust Version Policy

2024-01-31 Thread Andrew Lamb
Hello, We are thinking about a min rust version policy for DataFusion[1]. Please leave any thoughts you would like to share on the ticket. Thank you, Andrew [1]: https://github.com/apache/arrow-datafusion/issues/9082

[DISCUSS] [DataFusion] Unifying BuiltIn and User Defined Functions

2024-01-23 Thread Andrew Lamb
I would like to bring attention to the project of unifying built in and user defined functions [1], and specifically a PR[2] that starts implementing that approach. Please provide any feedback you have on the ticket or PR. Thank you, Andrew [1]:

Re: [DISCUSS] [DATAFUSION] PMC for new DataFusion top level project

2023-12-20 Thread vin jake
The list looks good to me. Thanks Andrew.

Re: [DISCUSS] [DATAFUSION] PMC for new DataFusion top level project

2023-12-20 Thread Daniël Heres
Thanks for sharing the information. LGTM. On Wed, Dec 20, 2023, 21:54 L. C. Hsieh wrote: > The list looks good to me too. > > Thanks Andrew. > > On Wed, Dec 20, 2023 at 12:31 PM Raphael Taylor-Davies > wrote: > > > > > thus not join the DataFusion PMC > > > > This is correct, I don't

Re: [DISCUSS] [DATAFUSION] PMC for new DataFusion top level project

2023-12-20 Thread L. C. Hsieh
The list looks good to me too. Thanks Andrew. On Wed, Dec 20, 2023 at 12:31 PM Raphael Taylor-Davies wrote: > > > thus not join the DataFusion PMC > > This is correct, I don't currently have sufficient bandwidth to be able to > perform such a role to the level that I would expect of others, in

Re: [DISCUSS] [DATAFUSION] PMC for new DataFusion top level project

2023-12-20 Thread Raphael Taylor-Davies
> thus not join the DataFusion PMC This is correct, I don't currently have sufficient bandwidth to be able to perform such a role to the level that I would expect of others, in addition to my existing commitments. Kind Regards, Raphael On 20 December 2023 20:19:56 GMT, Andy Grove wrote:

Re: [DISCUSS] [DATAFUSION] PMC for new DataFusion top level project

2023-12-20 Thread Andy Grove
This list LGTM. On Wed, Dec 20, 2023 at 1:11 PM Andrew Lamb wrote: > Hello, > > As we have discussed previously [1], we are planning to propose [2] > "graduating" the DataFusion to its own top level Apache project. > > I would like to discuss the initial PMC members for the new top level >

[DISCUSS] [DATAFUSION] PMC for new DataFusion top level project

2023-12-20 Thread Andrew Lamb
Hello, As we have discussed previously [1], we are planning to propose [2] "graduating" the DataFusion to its own top level Apache project. I would like to discuss the initial PMC members for the new top level project. The suggestion in [1] is > All existing Arrow Committers and PMC members

[DISCUSS] DataFusion Meetup

2023-11-13 Thread Andrew Lamb
I am attempting to organize a face-to-face meetup of the DataFusion community, if anyone is interested [1], and would like to solicit input from anyone who has it. Andrew [1]: https://github.com/apache/arrow-datafusion/discussions/8152

[DISCUSS] [DataFusion] Unify Function Interface (remove BuiltInScalarFunction)

2023-11-08 Thread Andrew Lamb
I would like to bring wider visibility to a proposal [1] to unify the function interface in DataFusion. The TLDR is to remove BuiltInScalarFunction and ensure all functions can be implemented as `ScalarUDF`. There is a small API change to ScalarUDF proposed as well [2]. If you are interested, please

Re: Re: [DISCUSS][DataFusion] Table time travel support

2023-08-21 Thread Akshara Uke
Thanks Marko for the detailed answer and the references. Cheers! On Mon, Aug 21, 2023 at 1:57 PM Marko Grujic wrote: > Hi Akshara, > > > Just for my understanding - the proposal assumes that writes will result > in > > a new table version correct? > > Actually, the implementation I had in mind

RE: Re: [DISCUSS][DataFusion] Table time travel support

2023-08-21 Thread Marko Grujic
Hi Akshara, > Just for my understanding - the proposal assumes that writes will result in > a new table version correct? Actually, the implementation I had in mind does not make any assumptions about the behaviour of writes, it only accounts for the fact that there may be different versions of

Re: [DISCUSS][DataFusion] Table time travel support

2023-08-19 Thread Akshara Uke
Hi Marko, Indeed most databases do support time travel/stale reads (especially distributed databases), hence an important feature, IMHO. Just for my understanding - the proposal assumes that writes will result in a new table version, correct? Asking since some databases provide stale read support

[DISCUSS][DataFusion] Table time travel support

2023-08-17 Thread Marko Grujic
Hi all! I'm wondering what people think of a possibility to extend DataFusion so as to accommodate time-travel querying? This would work well with the new table formats, particularly Iceberg and Delta Lake, where table versioning is at the core of the protocol. You can see some details in the

Re: [DISCUSS] DataFusion changes to timestamp arithmetic and decimal division

2023-08-08 Thread Andrew Lamb
The PR is now merged. Thank you very much to everyone who added their commentary. On Thu, Aug 3, 2023 at 8:27 AM Andrew Lamb wrote: > BTW I would like to bring attention to the following PR: [1] > > It has some non trivial changes: > 1. date/time arithmetic is done using Durations (e.g. X

[DISCUSS] DataFusion changes to timestamp arithmetic and decimal division

2023-08-03 Thread Andrew Lamb
BTW I would like to bring attention to the following PR: [1] It has some non trivial changes: 1. date/time arithmetic is done using Durations (e.g. X milliseconds) rather than Intervals (e.g. "months") which makes behavior consistent 2. Changes the output type of Decimal128 division to avoid

[DISCUSS] [DataFusion]

2023-06-01 Thread Andrew Lamb
I would like to invite anyone with opinions or perspectives from the community to participate in two ongoing discussions about DataFusion and its future. * Move Apache Arrow DataFusion to a new top level Apache project [1] * Goals / Vision for DataFusion [2] Thank you, Andrew [1]:

[DISCUSS] [DataFusion] Memory Manager changes

2022-12-07 Thread Andrew Lamb
I would like to solicit opinions on a PR with proposed changes to the DataFusion memory manager scheme [1]. If you would like to offer feedback, please do so on the PR. Thank you, Andrew [1] https://github.com/apache/arrow-datafusion/pull/4522

[RUST] Discuss: Datafusion 7.0.0 release

2022-02-03 Thread Andrew Lamb
I am hoping to start preparing for a release of datafusion 7.0.0 to crates.io in the next few days: Let's coordinate on [1] Thanks Andrew [1] https://github.com/apache/arrow-datafusion/issues/1587

[Rust][DataFusion][DISCUSS] DataFusion Resource(Memory + Disk) API

2022-01-12 Thread Andrew Lamb
Greetings fellow Rustaceans, I wanted to bring some extra attention to a PR[1] contributed by Yijie Shen that adds an initial resource management API to datafusion. While the implementation will likely undergo the normal iteration, iterating on the APIs will likely be more disruptive. Thus I