Hi,
We want to make DFSchema wrap the Arrow schema to improve DataFusion planner
performance, as discussed in [1], [2], [3]. We welcome anyone interested in
participating in the reviews.
Please take a look and add comments/suggestions to the PR [3].
Thanks,
huaijin
[1]:
Update here is that we are making good progress towards the goal of
removing the distinction between user / built in functions.
If you have any feedback on this project or how you would like
the functions structured, please join the conversation on [1].
Thanks
Andrew
[1]:
Hello,
We are thinking about a minimum Rust version policy for DataFusion [1]. Please
leave any thoughts you would like to share on the ticket.
Thank you,
Andrew
[1]: https://github.com/apache/arrow-datafusion/issues/9082
I would like to bring attention to the project of unifying built in and
user defined functions [1], and specifically a PR[2] that starts
implementing that approach.
Please provide any feedback you have on the ticket or PR.
Thank you,
Andrew
[1]:
The list looks good to me.
Thanks Andrew.
Thanks for sharing the information.
LGTM.
On Wed, Dec 20, 2023, 21:54 L. C. Hsieh wrote:
> The list looks good to me too.
>
> Thanks Andrew.
>
> On Wed, Dec 20, 2023 at 12:31 PM Raphael Taylor-Davies
> wrote:
> >
> > > thus not join the DataFusion PMC
> >
> > This is correct, I don't
The list looks good to me too.
Thanks Andrew.
On Wed, Dec 20, 2023 at 12:31 PM Raphael Taylor-Davies
wrote:
>
> > thus not join the DataFusion PMC
>
> This is correct, I don't currently have sufficient bandwidth to be able to
> perform such a role to the level that I would expect of others, in
> thus not join the DataFusion PMC
This is correct, I don't currently have sufficient bandwidth to be able to
perform such a role to the level that I would expect of others, in addition to
my existing commitments.
Kind Regards,
Raphael
On 20 December 2023 20:19:56 GMT, Andy Grove wrote:
This list LGTM.
On Wed, Dec 20, 2023 at 1:11 PM Andrew Lamb wrote:
> Hello,
>
> As we have discussed previously [1], we are planning to propose [2]
> "graduating" the DataFusion to its own top level Apache project.
>
> I would like to discuss the initial PMC members for the new top level
>
Hello,
As we have discussed previously [1], we are planning to propose [2]
"graduating" DataFusion to its own top-level Apache project.
I would like to discuss the initial PMC members for the new top level
project. The suggestion in [1] is
> All existing Arrow Committers and PMC members
I am attempting to organize a face-to-face meetup of the DataFusion community,
if anyone is interested [1], and would like to solicit input from anyone who
has it
Andrew
[1]: https://github.com/apache/arrow-datafusion/discussions/8152
I would like to bring wider visibility to a proposal[1] to unify the
function interface in DataFusion.
The TLDR is to remove BuiltInScalarFunction and ensure all functions can be
implemented as `ScalarUDF`
There is a small API change to ScalarUDF proposed as well[2].
If you are interested, please
Thanks Marko for the detailed answer and the references.
Cheers!
On Mon, Aug 21, 2023 at 1:57 PM Marko Grujic wrote:
> Hi Akshara,
>
> > Just for my understanding - the proposal assumes that writes will result in
> > a new table version correct?
>
> Actually, the implementation I had in mind
Hi Akshara,
> Just for my understanding - the proposal assumes that writes will result in
> a new table version correct?
Actually, the implementation I had in mind does not make any assumptions about
the behaviour of writes, it only accounts for the fact that there may be
different versions of
Hi Marko,
Indeed, most databases do support time travel/stale reads (especially
distributed databases), hence an important feature, IMHO.
Just for my understanding - the proposal assumes that writes will result in
a new table version correct?
Asking since, some databases provide stale read support
Hi all!
I'm wondering what people think of a possibility to extend DataFusion so as
to accommodate time-travel querying? This would work well with the new
table formats, particularly Iceberg and Delta Lake, where table versioning
is at the core of the protocol.
You can see some details in the
The PR is now merged. Thank you very much to everyone who added their
commentary
On Thu, Aug 3, 2023 at 8:27 AM Andrew Lamb wrote:
> BTW I would like to bring attention to the following PR: [1]
>
> It has some non trivial changes:
> 1. date/time arithmetic is done using Durations (e.g. X
BTW I would like to bring attention to the following PR: [1]
It has some non trivial changes:
1. date/time arithmetic is done using Durations (e.g. X milliseconds)
rather than Intervals (e.g. "months") which makes behavior consistent
2. Changes the output type of Decimal128 division to avoid
I would like to invite anyone with opinions or perspectives from the
community to participate in two ongoing discussions about DataFusion and
its future.
* Move Apache Arrow DataFusion to a new top-level Apache project [1]
* Goals / Vision for DataFusion [2]
Thank you,
Andrew
[1]:
I would like to solicit opinions on a PR with proposed changes to the
DataFusion memory manager scheme [1]. If you would like to offer feedback,
please do so on the PR
Thank you,
Andrew
[1] https://github.com/apache/arrow-datafusion/pull/4522
I am hoping to start preparing for a release of datafusion 7.0.0 to
crates.io in the next few days: Let's coordinate on [1]
Thanks
Andrew
[1] https://github.com/apache/arrow-datafusion/issues/1587
Greetings fellow Rustaceans,
I wanted to bring some extra attention to a PR[1] contributed by Yijie Shen
that adds an initial resource management API to datafusion.
While the implementation will likely undergo the normal iteration,
iterating on the APIs will likely be more disruptive. Thus I