andygrove commented on issue #23194: URL: https://github.com/apache/datafusion/issues/23194#issuecomment-4857444576
> We (Datadog) will gladly donate https://github.com/datafusion-contrib/datafusion-distributed to Apache if that implies hosting it as a new crate in https://github.com/apache/datafusion. We've built the crate leaving that door open from the beginning, both from a philosophy and code standpoint.
>
> We've discussed this in the past, but at that moment it was not the right time. Now that we've been running it in production at a huge scale for a while and we can no longer break things and move fast, the door is open.

Thanks @gabotechs.

I don't think it would be appropriate to host a complete distributed query engine (Ballista or datafusion-distributed) in the core datafusion repo. Donating it to the project as a standalone repo is a different discussion (I'd be happy to comment on that too, if that is useful).

My proposal was to add a minimal amount of code to help catch regressions that break the distributed model.

I'm going to study datafusion-distributed over the next couple of days to understand the architecture and how it differs from Ballista, so that I can make sure I am not proposing something that would help with Ballista but not with datafusion-distributed.
