The blog post is now live at
https://arrow.apache.org/blog/2024/03/06/comet-donation/
On Thu, Feb 29, 2024 at 9:32 AM Andrew Lamb wrote:
In case anyone is interested, we are working on a blog post related to
this donation here [1]. All feedback more than welcome.
[1] https://github.com/apache/arrow-site/pull/479
On Mon, Feb 12, 2024 at 1:37 AM Chao Sun wrote:
Thank you all for the great support and interest in this project!
On Sun, Feb 11, 2024 at 12:51 PM Wes McKinney wrote:
Congrats all! It's great to see the Arrow+DataFusion ecosystem expand in
this way and to bring the work under the ASF umbrella.
On Sun, Feb 11, 2024 at 5:02 AM Andrew Lamb wrote:
As a follow up here the acceptance vote [1] has passed, the IP Clearance
Process is complete [2] and the code PR is merged [3]!
It is a very exciting time! Congratulations to all involved.
Andrew
[1]: https://lists.apache.org/thread/cyfyb96sssmpr73hhm7vh8jcdjbz8rsp
[2]: https://github.com/apache/a
For those that are interested wrt lang types/lines...
Language    files    blank    comment    code
Rust
Thanks Jacques and everyone here for the feedback! We just created a
PR https://github.com/apache/arrow-datafusion-comet/pull/1 for the
donation vote and IP clearance. Please take a look there and provide
your valuable comments.
Best,
Chao
On Thu, Jan 18, 2024 at 5:24 PM Jacques Nadeau wrote:
Yes, that was roughly what I was requesting (I was suggesting a single PR
with many commits that would be merged with the history).
It's hard to provide a more concrete opinion on this without seeing the
quantity and complexity of the code. If it's 5,000 lines of code, it
probably doesn't matter.
Hi Jacques,
Do you mean instead of a single PR, we modify (e.g., git commit amend)
all the commits that we have internally to remove any sensitive
information, and open PRs for them against the above repo?
I understand this will help readability and maintenance of the code,
but it will be a lot o
Thanks for the quick response Chao.
My experience on these things is that maintaining commit history for large
codebases can be invaluable for tracking down issues. (Hey, why is this
code written this way-- oh, it was part of x patch that was trying to
achieve y).
In the past, I've used git commi
Hi Andy and Jacques,
Thanks for setting the repo up. Yes, we are working on cleaning up the
internal repo and preparing to open a PR in the next few days.
It's a bit difficult to retain the original commit history in the PR
though since some of them contain internal info which we need to
remove up
Hey Chao, it would be great for you to share the code some place with
commit history. (PR to the repo that Andy made or something else.)
On Mon, Jan 15, 2024 at 7:38 AM Andy Grove wrote:
Hi Chao,
I have created https://github.com/apache/arrow-datafusion-comet and you
should be able to create a PR against the repo.
Thanks,
Andy.
On Fri, Jan 12, 2024 at 3:45 PM Chao Sun wrote:
Thanks all for the positive support!
Andy, we plan to name the project Comet (BTW if you have better
suggestions please let us know). Could you help to create a repo named
arrow-datafusion-comet or arrow-comet? We'll clean up our internal
repo and prepare for the donation in the next few days. Tha
I think the next step here would be to create a new repo so that Chao can
create a PR for the contribution, and then we can proceed to a vote.
Chao - do you have a proposal for the name of the project? Given that this
is being donated to Apache Arrow, the repo name will start with "arrow-".
Also,
As Andrew Lamb mentioned, blaze-rs has similar goals; I'd really be
interested to see some comparisons when the donation is made.
All in all, I look forward to the new native project for Spark acceleration.
On Thu, Jan 11, 2024 at 9:50 PM Andrew Lamb wrote:
> I am very supportive of this do
It sounds like there is likely enough support for this to move forward. I'd
guess the next steps are to work on the donation process/vote. Probably
someone more involved with DataFusion should help drive this effort?
On Thu, Jan 11, 2024 at 12:55 PM L. C. Hsieh wrote:
Spark, as a widely used computation engine in industry, has its
momentum from developers and users.
I believe that the integration with DataFusion can not only help
drive Spark to the next level of performance with
a new native execution engine, but can also attract more developer
attention int
Full disclosure: I worked on the original value vector implementation that
became Apache Arrow and currently work with Chao et al. on the native
engine that is being discussed.
I believe that integration of DataFusion with Spark will drive both
development and user interest in arrow-rs and DataFusi
I am very supportive of this donation. I know of at least one other
DataFusion-based project, blaze-rs[1], which has the same design goal and
bringing this project into the ASF may help consolidate these efforts.
As Andy said, I believe it was very valuable to have a major consumer
project (e.g. Da
Hi Chao,
This sounds like a really interesting project. I am interested in seeing
how it compares to Spark RAPIDS (the project that I work on at NVIDIA) and
Intel's Gluten project (that works with Velox).
I can see the following benefits of having this project being under Apache
Arrow governance:
Thanks Micah for the quick response.
> Would Spark itself not be a reasonable place for this work?
We considered Spark as well but decided that Arrow is a better home,
given that the project itself is heavily tied to DataFusion. A
lot of the work in this project is to convert Spark physical p
Hi Chao,
Very cool. I think this is something that a lot of people are interested
in. I think the main questions I have are:
1. Would Spark itself not be a reasonable place for this work?
2. Do you anticipate this would move with DataFusion to its own top-level
project [1] if that happens or sta