+1 (non-binding)

On Fri, Mar 1, 2024 at 18:58 kazuyuki tanimura <ktanim...@apple.com.invalid>
wrote:

> +1 (non-binding)
>
> Kazu
>
> > On Mar 1, 2024, at 5:44 PM, L. C. Hsieh <vii...@gmail.com> wrote:
> >
> > +1 (binding)
> >
> > On Fri, Mar 1, 2024 at 1:25 PM Joris Van den Bossche
> > <jorisvandenboss...@gmail.com> wrote:
> >>
> >> +1 (binding)
> >>
> >> On Fri, 1 Mar 2024 at 22:18, Sutou Kouhei <k...@clear-code.com> wrote:
> >>>
> >>> +1
> >>>
> >>> In <CAFhtnRy2J9GCU6e2K56-KPVc=gawemuipeyhmnwcd+htkfa...@mail.gmail.com>
> >>>  "[VOTE] Move Arrow DataFusion Subproject to new Top Level Apache Project" on Fri, 1 Mar 2024 06:33:08 -0500,
> >>>  Andrew Lamb <al...@influxdata.com> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> As we have discussed[1][2] I would like to vote on the proposal to
> >>>> create a new Apache Top Level Project for DataFusion. The text of the
> >>>> proposed resolution and background document is copy/pasted below.
> >>>>
> >>>> If the community is in favor of this, we plan to submit the resolution
> >>>> to the ASF board for approval with the next Arrow report (for the
> >>>> April 2024 board meeting).
> >>>>
> >>>> The vote will be open for at least 7 days.
> >>>>
> >>>> [ ] +1 Accept this Proposal
> >>>> [ ] +0
> >>>> [ ] -1 Do not accept this proposal because...
> >>>>
> >>>> Andrew
> >>>>
> >>>> [1] https://lists.apache.org/thread/c150t1s1x0kcb3r03cjyx31kqs5oc341
> >>>> [2] https://github.com/apache/arrow-datafusion/discussions/6475
> >>>>
> >>>> ---------- Proposed Resolution ---------
> >>>>
> >>>> Resolution to Create the Apache DataFusion Project from the Apache
> >>>> Arrow DataFusion Sub Project
> >>>>
> >>>> =============================================================
> >>>>
> >>>> X. Establish the Apache DataFusion Project
> >>>>
> >>>> WHEREAS, the Board of Directors deems it to be in the best
> >>>> interests of the Foundation and consistent with the
> >>>> Foundation's purpose to establish a Project Management
> >>>> Committee charged with the creation and maintenance of
> >>>> open-source software related to an extensible query engine
> >>>> for distribution at no charge to the public.
> >>>>
> >>>> NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> >>>> Committee (PMC), to be known as the "Apache DataFusion Project",
> >>>> be and hereby is established pursuant to Bylaws of the
> >>>> Foundation; and be it further
> >>>>
> >>>> RESOLVED, that the Apache DataFusion Project be and hereby is
> >>>> responsible for the creation and maintenance of software
> >>>> related to an extensible query engine; and be it further
> >>>>
> >>>> RESOLVED, that the office of "Vice President, Apache DataFusion" be
> >>>> and hereby is created, the person holding such office to
> >>>> serve at the direction of the Board of Directors as the chair
> >>>> of the Apache DataFusion Project, and to have primary responsibility
> >>>> for management of the projects within the scope of
> >>>> responsibility of the Apache DataFusion Project; and be it further
> >>>>
> >>>> RESOLVED, that the persons listed immediately below be and
> >>>> hereby are appointed to serve as the initial members of the
> >>>> Apache DataFusion Project:
> >>>>
> >>>> * Andy Grove (agr...@apache.org)
> >>>> * Andrew Lamb (al...@apache.org)
> >>>> * Daniël Heres (dhe...@apache.org)
> >>>> * Jie Wen (jake...@apache.org)
> >>>> * Kun Liu (liu...@apache.org)
> >>>> * Liang-Chi Hsieh (vii...@apache.org)
> >>>> * Qingping Hou (ho...@apache.org)
> >>>> * Wes McKinney (w...@apache.org)
> >>>> * Will Jones (wjones...@apache.org)
> >>>>
> >>>> RESOLVED, that the Apache DataFusion Project be and hereby
> >>>> is tasked with the migration and rationalization of the Apache
> >>>> Arrow DataFusion sub-project; and be it further
> >>>>
> >>>> RESOLVED, that all responsibilities pertaining to the Apache
> >>>> Arrow DataFusion sub-project encumbered upon the
> >>>> Apache Arrow Project are hereafter discharged.
> >>>>
> >>>> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Andrew Lamb
> >>>> be appointed to the office of Vice President, Apache DataFusion, to
> >>>> serve in accordance with and subject to the direction of the
> >>>> Board of Directors and the Bylaws of the Foundation until
> >>>> death, resignation, retirement, removal or disqualification,
> >>>> or until a successor is appointed.
> >>>> =============================================================
> >>>>
> >>>>
> >>>> -------
> >>>>
> >>>>
> >>>> Summary:
> >>>>
> >>>> We propose creating a new top level project, Apache DataFusion, from
> >>>> an existing sub project of Apache Arrow to facilitate additional
> >>>> community and project growth.
> >>>>
> >>>> Abstract
> >>>>
> >>>> Apache Arrow DataFusion[1]  is a very fast, extensible query engine
> >>>> for building high-quality data-centric systems in Rust, using the
> >>>> Apache Arrow in-memory format. DataFusion offers SQL and Dataframe
> >>>> APIs, excellent performance, built-in support for CSV, Parquet, JSON,
> >>>> and Avro, extensive customization, and a great community.
> >>>>
> >>>> [1] https://arrow.apache.org/datafusion/
> >>>>
> >>>>
> >>>> Proposal
> >>>>
> >>>> We propose creating a new top level ASF project, Apache DataFusion,
> >>>> governed initially by a subset of the Apache Arrow project’s PMC and
> >>>> committers. The project’s code is in five existing git repositories,
> >>>> currently governed by Apache Arrow which would transfer to the new top
> >>>> level project.
> >>>>
> >>>> Background
> >>>>
> >>>> When DataFusion was initially donated to the Arrow project, it did not
> >>>> have a strong enough community to stand on its own. It has since grown
> >>>> significantly, benefited immensely from being part of Arrow and from
> >>>> the nurturing of the Apache Way, and now has a community strong enough
> >>>> to stand on its own, one that would benefit from focused governance
> >>>> attention.
> >>>>
> >>>> The community has discussed this idea publicly for more than 6 months
> >>>> https://github.com/apache/arrow-datafusion/discussions/6475  and
> >>>> briefly on the Arrow PMC mailing list
> >>>> https://lists.apache.org/thread/thv2jdm6640l6gm88hy8jhk5prjww0cs. As
> >>>> of the time of this writing both had exclusively positive reactions.
> >>>>
> >>>> Several current members of the Arrow PMC are active contributors to
> >>>> DataFusion, understand and believe deeply in the Apache Way, and
> >>>> play active governance roles in the Arrow project as PMC members and
> >>>> PMC chairs, guiding the community and releasing software versions.
> >>>> With this existing governance experience and structure, the new top
> >>>> level project will be able to function well immediately and
> >>>> independently.
> >>>>
> >>>> Overview of DataFusion
> >>>>
> >>>> Current Status
> >>>>
> >>>> Meritocracy
> >>>>
> >>>> DataFusion has been developed as part of Apache Arrow and thus has
> >>>> been operating as a meritocracy. Many of the developers of DataFusion
> >>>> are Arrow PMC members or committers. The DataFusion project plans to
> >>>> continue adding new PMC members and committers as the project matures
> >>>> and grows.
> >>>>
> >>>> Community
> >>>>
> >>>> The DataFusion development team seeks to foster the development and
> >>>> user communities. We hope that becoming a separate project will help
> >>>> both Arrow and DataFusion communities by being more focused.  Focused
> >>>> governance will make it easier to grow the community of committers and
> >>>> PMC members and make the organization more clear to others.
> >>>>
> >>>> Alignment
> >>>>
> >>>> The ASF is a natural host for DataFusion given that it is already the
> >>>> home of Arrow, Parquet, and other related distributed, storage, and
> >>>> query execution systems.
> >>>>
> >>>> Project Leadership
> >>>>
> >>>> Proposed Initial PMC
> >>>>
> >>>> We propose the following people as the initial DataFusion PMC members.
> >>>> This is a subset of the existing Arrow PMC members who contribute to
> >>>> DataFusion https://people.apache.org/phonebook.html?unix=arrow
> >>>>
> >>>> Andy Grove (agrove): Arrow PMC Chair
> >>>> Andrew Lamb (alamb): Arrow PMC, past Arrow PMC Chair
> >>>> Daniël Heres (dheres): Arrow PMC
> >>>> Jie Wen (jakevin): Arrow PMC, Doris Committer
> >>>> Kun Liu (liukun): Arrow PMC, IoTDB PMC, TSFile PMC
> >>>> Liang-Chi Hsieh (viirya): Arrow PMC, Spark PMC
> >>>> Qingping Hou (houqp): Arrow PMC
> >>>> Wes McKinney (wesm): Arrow PMC, ASF Member
> >>>> Will Jones (wjones127): Arrow PMC
> >>>>
> >>>> We’d like to propose Andrew Lamb as the initial Chair (and thus ASF
> >>>> VP) for the DataFusion project.
> >>>>
> >>>> Affiliations
> >>>>
> >>>> Andy Grove (agrove): NVIDIA
> >>>> Andrew Lamb (alamb): InfluxData
> >>>> Daniël Heres (dheres): Coralogix
> >>>> Jie Wen (jakevin): SelectDB
> >>>> Kun Liu (liukun): eBay
> >>>> Liang-Chi Hsieh (viirya): Apple
> >>>> Qingping Hou (houqp): Scribd
> >>>> Wes McKinney (wesm): Posit
> >>>> Will Jones (wjones127): LanceDB
> >>>>
> >>>> Proposed Initial Committers
> >>>>
> >>>> In addition to the PMC, we propose the following people as the initial
> >>>> DataFusion committers. This is a subset of the existing Arrow
> >>>> committers who contribute to DataFusion
> >>>> https://people.apache.org/phonebook.html?unix=arrow
> >>>>
> >>>> akurmustafa Mustafa Akur (Synnada)
> >>>> avantgardner Brent Gardner (Coralogix)
> >>>> comphead Oleks V. (Unaffiliated)
> >>>> jayzhan Jay Zhan (Unaffiliated)
> >>>> jeffreyvo Jeffrey Vo (Unaffiliated)
> >>>> jiayuliu Liu Jiayu (Airbnb)
> >>>> mete Metehan Yildirim (Synnada)
> >>>> mingmwang Wang Mingming (eBay)
> >>>> mneumann Marco Neumann (InfluxData)
> >>>> nju_yaho Zhong Yanghong (eBay)
> >>>> ozankabak Mehmet Ozan Kabak (Synnada)
> >>>> paddyhoran Paddy Horan (Assured Allies)
> >>>> rdettai Rémi Dettai (Cloudfuse)
> >>>> sunchao Chao Sun (Apple)
> >>>> thinkharderdev Daniel Harris (Coralogix)
> >>>> tustvold Raphael Taylor-Davies (InfluxData)
> >>>> wayne Ruihang Xia (Greptime)
> >>>> xudong963 Xudong Wang (ByteDance)
> >>>> yjshen Yijie Shen (Space and Time)
> >>>> yangjiang Yang Jiang (eBay)
> >>>>
> >>>>
> >>>> Risk Assessments
> >>>>
> >>>> Naming / Trademarks
> >>>>
> >>>> As a sub-project of Arrow, the DataFusion name has been used for over
> >>>> 4 years without any known issues. A podling name search did not turn
> >>>> up any concerns and was approved:
> >>>> https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-219
> >>>>
> >>>> Legal / IP Clearance
> >>>>
> >>>> All DataFusion code has either been donated to the Arrow project with
> >>>> appropriate IP clearance or  has been developed directly under ASF
> >>>> processes and procedures. Thus creating a new top level project poses
> >>>> no new Legal or IP risks.
> >>>>
> >>>> Code Extraction
> >>>>
> >>>> The relevant code is already in 5 separate repositories:
> >>>> https://github.com/apache/arrow-datafusion/
> >>>> https://github.com/apache/arrow-datafusion-python
> >>>> https://github.com/apache/arrow-ballista
> >>>> https://github.com/apache/arrow-ballista-python
> >>>> https://github.com/apache/arrow-datafusion-comet
> >>>>
> >>>> We foresee no issues with code extraction and propose these
> >>>> repositories be renamed to reflect the new top level project.
> >>>>
> >>>> Note:  https://github.com/apache/arrow-rs, the Rust implementation of
> >>>> Arrow, would remain part of the Arrow project.
> >>>>
> >>>> Orphaned Products
> >>>>
> >>>> DataFusion is known to be used in many open source and commercial
> >>>> projects
> >>>> (https://arrow.apache.org/datafusion/user-guide/introduction.html#known-users),
> >>>> has had multiple commits daily for several years, and its adoption and
> >>>> number of contributors appear to be growing. We do not foresee the
> >>>> project being orphaned in the next several years.
> >>>>
> >>>> Inexperience with Open Source
> >>>>
> >>>> The proposed PMC has extensive experience with Apache Arrow and other
> >>>> Apache projects, and includes PMC members, PMC chairs and an ASF
> >>>> Member. The DataFusion PMC and more experienced committers will
> >>>> continue to coach new community members who may be less familiar with
> >>>> the Apache Way.
> >>>>
> >>>> Homogeneous Developers
> >>>>
> >>>> The 9 proposed PMC members are from 9 different employers and the
> >>>> proposed committers are similarly distributed across affiliations. No
> >>>> specific entity employs more than 3 total proposed developers.
> >>>>
> >>>> Reliance on Salaried Developers
> >>>>
> >>>> A substantial amount of work on DataFusion has been done by salaried
> >>>> developers, but it also has a long tradition of attracting
> >>>> contributions from students and hobbyists and we plan no changes in
> >>>> contribution structure.
> >>>>
> >>>> Relationships with Other Apache Products
> >>>>
> >>>> DataFusion will obviously have a strong relationship with the Arrow
> >>>> project given the overlap in people. We don’t foresee close
> >>>> collaboration with other projects at this time.
> >>>>
> >>>> Cryptography
> >>>>
> >>>> DataFusion does not directly support encryption and there are no
> >>>> near-term plans to add support for encryption. Users who need this
> >>>> functionality can use the extension APIs.
> >>>>
> >>>> Required Resources
> >>>>
> >>>> Mailing Lists
> >>>>
> >>>> - priv...@datafusion.apache.org for private PMC discussions (with
> >>>> moderated subscriptions)
> >>>> - d...@datafusion.apache.org
> >>>> - comm...@datafusion.apache.org
> >>>> - u...@datafusion.apache.org
> >>>>
> >>>> Version Control
> >>>>
> >>>> We propose to continue to use Git for source control and GitHub for
> >>>> hosting and testing resources.
> >>>>
> >>>> We also need to rename the GitHub repositories to reflect the new top
> >>>> level names:
> >>>>
> >>>> https://github.com/apache/arrow-datafusion/ → apache/datafusion
> >>>> https://github.com/apache/arrow-datafusion-python → apache/datafusion-python
> >>>> https://github.com/apache/arrow-ballista → apache/datafusion-ballista
> >>>> https://github.com/apache/arrow-ballista-python → apache/datafusion-ballista-python
> >>>> https://github.com/apache/arrow-datafusion-comet → apache/datafusion-comet
> >>>>
> >>>>
> >>>>
> >>>> Issue Tracking
> >>>>
> >>>> DataFusion would continue to use GitHub for its issue tracking and
> >>>> communications.
> >>>>
> >>>> Other Resources
> >>>>
> >>>> The existing repositories already make use of existing Apache
> >>>> infrastructure, and we expect no change in the initial resource usage.
> >>>> As the project continues to grow, we expect continued infrastructure
> >>>> demand growth.
> >>>>
> >>>>
> >>>> FAQ: Has a sub project been promoted to a top level project before?
> >>>>
> >>>> Yes, and it appears to happen commonly. The Arrow project itself was
> >>>> created as a top level project from work that started in Apache Drill,
> >>>> and there are many sub projects of Hadoop that spun out as their own
> >>>> top level projects such as Mahout, Avro and HBase:
> >>>>
> >>>> https://news.apache.org/foundation/entry/the_apache_software_foundation_announces4
> >>>>
> >>>>
> >>>>
> >>>> Related material:
> >>>> Name search request / research for DataFusion:
> >>>> https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-219
> >>>> Discussion about this proposal on the arrow mailing list:
> >>>> https://lists.apache.org/thread/c150t1s1x0kcb3r03cjyx31kqs5oc341
> >>>> Discussion about which repositories on the arrow mailing list:
> >>>> https://lists.apache.org/thread/ob3n0d9ky0bgrryl3xn39w9k566bq00q
> >>>> Discussion about initial PMC on the arrow mailing list:
> >>>> https://lists.apache.org/thread/pymrzcdw4qdptvby85f69rg3pcckl15b
> >>>> Discussion in github about creating a new DataFusion top level
> >>>> project: https://github.com/apache/arrow-datafusion/discussions/6475
> >>>> Discussion about graduating on incubator list:
> >>>> https://lists.apache.org/thread/r4n73pmms1lv0jbohyx1o1z13d615t99
> >>>> Original Proposal for the Arrow project:
> >>>> https://lists.apache.org/thread/x2qzdwglm8pkqp9gv03bbgw17khl7pq3
>
>
