houqp commented on a change in pull request #1104:
URL: https://github.com/apache/arrow-datafusion/pull/1104#discussion_r730534093



##########
File path: docs/source/specification/roadmap.md
##########
@@ -0,0 +1,92 @@
+\<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Roadmap
+
+This document describes high level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.
+
+DataFusion and Ballista are part of the [Apache
+Arrow](https://arrow.apache.org/) project and governed by the Apache
+Software Foundation governance model. These projects are entirely driven by
+volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you read the [before
+starting](https://arrow.apache.org/docs/developers/contributing.html#before-starting)

Review comment:
       It looks like Arrow's contributing guide is no longer applicable to us; 
for example, it mentions the use of JIRA in many places. Perhaps we can 
just delete the last sentence.

##########
File path: docs/source/specification/roadmap.md
##########
@@ -0,0 +1,92 @@
+\<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Roadmap
+
+This document describes high level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.
+
+DataFusion and Ballista are part of the [Apache
+Arrow](https://arrow.apache.org/) project and governed by the Apache
+Software Foundation governance model. These projects are entirely driven by
+volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you read the [before
+starting](https://arrow.apache.org/docs/developers/contributing.html#before-starting)
+recommendations to minimize surprises during code review.
+
+# DataFusion
+
+DataFusion's goal is to become the embedded query engine of choice
+for new analytic applications, by leveraging the unique features of
+[Rust](https://www.rust-lang.org/) and [Apache Arrow](https://arrow.apache.org/)
+to provide:
+
+1. Best-in-class single node query performance
+2. A feature-complete declarative SQL query interface compatible with PostgreSQL
+3. A feature-rich procedural interface for creating and running execution plans
+4. High performance, erogonomic extensibility points at at every layer
+
+The
+
+## Additional SQL Language Features
+
+- Complete support list on [status](https://github.com/apache/arrow-datafusion/blob/master/README.md#status)
+- Timestamp Arithmetic [#194](https://github.com/apache/arrow-datafusion/issues/194)
+- SQL Parser extension point
+- Support for nested structures (fields, lists, structs)

Review comment:
       ```suggestion
   - Support for nested structures (fields, lists, structs) [#119](https://github.com/apache/arrow-datafusion/issues/119)
   ```

##########
File path: docs/source/specification/roadmap.md
##########
@@ -0,0 +1,92 @@
+\<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Roadmap
+
+This document describes high level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.
+
+DataFusion and Ballista are part of the [Apache
+Arrow](https://arrow.apache.org/) project and governed by the Apache
+Software Foundation governance model. These projects are entirely driven by
+volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you read the [before
+starting](https://arrow.apache.org/docs/developers/contributing.html#before-starting)
+recommendations to minimize surprises during code review.
+
+# DataFusion
+
+DataFusion's goal is to become the embedded query engine of choice
+for new analytic applications, by leveraging the unique features of
+[Rust](https://www.rust-lang.org/) and [Apache Arrow](https://arrow.apache.org/)
+to provide:
+
+1. Best-in-class single node query performance
+2. A feature-complete declarative SQL query interface compatible with PostgreSQL
+3. A feature-rich procedural interface for creating and running execution plans
+4. High performance, erogonomic extensibility points at at every layer
+
+The

Review comment:
       Unfinished sentence?

##########
File path: docs/source/specification/roadmap.md
##########
@@ -0,0 +1,92 @@
+\<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Roadmap
+
+This document describes high level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.
+
+DataFusion and Ballista are part of the [Apache
+Arrow](https://arrow.apache.org/) project and governed by the Apache
+Software Foundation governance model. These projects are entirely driven by
+volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you read the [before
+starting](https://arrow.apache.org/docs/developers/contributing.html#before-starting)
+recommendations to minimize surprises during code review.
+
+# DataFusion
+
+DataFusion's goal is to become the embedded query engine of choice
+for new analytic applications, by leveraging the unique features of
+[Rust](https://www.rust-lang.org/) and [Apache Arrow](https://arrow.apache.org/)
+to provide:
+
+1. Best-in-class single node query performance
+2. A feature-complete declarative SQL query interface compatible with PostgreSQL
+3. A feature-rich procedural interface for creating and running execution plans
+4. High performance, erogonomic extensibility points at at every layer

Review comment:
       What do you think about adding one of Rust's core strengths in this 
sentence?
   
   ```suggestion
   4. High performance, data race free, ergonomic extensibility points at every layer
   ```

##########
File path: docs/source/specification/roadmap.md
##########
@@ -0,0 +1,92 @@
+\<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Roadmap
+
+This document describes high level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.
+
+DataFusion and Ballista are part of the [Apache
+Arrow](https://arrow.apache.org/) project and governed by the Apache
+Software Foundation governance model. These projects are entirely driven by
+volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you read the [before
+starting](https://arrow.apache.org/docs/developers/contributing.html#before-starting)
+recommendations to minimize surprises during code review.
+
+# DataFusion
+
+DataFusion's goal is to become the embedded query engine of choice
+for new analytic applications, by leveraging the unique features of
+[Rust](https://www.rust-lang.org/) and [Apache Arrow](https://arrow.apache.org/)
+to provide:
+
+1. Best-in-class single node query performance
+2. A feature-complete declarative SQL query interface compatible with PostgreSQL
+3. A feature-rich procedural interface for creating and running execution plans
+4. High performance, erogonomic extensibility points at at every layer
+
+The
+
+## Additional SQL Language Features
+
+- Complete support list on [status](https://github.com/apache/arrow-datafusion/blob/master/README.md#status)
+- Timestamp Arithmetic [#194](https://github.com/apache/arrow-datafusion/issues/194)
+- SQL Parser extension point
+- Support for nested structures (fields, lists, structs)
+- Remaining Set Operators (`INTERSECT` / `EXCEPT`) [#1082](https://github.com/apache/arrow-datafusion/issues/1082)
+- Run all queries from the TPCH benchmark (see [milestone](https://github.com/apache/arrow-datafusion/milestone/2) for more details)
+
+## Query Optimizer
+
+- Additional constant folding / partial evaluation [#1070](https://github.com/apache/arrow-datafusion/issues/1070)
+- More sophisticated cost based optimizer for join ordering
+
+## Runtime / Infrastructure
+
+- Better support for reading data from remote filesystems (e.g. S3) without caching it locally
+- Migrate to some sort of arrow2 based implementation (see [milestone](https://github.com/apache/arrow-datafusion/milestone/3) for more details)
+

Review comment:
       ```suggestion
   - Add DataFusion to h2oai/db-benchmark [147](https://github.com/apache/arrow-datafusion/issues/147)
   - Improve build time [348](https://github.com/apache/arrow-datafusion/issues/348)
   ```

##########
File path: docs/source/specification/roadmap.md
##########
@@ -0,0 +1,92 @@
+\<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Roadmap
+
+This document describes high level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.
+
+DataFusion and Ballista are part of the [Apache
+Arrow](https://arrow.apache.org/) project and governed by the Apache
+Software Foundation governance model. These projects are entirely driven by
+volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you read the [before
+starting](https://arrow.apache.org/docs/developers/contributing.html#before-starting)
+recommendations to minimize surprises during code review.
+
+# DataFusion
+
+DataFusion's goal is to become the embedded query engine of choice
+for new analytic applications, by leveraging the unique features of
+[Rust](https://www.rust-lang.org/) and [Apache Arrow](https://arrow.apache.org/)
+to provide:
+
+1. Best-in-class single node query performance
+2. A feature-complete declarative SQL query interface compatible with PostgreSQL
+3. A feature-rich procedural interface for creating and running execution plans
+4. High performance, erogonomic extensibility points at at every layer
+
+The
+
+## Additional SQL Language Features
+
+- Complete support list on [status](https://github.com/apache/arrow-datafusion/blob/master/README.md#status)
+- Timestamp Arithmetic [#194](https://github.com/apache/arrow-datafusion/issues/194)
+- SQL Parser extension point

Review comment:
       ```suggestion
   - SQL Parser extension point [#533](https://github.com/apache/arrow-datafusion/issues/533)
   ```

##########
File path: docs/source/specification/roadmap.md
##########
@@ -0,0 +1,92 @@
+\<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Roadmap
+
+This document describes high level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.
+
+DataFusion and Ballista are part of the [Apache
+Arrow](https://arrow.apache.org/) project and governed by the Apache
+Software Foundation governance model. These projects are entirely driven by
+volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you read the [before
+starting](https://arrow.apache.org/docs/developers/contributing.html#before-starting)
+recommendations to minimize surprises during code review.
+
+# DataFusion
+
+DataFusion's goal is to become the embedded query engine of choice
+for new analytic applications, by leveraging the unique features of
+[Rust](https://www.rust-lang.org/) and [Apache Arrow](https://arrow.apache.org/)
+to provide:
+
+1. Best-in-class single node query performance
+2. A feature-complete declarative SQL query interface compatible with PostgreSQL
+3. A feature-rich procedural interface for creating and running execution plans

Review comment:
       I think it would be good to mention our DataFrame API as well, for those 
who come from the pandas and Spark worlds. Or is it considered part of the 
plan management procedural interface?

##########
File path: docs/source/specification/roadmap.md
##########
@@ -0,0 +1,92 @@
+\<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Roadmap
+
+This document describes high level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.
+
+DataFusion and Ballista are part of the [Apache
+Arrow](https://arrow.apache.org/) project and governed by the Apache
+Software Foundation governance model. These projects are entirely driven by
+volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you read the [before
+starting](https://arrow.apache.org/docs/developers/contributing.html#before-starting)
+recommendations to minimize surprises during code review.
+
+# DataFusion
+
+DataFusion's goal is to become the embedded query engine of choice
+for new analytic applications, by leveraging the unique features of
+[Rust](https://www.rust-lang.org/) and [Apache Arrow](https://arrow.apache.org/)
+to provide:
+
+1. Best-in-class single node query performance
+2. A feature-complete declarative SQL query interface compatible with PostgreSQL
+3. A feature-rich procedural interface for creating and running execution plans
+4. High performance, erogonomic extensibility points at at every layer
+
+The
+
+## Additional SQL Language Features
+
+- Complete support list on [status](https://github.com/apache/arrow-datafusion/blob/master/README.md#status)
+- Timestamp Arithmetic [#194](https://github.com/apache/arrow-datafusion/issues/194)
+- SQL Parser extension point
+- Support for nested structures (fields, lists, structs)
+- Remaining Set Operators (`INTERSECT` / `EXCEPT`) [#1082](https://github.com/apache/arrow-datafusion/issues/1082)
+- Run all queries from the TPCH benchmark (see [milestone](https://github.com/apache/arrow-datafusion/milestone/2) for more details)
+
+## Query Optimizer
+
+- Additional constant folding / partial evaluation [#1070](https://github.com/apache/arrow-datafusion/issues/1070)
+- More sophisticated cost based optimizer for join ordering
+
+## Runtime / Infrastructure
+
+- Better support for reading data from remote filesystems (e.g. S3) without caching it locally

Review comment:
       ```suggestion
   - Better support for reading data from remote filesystems (e.g. S3) without caching it locally [#907](https://github.com/apache/arrow-datafusion/issues/907) [#1060](https://github.com/apache/arrow-datafusion/issues/1060)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


