Re: [DISCUSS] The state of the project

Josh Elser Wed, 05 Oct 2016 10:59:53 -0700

Anil/Jungtaek,

Thanks for the feedback!

It would be great if we could spin off this part of the discussion into concrete ways that "learning core concepts" could be improved since you both have experience. A new email thread to outline what is lacking, what is necessary to know, and how best to convey that knowledge with the deliverable of some JIRA issue(s) (with patches too!) would be awesome.


- Josh

Jungtaek Lim wrote:

Thanks all for putting effort into maintaining and improving this amazing project.
Storm SQL heavily relies on Calcite, and it really gives lots of benefits.

Btw, I second CPC's opinion: some areas, like using Calcite as a JDBC
driver (and adapter), are well documented, but other areas that require
understanding core concepts (like integrating Calcite into another project)
are not documented and are left to the individual's understanding of Calcite.

Coincidentally, the slides Jordan shared today are a great explanation of
the core concepts of Calcite. Since their content is fitted to a
presentation, it would be great to have a doc version of the slides (I mean
with more text and explanation) on the website.

Thanks again,
Jungtaek Lim (HeartSaVioR)

On Wed, Oct 5, 2016 at 3:27 PM, CPC <acha...@gmail.com> wrote:

I think Calcite is an amazing project, and let me thank you for all your
efforts. To bring in more users and new committers, I think documentation is
really important. As a user, it took a vast amount of my time to understand
the concepts and other things, because without some core concepts it is hard
to know where to look. I think it would be good to have some docs on the
core concepts and their representations in Calcite.

Regards...
Anil halil

On Oct 5, 2016 02:57, "Josh Elser"<els...@apache.org>  wrote:


Julian Hyde wrote:

Hi Calcite community members,

In a few weeks (22nd October) it will be a year since Calcite graduated
to a top-level Apache project[1]. I think it’s been a good year!

When we graduated, we decided to have an annual “state of the project”
discussion and to vote for a new PMC chair/VP[2]. So, I’m kicking off both
of those discussions.

First, a few of my thoughts.

I am pleased with the general rate of progress of the project. I’m
pleased to see an increasing number of contributions from new contributors,
and some of those becoming committers and PMC members. A couple of
highlights this year were adapters for Cassandra and Elasticsearch that
came out of the blue. I’m also pleased that we have continued a regular
release cadence. This makes it easier for projects to use Calcite, and
knowing that pull requests will be promptly reviewed and included in a
release gives people an incentive to contribute.

Calcite is becoming an ever better optimizer for SQL queries. This is
helped immeasurably by the fact that Hive, Phoenix, Drill, Qubole and
others are using Calcite for this and are contributing back. (Thanks to
those communities for their continued collaboration!)

+1 this has been awesome to watch :)

But I also believe that Calcite can be used for non-traditional databases.

Some examples:

1. I am a fan of what Drill have done with schema-less query processing
and document-oriented data, and would like to bring similar functionality
into core Calcite.

I remember seeing a presentation on Drill a while back (very much an
"intro to Drill", by someone not affiliated with Calcite either). The way
the content was presented made Calcite's influence on their architecture
very clear. Very cool to see!

2. I also like the idea of Calcite being a “toolkit” from which one can
build a database (relational or non-relational). Phoenix have been going
through the process of converting their existing parser & planner to use
Calcite, and I have learned a lot. But a lot still needs to be done to
make Calcite easier to use as a framework.

3. I have been building consensus that SQL is a great language for stream
processing[3], and working with Apex, Flink, Samza, Storm to build the
pieces to implement streaming SQL. I am very excited about the way
streaming SQL is gaining acceptance. Are there any other emerging areas
that Calcite should be targeting?

Avatica continues to grow and mature. The Avatica site now lists clients
in 4 languages[4], and there is also an ODBC driver (not open source)[5].
The “one repo, one community, two web sites, two releases” strategy seems
to be working adequately. But where do we see the project going? Would it
help if it had its own namespace (org.apache.avatica) or web site
(http://avatica.apache.org)? Might it be a top-level project someday?

I think, in time, Avatica could easily grow into its own entity. I don't
think we're there yet.

I will say that I think there's been a regular amount of confusion with
Avatica and Calcite sharing a repository but not following the same
versioning scheme. People seem to be a bit confused when I tell them that
the two projects are not "attached" to one another (they are separate
Maven projects).

I think pulling Avatica into its own repo would be good encouragement toward
it becoming its own entity (as well as drawing the line between Calcite and
Avatica codebases), but I think this is low-priority (as there are few of
us doing Avatica work) and we need to do a better job at clearly stating
what Avatica is/does and its API.

Regarding community. Are we doing enough to reach out and bring new
members into the community? Some of us have given talks at conferences and
meetups over the last 12 months. Could we improve our geographical reach?
Are there other things we could do to make the project more welcoming to
new contributors? Could we do more to reach out to women and other
demographic groups underrepresented in our community?

What else are we doing well in the project? What are areas where we need
to do better?

As you outlined, adoption across other projects has been great. What about
adoption by users? I know the last time I tried to hack some SQL system
together with Calcite (albeit, quite a while ago), I was left wondering
what is "public API" (what are the classes I should use versus what are
those that are "internal"). I think we still see a fair amount of requests
for "hand-holding" as well. I'm not sure how we make this better (or if
it's the best use of time -- the csv example goes very far already!). Just a
comment.

Lastly, since I agreed to step down as VP after 12 months, let’s start
talking about a replacement. Being PMC chair is a privilege and it has
taught me a huge amount about how Apache works. I think that Jesús Camacho
Rodríguez could do an excellent job, if he is willing. Which other
candidates should we consider?

+1 to Jesús if he's amenable to it. He's been a pleasure to work with and
I'd have no complaints. Also happy to entertain others who would like to
step up (without volunteering them myself ;P)

Please take some time to share your thoughts about the state of the project.

Julian

(VP Apache Calcite)

[1] http://calcite.apache.org/news/2015/10/22/calcite-graduates/

[2] http://mail-archives.apache.org/mod_mbox/incubator-calcite-dev/201509.mbox/%3CCF8D6F96-706F-4502-B41D-0689E357209D%40apache.org%3E

[3] https://calcite.apache.org/community/#streaming-sql

[4] http://calcite.apache.org/avatica/docs/#clients

[5] https://hortonworks.com/hadoop-tutorial/bi-apache-phoenix-odbc/
