Hi,

Currently, we have calcite-avatica and calcite in different repositories.
Frankly speaking, I do not know what benefit the split brings; it does,
however, create points of friction:
1) If a feature touches both Avatica and Calcite, then PRs are hard to
create and maintain. We simply can't create a single PR across both
repositories.
2) If Calcite supports only a single Avatica version, then the point of
having different repositories becomes even more moot.
3) CI configuration is basically duplicated: every time we want to add a
new JDK (once every 6 months), we have to do it twice.
4) There are common dependencies: JUnit, Hamcrest, and so on. We basically
have to do the same work twice when upgrading versions in avatica and
calcite.
5) Adding @Nullable annotations to Calcite was more complicated than I
wanted because Avatica is stored in a different repository.
I basically had to create a bunch of astub files instead of just putting
the relevant @Nullable annotations on top of Avatica classes:
https://github.com/apache/calcite/tree/f1db79fb876ac9ba3c405283e99bb0438e4e97be/src/main/config/checkerframework/avatica

Recently there was a PR that improves error messages in Avatica:
https://github.com/apache/calcite-avatica/pull/161
I am sure the PR is a great improvement; however, it fails CI on both
sides:
a) Current Avatica fails when it runs integration tests against Calcite
(because Calcite expects the old, low-detail exception messages)
b) Current Calcite fails to build with "latest Avatica" because, well,
Avatica produces "too good" exception messages

This surfaces a real problem: the code of the two "different" systems is
too tightly coupled, and it probably makes sense to have both libraries in
a single repository.

An alternative option is to make sure Calcite "supports" at least two
Avatica versions: "previous version + one new".

However, the current tests in Calcite expect a specific error message, so
they can't support two alternative messages.
Well, the tests are in .iq format, which could probably support multiple
messages; however, I have absolutely no idea how to implement that.
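
For tests written in plain Java (not the .iq files), a check that tolerates
both the old and the new message could look roughly like the sketch below.
The class name, method name, and message strings are made up for
illustration; they are not taken from Avatica or Calcite.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: accepts an error message if it matches any known variant. */
public class TolerantMessageCheck {

  /** Returns true if the actual message contains at least one accepted variant. */
  static boolean matchesAny(String actual, List<String> acceptedVariants) {
    if (actual == null) {
      return false;
    }
    for (String variant : acceptedVariants) {
      if (actual.contains(variant)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Placeholder strings standing in for "previous version" and "new version" messages.
    List<String> accepted = Arrays.asList("old-style message", "new-style message");
    String actual = "wrapper: new-style message with extra detail";
    if (!matchesAny(actual, accepted)) {
      throw new AssertionError("Unexpected error message: " + actual);
    }
  }
}
```

This only covers Java assertions; whether the .iq format can express the
same "either of these" expectation is still an open question.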

Facts so far:

* Avatica has fewer commits than Calcite, so having a separate
calcite-avatica repository does not help to segregate the PR/issue/commit
queues
* Calcite seems to support one specific Avatica version only, so it makes
sense to just keep them in a single repository
* calcite-avatica-go seems to reside in its own repository, so I do not see
why we split the Java implementations across the calcite and
calcite-avatica repositories
* There is non-trivial maintenance overhead (see points 1-5 above). Frankly
speaking, I was trying my best to **avoid** maintaining calcite-avatica.
Somebody wanted Avatica to live in a separate repository, so I let them do
what they wanted there.
However, there are cases when I have to spend extra time because
calcite-avatica is a separate repository (PR#161 and @Nullable are the
recent examples)
* It looks like I broke the build by merging PR#161. That is why I am
trying to roll things forward and am raising this discussion.
An alternative option is that I revert the merge and wait for somebody
else to pick up the task.

So my questions are:

Q1) Does having calcite-avatica as a separate repository do anybody any
good?
Q2) Does anybody object to merging calcite-avatica and calcite into a
single calcite repository?

Vladimir
