from:"Vladimir Sitnikov"

>[2]
http://ceki.blogspot.com/2010/05/forces-and-vulnerabilites-of-apache.html

^^ It is a must-read indeed.
I wish I read it much earlier than a week ago.

Julian, as I see you did not reply to [1]. It implies you ignore my
suggestions in that thread, or you think they are worse than what you
suggest.

Jokes aside, I see no way how "manual updates of the latest tested java in
all the markdown files" (Julian's idea) could win over "single place to
declare the version and let the machine do the substitution in all the .md
files" (my idea).
Just in case, Julian never asked me if I would volunteer implementing the
autoreplacer.

Julian, I was scared you quit, and my last hope to heal the community was
to organize a call to discuss Ceki's story and try to avoid it in Calcite.

I quit as I see there's an impedance mismatch.

Thank you for inviting me, that was a nice experience.
I learned a lot during these years.

I would like to thank Stamatis a lot as he has done a lot for improving the
community via meetups, nominating committers, and helping others to get out
of the conflicts.

The thing I dislike in Calcite community the most
is that **very** often people start questioning (or even reject) ideas,
they provide no alternatives, and they do not seem to listen.
I agree it is wise to give every idea -100 points at the beginning, and
then decide if the benefits make the net score positive.
However, the community seem to ignore benefits.

I absolutely do not want to hear risks like "oh, adding a new language is a
serious committment, so we should not add X" or "jira is fine, no changes
please".
Those are generic risks, and I believe they add no value to the
conversations.

What scares me A LOT is that I see a very similar pattern in JMeter
community (I'm a member of PMC there), however, the number of active
contributors is way less than in Calcite, so the friction is way less.

Do we really need to discuss the upgrade of junit4 to junit5?
Do we really need to discuss adding fuzzing tests (e.g. like in my
suggestion to use randomized CI matrix)?
Hey, fuzzing is one of the key testing approaches nowadays. Yet Calcite
community rejects it and they fear that random failures are bad.
Come on. You can do better.

Do we even need to discuss that the release notes should be autogenerated
(see release drafter)?
In my opinion, there's nothing to discuss and nothing to object, and the
only question can be "who volunteers" doing that. I often implemented those
types of changes, yet the discussions were far from welcoming.

Of course, the experience was not bad every day.
For instance, "Maven -> Gradle" migration  got a lot of positive feedback
immediately.
https://issues.apache.org/jira/browse/CALCITE-2905
https://lists.apache.org/thread/jw3ovlv5tp3vjcp0o5ztcv8yrzd2okl9
^^ this was a joy

However, as I presented the first prototype, I got a really surprising
response:
"This is not an improvement because it adds a new Kotlin language in a form
of .kts files"
https://lists.apache.org/thread/plb3rjypqrwo34qr3o2fdh2qqofx04y9
^^ this is not fun.
I know what I am doing. I am sure the scripts I added are readable and
understandable.
I know I use up-to-date approaches.
Please, what is the reason to mention a generic risk of "adding a new
language"?

-

Later I rolled @nullable annotations via Checkerframework
https://issues.apache.org/jira/browse/CALCITE-4199

I suggest everybody take a break, think for a couple of seconds and decide
if @nullable annotations in Calcite code
turned out to be helpful or unhelpful (for both developing Calcite, and
using Calcite).

>From my point of view, it was super frustrating to receive feedback like in
https://lists.apache.org/thread/tyxqydxwbt6lkokgobjok083nxqjrhlx
>Vladimir has been trying to fix things that aren’t broken

My bad I did not try contacting Julian directly to align approaches to
commits, reviews and broken/nonbroken things.

Of course, if you consider @nullability annotations harmful and/or useless,
all of them can be removed in a matter of seconds (remove annotations; make
requireNonNull methods trivial and call "inline method in IDEA"; optimize
imports; etc)

Of course, it would be better to gradually migrate to Kotlin (it does not
require lengthy checkerframework verifier), however I just assumed the idea
would be killed no matter what.
For instance, Kotlin has the official style guide, so it makes many
checkstyle verifications obsolete == much less failures like "you missed a
space after comma".



Recently I suggested "migrating to GitHub Issues".
https://lists.apache.org/thread/m2h13v2p7kowglj73qr4sqn1c3pm8tlq

Technically speaking, the results in the DISCUSS thread were not that bad
+1 Vladimir Sitnikov
+0.5 Zhe Hu
+0.5 Jing Zhang
+1 Chunwei Lei
+0 Michael Mior
-0 Josh Elser
(no vote) Stamatis Zampetakis
-0.47 Jacques Nadeau
(no v

Re: Why not make the "test" namespace public, or move to a separate Maven module that main pulls in as a "testOnly" dependency?

Gavin,

Please check sample schemata in
https://github.com/apache/calcite/tree/7a7b37fc1f4bd2121643539b0fd8e9c4f0f8ed09/testkit/src/main/java/org/apache/calcite/test/schemata

If you find useful schemas, don't hesitate filing a PR to add samples to
/testkit/

Vladimir

Re: Release notes preparation via release-drafter

>I'm afraid this is not accurate.

Nothing is accurate, so what do you suggest?

Vladimir

Re: Release notes preparation via release-drafter

>This is an accurate label ? I just want to confirm.

Please rephrase. I don't understand it.

Vladimir

Re: Release notes preparation via release-drafter

>This will request to add another Label to every PR

Release drafter can automatically add labels based on the filenames.

PRs without labels are still acceptable, so having labels is not a
requirement.

Vladimir

Release notes preparation via release-drafter

Hi,

I suggest we start using Release Drafter [1] for the preparation of the
release notes.
I'm going to submit and merge a PR soon, however, I would appreciate it if
somebody else
would do that (see [2] for a minimal commit)

I am sure it would significantly increase the quality of the release notes
and it would make releases easier to prepare.

Jenkins team incepted release-drafter, and they use it a lot.


Here's a sample configuration for Jenkins projects:

https://github.com/jenkinsci/.github/blob/f1a34987f2919f039282a13b8741e3c463d18bc6/.github/release-drafter.yml


Here are the results:
https://github.com/jenkinsci/configuration-as-code-plugin/releases

[1] https://github.com/release-drafter/release-drafter
[2]
https://github.com/vlsi/ksar/commit/ec210ffb7d0f91631908d01976c8bb4d5f7c4e0d

Vladimir

Re: [DISCUSS] Java 16 and 17 support

Julian,

Please refrain from insisting that others should follow the
manual procedures you create.
If you want to have certain manual steps for your own safety, that is fine.
However, don't expect the others will be following all the manual steps you
invent.

I suggest we gather a meetup to discuss exactly that.
I believe it hurts the community A LOT when people object without providing
better alternatives.
I believe it hurts the community when people try solving **non-existing**
issues (see [3]).
I believe it hurts the community when people keep saying "why change, it
works for me" and ignore all the justifications.
I believe it hurts the community when people object over a triviality.

Please, take it seriously.
I do not really want to see the story of "key committers leaving the
project as it happened in log4j 1.x, Mina, etc"

Do you want to see "latest tested Java at the Calcite website"?
Nice. Why don't you make the version appear automatically on all the pages
that need it?
If you lack time, why don't you ask somebody to volunteer on that?

What is the point of asking people doing manual steps and blaming me and
Rui for
inaccurate following the steps you created?

We followed your suggestions, and we are here: I forgot editing howto, and
Rui missed it as well.
Could you please understand adding manual steps NEVER really solves the
"documentation being stale" issue?

Julian>you were not prepared to accept simple solutions

Your suggestion regarding howto.md is not a solution, so I am not prepared
to accept non-solutions.

I suggested multiple trivial (!) proposals to keep the documentation
**always** up to date.
You can ignore them, yet, please don't create obscure manual procedures for
others to follow.

A solution in [1]:
Replace "Prerequisite is ... Gradle (version 7.2) on your path"
with "Prerequisite is ... Gradle (see version in
gradle/wrapper/gradle-wrapper.properties) on your path"
^^ single line change yielding always up-to-date documentation, and it
works for ALL Calcite versions, including the past ones.

Note: there's no harm in keeping gradle-wrapper.properties in the source
release, so

A solution in [2]:
Replace "Prerequisite is ... Gradle (version 7.2) on your path"
with "Prerequisite is ... Gradle {% gradle_version_from_wrapper_properties
%}"

The value should be replaced automatically when building the site.
It might look slightly better on a website, however, it would be slightly
misleading.
If someone tries building a slightly old Calcite version, they would need
to use
a different Java/Gradle version.

---

In the ideal world, the thing should be buildable automatically, or
there should be a small README that explains which tools do you need to be
installed.

I do not really expect people to use howto.md for building from the source
archive.

[1]
https://issues.apache.org/jira/browse/CALCITE-4575?focusedCommentId=17339990=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17339990
[2] https://lists.apache.org/thread/mdxdosofclqbp9p5osflwtor264757m4
[3]
https://issues.apache.org/jira/browse/CALCITE-4575?focusedCommentId=17340320=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17340320

Vladimir

Re: [DISCUSS] Java 16 and 17 support

>For Gradle upgrades, there’s only one place to modify.

Juilan,
Are you willing to keep the place updated?
If so, there's no need to raise mail threads, please just update it.

Vladimir

Re: [DISCUSS] Java 16 and 17 support

2021-12-21 Thread Vladimir Sitnikov

Julian, I know it might be tough, yet:
1) I would like you to believe me I just forgot howto and the other places
hard-code Java versions.
Of course, "we support Java 17" is a notable change, and it does deserve to
be in the release notes,
however, the typical current process is to "file JIRA for notable changes",
so I did just that, and created
CALCITE-4829 Bump Gradle to 7.2 and test with Java 17 at GitHub Actions.

I agree it is unclear how the release manager could deduce they should add
"Calcite X.Y supports Java 17 now" to the release notes.
However, I am sure that could have happened with ANY change.
The release manager often is not in a position to judge which changes are
notable for the release.

2) I previously suggested we should automate the replacement of "the
maximum supported Java".
I believe the current failure highlights that once again.
We can never expect contributors to remember all the places Java versions
can be mentioned.

-

Julian>then the following tasks need to be done:
Julian>Update the Gradle version in howto.md...

I am afraid it does not solve the root cause, so we should do something
different.
New Java and Gradle versions would appear, and we would keep forgetting to
update all the places.

I suggest we add a variable like max_tested_java_version somewhere
(gradle.properties? somewhere else?)
Then we use the variable for building the website and we would check CI
does test with the given version.

^^^ this suggestion will stop those discussions once and forever.
That is basically the same thing I suggested in CALCITE-4575.

>Does Calcite now support Java 16 and 17?

17 is available, so there's no point in mentioning "Java 16 is supported".

Current text:

>Building from a source distribution
>Prerequisite is Java (JDK 8, 9, 10, 11, 12, 13, 14 or 15) and Gradle
(version 7.2) on your path

I would suggest:

>Building from a source distribution
>Prerequisite is Java 8 to ${max_tested_java_version} and Gradle (version
${gradle_version_from_wrapper}) on your path

Julian>Change the subject of
Julian>https://issues.apache.org/jira/browse/CALCITE-4547 to "Support Java
16
Julian>and 17",

-0.97 for mentioning "support Java 16"
We do not support that version, we do not test with Java 16, and we do not
want to add
Java 16 tests as it would significantly increase test times.
We just hope it would work.

Of course, most likely Calcite would work with Java 16 since Calcite works
with 11 and 17.
However, please, Julian, please, let us stop mentioning that we support
irrelevant versions?
None of Java vendors provide LTS for 16.

Vladimir

Re: Build failed in Jenkins: Calcite » Calcite-snapshots #13

2021-12-18 Thread Vladimir Sitnikov

There are issues, and it does fail quite often. It is really frustrating.

I can disable the test if you think it would be better.

Based on past experience, we almost never revisit disabled tests, and it
would look as if "server" passes tests, however, it corrupts data.

Vladimir

Re: Failing task :linq4j:jandexMain when executing CheckerFramework

2021-12-16 Thread Vladimir Sitnikov

>First some annoying :core:checkstyleTest that interferes with files
generated by Intellij IDEA, with hundreds of error messages similar to the
following:

We exclude generated files from the style checks via the following patterns:
https://github.com/apache/calcite/blob/a6293a37dd4166e6fe69ae0c5ef7a89e6677b6ab/build.gradle.kts#L259-L276

Please add the relevant pattern and file the PR.
However, it is really strange that you get the generated files into
generated/sources/version/generated and javacc/javaCCTest/generated_tests
folders.
I don't understand why IDEA would generate files into those folders.
It does not sound like your ImmutableMyProjectFilterRuleConfig is connected
with javaCCTest.
I would try to understand why do you get files into /javaCCTest/ and
/version/ folders.

A side note: Kotlin does have the official coding style, so there's a
formatter that yields nice results and that is consistent with IDEs.
So moving to Kotlin makes many Checkstyle rules obsolete. No joking.

>Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds
for length 0

It is very likely you are using an outdated version of javac which produced
invalid bytecode.
Please upgrade javac (e.g. to the recent patchset).
Note: GitHub CI uses recent patchet updates, so you might fail to reproduce
the issue.

We do know javac versions 1.8 before u202 are extremely buggy, so we fail
the build for those versions.
It might be time to add the same verification for 11, and require 11.0.5 or
later (in case of 11).

>Is there an easy way to fix the problem?

The easy fix is to bump the javac version.
The funny coincidence is that Kotlin solves this issue as well: null safety
is integrated into the regular compilation,
so there's no need for a separate (and time-consuming!) checker framework
compilation.

Vladimir

Re: Build failed in Jenkins: Calcite » Calcite-snapshots #13

2021-12-15 Thread Vladimir Sitnikov

The failure is known as https://issues.apache.org/jira/browse/CALCITE-4824

Does anybody use calcite-server?
Does anybody have cycles to maintain/fix calcite-server?
Can we just remove it?

Vladimir

Re: Build failed in Jenkins: Calcite » Calcite-snapshots #13

2021-12-14 Thread Vladimir Sitnikov

Julian>We don’t need any more noise on the dev list.

Julian at all, please consider ASF Jenkins failures as true failures rather
than "infrastructure noise".
The messages could probably be improved, however, I have not explored them.

Jenkins won't send mails in case everything is ok.

Jacques>if these are real failures or infra problems

I did very that Calcite-snapshots job can pass.
I did configure the job to remove ALL non-git files before each build,
I configured the job to run with --no-build-cache.

Technically speaking, the "workspace cleanup and --no-build-cache" are not
needed, however,
they imply that failures are likely caused by the bug in Calcite code
rather than "infrastructure".

-

In this case, :server:test failed, see "tests tab":
https://ci-builds.apache.org/blue/organizations/jenkins/Calcite%2FCalcite-snapshots/detail/Calcite-snapshots/13/tests?start=0

It looks like there's a concurrency issue in "server" (Calcite or Avatica)

> Caused by: org.apache.calcite.sql.validate.SqlValidatorException:
Number of INSERT target columns (3) does not equal number of source items
(2)
> at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown
Source)
> at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at
org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:505)
> at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:599)
> ... 66 more
44,49c191,195
< +---+---+
< | I | J |
< +---+---+
< | 1 | 2 |
< +---+---+
< (1 row)
---
> +---+---+---+
> | I | J | K |
> +---+---+---+
> +---+---+---+

Vladimir

Re: RelNode semantic evolution vs adapter implementations

2021-12-14 Thread Vladimir Sitnikov

>some direct or indirect changes (metadata, rules, validator, etc)
>may cause changes to plans around the given RelNode in the downstream
>project

Do you think changes like "metadata, rules, validator" can cause "wrong
results" issues?
What could be the issues caused by "metadata, rules, validator" changes?

I know ObjectWeb ASM bytecode manipulation library uses the concepts of
versions for quite some time, and it seems to work for them.

As you subclass, say, ClassVisitor, you have to pass the API version which
your visitor supports.
Then ASM fails in the runtime if the visitor happens to be applied with a
too recent bytecode.

>and gets the source code of the test that he includes into his project and
CI

Source code and bytecode parsing could be an alternative option (for other
cases), however, it is not clear how to tell
if the rule in question has been updated to support the new Project/Filter
contract.

Vladimir

Re: RelNode semantic evolution vs adapter implementations

2021-12-13 Thread Vladimir Sitnikov

>practice I fear that
>it will introduce a lot of red tape and false negatives

I am afraid "wrong results and corrupted data in database" is way worse
than "recompiling in case of Calcite update".

>I don't recall
>any feedback (positive or otherwise) about how we describe breaking
>changes

CALCITE-3183 was breaking in 1.21, and it was NOT listed in release notes
at all.
Everybody makes mistakes, so I would avoid using "release notes" as the
only way to convey the breaking changes.
In the ideal world, there should be a compilation failure (or runtime
failure) in case of semantic change.

>Maybe we can improve how we report breaking changes

I'm afraid people would miss the notifications.
They can upgrade across several calcite versions, so they might miss
"one release inside of 1.21...1.28 has a significant breaking change"

If we increased the major version number, then they would anticipate the
change.
However, the thing is that "the change in Project" does not necessarily
break all consumers,
and it breaks only those consumers that inspect Project contents and/or
write new methods, etc.



>CALCITE-3183 Trimming method for Filter

In that case, the new field was added to LogicalFilter, so custom
implementations of Filter nodes (e.g. MongoFilter)
had no chance to notice the change.

>CALCITE-4913 Correlated variables in a select list

In the current PR 2623, the extra field is added to Project.java, however,
old constructor is kept intact, so even classes that extend Project don't
notice the semantics difference.

My takeaway is that:
1) It looks like we should refrain from adding fields to "Logical*" rels
without adding the corresponding fields to base classes.
2) It looks like we want to change the constructor signature, so the user's
code breaks if they extended Project.
As an alternative option is to keep old constructors and mark them
as @Deprecated, however, that deprecation would mean
"you get wrong results unless you support the new constructor argument
somehow".

3) Unfortunately, the planner rules do not care which signature the
constructors have, so the planner rules
won't notice the semantic difference.
So I would suggest we add per-relnode "version" (or "feature set"), so the
planner rule could declare "I know what Filter up to v5 means".

We do not change node semantics often, however, when we do we want to avoid
accidental "wrong results".

Vladimir

Re: Automatic maven snapshot releases

2021-12-13 Thread Vladimir Sitnikov

>Where is -Pasf defined/documented?

https://calcite.apache.org/docs/howto.html#making-a-snapshot

Vladimir

RelNode semantic evolution vs adapter implementations

2021-12-13 Thread Vladimir Sitnikov

Hi,

It turns out Calcite's nodes can change semantics without notification for
the end-users.

Here are the release notes for Calcite 1.21, and it says **nothing** like
"Ensure you handle Filter#variablesSet in case you implement Filter or in
case you transform LogicalFilter in your rules"
https://calcite.apache.org/news/2019/09/11/release-1.21.0/

On top of that, the in-core adapters fail to handle that properly. For
example, MongoFilter discards Filter#variablesSet.

Can we please stop changing the semantics of RelNodes or can we have a
better way to detect the changes in the client code?

What if we add a "version" property to the corresponding RelNodes, and we
increment it every time the semantic changes?
Then client APIs could be coded like "ok, I'm prepared to handle Project
v4, and Filter v5" (e.g. make "version" required when registering a rule),
and there will be a runtime error in case Calcite generates Filter v6 in
runtime.

---

Sample case:
CALCITE-3183 Trimming method for Filter rel uses wrong traitSet
CALCITE-4913 Correlated variables in a select list are not deduplicated

The case there is that "correlation variables" are added to the Logical*
nodes (e.g. LogicalFilter, LogicalProject).
Unfortunately, that change is hard to notice: there's no compilation
failure, and there's no runtime error.

The old client code just discards "correlation variables", and I guess it
would result in wrong results or something like that.
Silent wrong results is a really sad outcome from the database.

CALCITE-4913 / PR#2623 adds Project#variablesSet property as well, and I
guess it would in the same hidden semantic loss.

Vladimir

Re: Automatic maven snapshot releases

2021-12-12 Thread Vladimir Sitnikov

Jacques, which credentials are you going to supply to publish the snapshots
to the ASF repository from GitHub?

While I agree with moving towards GitHub actions, I would like to minimize
the use of ASF credentials in non-ASF environments.

Luckily, ASF infra supplies the relevant credentials for publishing to ASF
Nexus,
so a mere "./gradlew -Pasf publish" at ASF Jenkins is enough to publish the
snapshots.

Stamatis>In the past we used to have Jenkins jobs doing this

The job was there, so we don't really need to discuss or vote on that.
I've created the corresponding Jenkins job here:
https://ci-builds.apache.org/job/Calcite/job/Calcite-snapshots/

It pulls GitHub every 5 minutes.
I am not sure if GitHub webhook integration would trigger the build (it
might need a config from INFRA), however, it is possible as well.

Vladimir

Re: Enable "squash and merge" button in GitHub UI

2021-12-06 Thread Vladimir Sitnikov

I wonder if we can add GitHub action that would react to committers comment
/squash by squashing the commits and pushing --force-with-lease to the fork
branch.

If pushing to the fork is not possible in GHA, then something like /merge
that squashes, rebases and pushes to apache/calcite would be good as well.

That is slightly less convenient than a UI button, however, it would be way
better than spending time on trivial squashes.

I think that approach would keep the author/committer info, and it would be
easily accessible


Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

2021-12-06 Thread Vladimir Sitnikov

Moving the reply to DISCUSS thread.

Ruben>My vote might be biased because I have used Jira for many years (and
I have
Ruben>never used Github for issue tracking).

It is really sad to hear.
You authored 87 PRs for Calcite!
You did use GitHub for tracking already.
https://github.com/apache/calcite/pulls?q=is%3Apr+author%3Arubenada+is%3Aclosed

Here's a good example: https://github.com/apache/calcite/pull/2329
PR description says the intention of the feature.
There are overall and line-targeted comments.
Creating GitHub Issue is not that different from creating a PR.

Ruben>I have the impression that some of the problems described discussion
are
Ruben>not per se Jira-related, but they appear because we misuse the current
Ruben>tools (and it would remain more or less the same if we switched from
Jira
Ruben>to GitHub Issues).

Ruben, I claim that we split information for no reason.
What is the real meaning behind having 50% of the review comments in JIRA,
and 50% of the comments in PR?
What is the real meaning behind putting "common description in JIRA, and
putting NO description in the PR"?
Does it help to have different markup languages for JIRA and GitHub?

My main motivation is not "replace JIRA with Issues". That replacement
alone has few benefits.
The key improvement to the current process is creating PRs **with
description**, and keeping the relevant discussion in the PR itself.
I really see no point in PR comments like "let's continue in JIRA".
If a comment targets a specific PR, then why don't we just use PR's comment
fields for that?

Ruben>high level discussion (e.g. feature design) should
Ruben>happen in Jira

The vast majority of the changes are small. Adding Jira barrier for every
change is a productivity killer (for both contributors and for the
reviewers).
I don't mean just "switch between JIRA and PR".
I mean that comments are duplicated at different sites.
The description of the issue and the code changes are separated.
The notifications are duplicated (gitbox duplicates GitHub notifications).
The markup format is different.
The search is not coherent (there's no way to find issues and PRs at the
same time).
GitHub blame does not show the full picture (it shows PRs, however, opening
the JIRA is a separate step).
PR (and GitHub Issue) can be associated to a version, so it would be much
easier to relate the PR with its release version.
I think the list can go and go.

Ruben>IMO every change must have a dedicated Jira ticket

What are the benefits of "a dedicated Jira ticket"?
How "a dedicated Jira ticket" is different from "a dedicated PR"?

Ruben>high level discussion (e.g. feature design) should
Ruben>happen in Jira, low-level details (e.g. line by line code review) in
the
Ruben>PR;

Both can be in PR, so what are the benefits from splitting "low-level" vs
"high-level"?
There is a lot of cases when people discuss low-level details in JIRA.

It would be so much better, if we aligned the review approach.
Jira and GitHub are just tools, and if we don't agree to review high-level
stuff first, then no tool helps.
That implies, GitHub is just fine for high-level reviews.

Ruben>like customized dashboards, or linked issues ("issue A is blocked by /
Ruben>caused by / duplicated by / ... issue B").

GitHub allows links, and links have no properties like "caused by",
"duplicated by".
However, basic links are just enough for Calcite.

If you have a "customized dashboard" that is useful for many contributors,
please show it.
The ASF is a set of volunteers, so it is not right to enforce usage of JIRA
when only a single person needs "a customized dashboard".

Vladimir

Re: [VOTE] Move to GitHub PR-centric development workflow, phase out JIRA

2021-12-06 Thread Vladimir Sitnikov

>I think JIRA works well.

I would appreciate it if you could clarify.
What I suggest would work better than JIRA in virtually all the cases. So
why stick with JIRA?

Vladimir

[VOTE] Move to GitHub PR-centric development workflow, phase out JIRA

2021-12-03 Thread Vladimir Sitnikov

Hi,

Let us have a vote for PR-centric workflow for Calcite.

[ ] +1, let's move to GitHub PRs/Issues for change management.
[ ] -1, keep using JIRA because, 

Here's my vote:
+1 (binding), let's move to GitHub PRs/Issues for change management.

Note: dev@calcite keeps as is. I do not suggest replacing it.

I suggest the adjustment of development workflow would reduce maintenance
costs,
make contributions easier, and make reviews easier.

Note: please don't consider this as "replacing JIRA with GitHub issues".
I understand that a mere replacement of system X with system Y often brings
nothing.

See the relevant discussion thread:
https://lists.apache.org/thread/m2h13v2p7kowglj73qr4sqn1c3pm8tlq

Current state


We already allow committing changes to Git without PR, and without JIRA
ticket.
We already allow submitting PRs without JIRA.
I do not mean we encourage that, however, there's no strict rule like
"every commit must have JIRA".

The pattern I see is:
* contributors spend time figuring out if there's a bug in their system or
if there's a bug in Calcite
* then they create JIRA with all the details
* they submit a PR with the relevant change right away
* they copy the same-looking information multiple times: git commit, JIRA
description, PR description (it is often left blank though)

The discussion often separates between JIRA and PR.
PR is super-convenient for line-based comments,
on the other hand, "bigger comments" are put in JIRA (for unknown to me
reasons).
It is extremely sad when JIRA comments refer to specific bits of the code.

We even have a label "discussion-in-jira" that means "there's a discussion
JIRA, deal with it somehow".

Suggestion


Make PR-first a common approach for most of the changes.

Small changes like adding tests, fixing documentation
could come as pull requests without creating the corresponding JIRA ticket.
https://github.com/apache/calcite/pull/2628 (Add test for 'a IS NOT NULL...)
All the comments from https://issues.apache.org/jira/browse/CALCITE-4917
could be put to the PR2628 itself. There's no need for a separate "ticket"
or "issue"
since PR itself could explain the reason for the change quite well.

I suggest we put "high level" comments into the PR itself.

The same works for bigger features.
A bigger feature can appear as a draft PR, then we shape it in comments,
and merge it.
A complex feature can appear as a GitHub issue that references other issues
or PRs.

Sometimes we do not have a solution, so we might create GitHub Issue or
send dev@calcite mail for that.

In practice, the suggestion does not change much, however,
I suggest removing the duplication of work (no need to copy the information
across PRs and JIRAs),
it makes merge simpler (no need to update JIRA),
it makes searching issues easier (there would be a single field to search
for issues).


Release management


Both GitHub PR and GitHub Issue can be associated with a milestone
(~version),
so we could use it to track which release includes what.

Issue history


We could keep old tickets in JIRA and use GitHub for new issues only.
That is the most trivial approach, Airflow team did that, and we might be
ok with that.

An alternative option is to copy issues (including resolved ones).
I guess attachments would be hard to clone, however, copying issues and
comments should not be a problem.

Other


I might miss certain bits, however, I suggest figuring out the solutions
later.
I am sure GitHub covers all the cases we need, and we have dev@calcite as a
fallback.

Vladimir

Re: Enable "squash and merge" button in GitHub UI

We have many committers, and it might make sense to keep track of who
commits what.

Squash and merge behaviour changed in 2020 :(


https://github.community/t/authorship-of-merge-commits-made-by-github-apps-changed/2971/33

Vladimir

Re: Enable "squash and merge" button in GitHub UI

>I've never experienced this with other (non-Apache) projects.

https://github.com/JetBrains/kotlin-wrappers/pull/969

git show --format=full a3ff898f4f4e36d5a8bba9c1cbbefa924a9f5386

=>

commit a3ff898f4f4e36d5a8bba9c1cbbefa924a9f5386
Author: Victor Turansky 
Commit: GitHub 

:(

Vladimir

Re: Enable "squash and merge" button in GitHub UI

Thanks for the pointers. The committer info is lost indeed.
I would add the comments to asf.yaml so we can re-iterate later.

Vladimir

Re: [DISCUSS] Writing rules in dependent projects without depending on Immutables

>Has anyone else run into this problem?

I've no idea how hard it would be to solve, however, it looks like
something worth doing.
It would be sad to require everyone to add annotation processors to their
builds.

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

Julian, it looks like you have not been using GitHub for the past 5 years,
and you miss a lot of improvements.
GitHub allows a superset of what we do in JIRA, so I really can't
understand why do you suggest enforcing JIRA.

None of what you say sounds like a "good point to keep using JIRA", and
none of what you say sounds like "a pitfall of using GitHub".

>As to the backlog of PRs

I could have just merged 2628, however:
1) We spend time on discussions in JIRA
2) After PR merge, someone would have to "resolve the JIRA"

That consumes the reviewer's time.

Julian>GitHub also makes it easy to write code, submit it as a PR, and that
PR becomes the issue.

That is a great GitHub feature. I do not see how do you treat it as
something bad.
It really makes no sense to create JIRA for adding tests, updating
documentation.

Julian>Lastly, another quibble about GitHub: that it allows people to
squash, rebase and amend commits. The line-by-line comments quickly go out
of date.

squash, rebase, and amendments make JIRA comments go out of date at the
same rate.
However, in GitHub there's an explicit "resolve discussion", "reopen
discussion" features,
so there's a clear way to tell when the discussion is closed.

JIRA has no conversation threads. GitHub has threads (code-related).

Julian>GitHub makes it easy to review line by line, but you can then miss
the important stuff.

Do you mean we should stop code reviews?
If we review code, then using both "JIRA and GitHub" makes it way easier to
miss stuff.

However, I agree we should review high-level ideas first, and put
line-by-line comments
only when high-level direction is OK.

GitHub UI does have regular comments that are not attached to the diff
lines,
so you can comment there just like you add JIRA comments.

Julian>And I like making holistic reviews

What stops you from adding an overall review in GitHub?

Julian>Then discussion can happen before coding starts.

https://github.com/apache/calcite/pull/2628 is a perfect example here.
The PR is trivial, and it adds a couple of tests.

The PR looks good to me, except for method names.
However, I truly do not know how to make test names better in Java, so I
just ignore it.
In Kotlin, one could use method name with spaces, so it would be better,
however, I just ignored it.

Then Julian comments "The test case names are not very useful".
This should really belong to line-by-line diff rather than JIRA ticket.
Now everyone should navigate between JIRA and PR to figure out the meaning.

I see Julian suggests adding all the tests into a single method,
however, I think we should just merge the thing and get over it.

I am sure the discussion in CALCITE-4917 is not worth the time we spend
there.
We are discussing 2 lines of code there. Exactly two lines of test code.
It would be OK to discuss method vs class when it comes to 100 lines.

Even adding tests to JdbcTest does not have those levels of discussions.

Vladimir

Re: Feedback on a generic return type version of RelShuttle?

Steven>Look at the examples in the wikipedia page you link to

Let me please copy Wikipedia for you.
I agree it is not that easy to spot, however, please pay attention to
Car#accept method.

class Car {
...
@Override
public void accept(CarElementVisitor visitor) {
for (CarElement element : elements) {
element.accept(visitor);
}
visitor.visit(this);
}
}

The notable difference is that Wikipedia suggests that Car is in charge of
enumerating Car elements.
Of course, trivial car elements just call visitor.visit(this).
However, compound elements should call the visitor and do re-composition or
whatever.

CarElementPrintVisitor in Wikipedia does **not** enumerate Car's elements.
There's no CarElementPrintVisitor#visitElements

In other words, if we want Visitor Pattern, we need to move
RelShuttleImpl#visitChildren(RelNode rel)
into the **default** implementation of RelNode#accept(RelShuttle)

However, as I said, if we consider moving towards the proper visitor
pattern, we might consider
the approach listed in "From Object Algebras to Finally Tagless
Interpreters" article above.

Vladimir

Re: Enable "squash and merge" button in GitHub UI

I wonder if that has already been fixed.

What do you think if I temporary enable the button, try it with
https://github.com/apache/calcite/pull/2628 and then either keep the button
or add a comment to asf.yaml?

Vladimir

Re: Feedback on a generic return type version of RelShuttle?

Steven>Agree with James, and that's not even including implementations in
other
Steven>codebases that use calcite (e.g. there are dozens of implementations
in
Steven>Dremio's codebase).

Can you please show RelShuttle in Dremio codebase?
I see no RelShuttle implementations:
https://github.com/dremio/dremio-oss/search?q=relshuttle

Note: RelShuttleImpl does not count since RelShuttleImpl does not really
need RelNode cooperation.
RelShuttleImpl just enumerates children of the relations, and it does not
rely on accept(..) calls.

It can do the very same thing without calling accept(this), so we can
remove  RelNode#accept(RelNode) methods,
and we can still get the same RelShuttleImpl.
It can have a series of if (... instanceof) checks or a Map#get(getClass())
or even https://github.com/forax/exotic#visitor---javadoc

Vladimir

Re: Feedback on a generic return type version of RelShuttle?

Steven, James,

>there are 22 implementations of RelShuttleImpl(excluding tests)

14 of them are RelHomogeneousShuttle subclasses.
RelHomogeneousShuttle redirects all customized visit methods into a single
visit(RelNode) method.

That means Calcite has only 6 usages (!) of the "visitor" feature, and the
feature basically forces ALL RelNodes
implement an obscure RelNode#accept.

The method is ill-defined.
Here's its javadoc: "Accepts a visit from a shuttle"
It makes no sense at all. How should I implement this?

@return A copy of this node incorporating changes made by the shuttle to
   * this node's children

None of Calcite's RelNode implementations walk over children or something
when implementing accept(RelShuttle).



The only non-test, non-homogeneous subclasses are:

ToLogicalConverter
RelNodesExprsHandler
SqlToRelConverter#convertSelectImpl # "attach hint to the first Hintable
rel"
RelOptMaterialization#tryUseStar
ViewTable#expandView

I do not know if those classes require any of the "visitor" features,
however,
it is clear that the half-visitor is complicating things more often than it
helps.

Vladimir

Enable "squash and merge" button in GitHub UI

Hi,

I see "squash and merge" is not enabled in GitHub.
I am going to enable it soon, so we can merge PRs as single-commit changes,
and we no longer ask
contributors to "squash changes" manually.
I've no idea why the button was disabled.

See https://github.com/apache/calcite/pull/2629

Vladimir

Re: Feedback on a generic return type version of RelShuttle?

I've no issues with making the existing visitor more generic, however,
the choice of accept methods seems arbitrary :-/

Would it help if we move or prefer using algebras rather than visitors?
https://oleksandrmanzyuk.wordpress.com/2014/06/18/from-object-algebras-to-finally-tagless-interpreters-2/

Just to clarify:
1) I would like to refrain from breaking compatibility, especially if we
add an abstract method that is used in 1% of Calcite calls.
If you still want to break backward compatibility, it would be great if you
could show at least one use case for the change.
Otherwise, it would end up a breakage without even having a use case for it
:-(
No kidding. I think RelShuttle is not really useful, and we could just drop
all RelNode#accept methods and lose virtually no features.
Adding generics on top of RelShuttle is not going to heal that.

2) The current RelShuttle does not seem to add much value.
In other words, all implementations of accept(RelShuttle) method are
trivial.
The example from Wikipedia suggests:
https://en.wikipedia.org/wiki/Visitor_pattern#What_solution_does_the_Visitor_design_pattern_describe?

Wikipedia>Define a separate (visitor) object that implements an operation
to be performed on elements of an object structure
Wikipedia>*Clients* *traverse *the object structure and *call a dispatching
operation accept* (visitor) on an element

I think the naming or the implementation in Calcite is wrong.
We should either stop naming such things as RelShuttle and RexVisitor
"visitors" or we should incline to the canonical approach.

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

2021-12-01 Thread Vladimir Sitnikov

>A bigger problem around our process that I see is not enough reviews/triage
for the contributions we do receive. We have 224 open prs and I'd estimate
a median open time of 1 year+

I am sure JIRA is slowing us down here.

If we accept we do not need JIRA, PRs could become the way to go for the
reviews.

Jacques, I see you've just created
https://github.com/apache/calcite/pull/2625 (generic visitor), and you
followed the exact pattern I suggest: you filed a PR without JIRA, and you
put all the clarifications into the description.
Congratulations. It confirms JIRA adds no value to our workflow.

Your https://github.com/apache/calcite/pull/2624 (introduce
handlerbased...) is less fun. Why did you create JIRA? ;)
It could have been a JIRA-less PR as well.

As you put description to the PR itself, it becomes easier to review.

That is what I mean.



We rarely reject contribution or ideas, and, I believe, we could reduce
time-to-merge by lowering the barrier and allowing PR-first contributions.

Let me put it as follows: in many cases committers and experienced
contributors know they are doing good.
For example, I see you are moving towards AOT. That is fine, and I do not
really need all those JIRA's.
It would be so much easier to review and follow if you created a single
Issue that references all your AOT PRs.

It is not really fun to create tickets for moving to JUnit5. We could just
do it instead of spending time on JIRA comments.



>I'd estimate a median open time of 1 year+

I suggested moving "core test framework classes" from core/src/test/ into
its own test module long ago. My mistake was I sent an email instead of
just sending a PR.

The response to my mail was strongly negative, however, recently I did the
split, and it works great.

>This seems to solve a problem that doesn't exist

It does exist. My current mail highlights it right away: you could submit
JIRA-less PRs, everybody would have a single page for reviews, and so on.
It is really sad to hear "does not exist".

JIRA causes excessive mails: "jira created", "pr created".

In case we skip JIRA completely, it would be less notifications for
everybody.

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

2021-12-01 Thread Vladimir Sitnikov

Stamatis>Quite interesting information so far

One more data point: LLVM is migrating issues from their Bugzilla to GitHub
Issues right now (52K issues!)
See https://llvm.discourse.group/t/bugzilla-migration/4848

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

2021-11-30 Thread Vladimir Sitnikov

>I know people/companies who have built tools/plugins
>around JIRA for getting useful information such as which internal forks
>have this particular fix, etc

That is true. However, I do not know what should we do about it.
Of course, any change brings hidden costs, so if the benefits are unclear,
there's not much point in making changes.

On the other hand, the more I discover, the more I am sure GitHub Issues
would be great for Calcite.
I started the discussion like "it looks we are fine, but we can try
GitHub", and now I am quite
enthusiastic that moving to GitHub would bring multiple improvements in
various aspects.

>I turn to JIRA
>or GitHub and it turns out that I am able to find the information easier
>with JIRA.

GitHub issues are searchable, including comments.
We currently "enforce" JIRA tickets, so PRs lack descriptions and
explanations, so it is expected you can't find them.

Try using JIRA to figure out when we bumped Guava to 31. There's no JIRA
for that.

On the other hand, GitHub search returns PRs, code files, and commits that
match the search.
https://github.com/apache/calcite/search?q=guava+31

Then, if you know Git SHA (e.g. from "git blame"), you can find PRs related
to that SHA.
For example
https://github.com/apache/calcite/search?q=890eb61ef486e2192110cefe4cac5aa6f150=issues
Frankly speaking, I never thought of trying to search via commit SHA, but
it is a definitely cool trick I learned
when reading
https://docs.github.com/en/search-github/searching-on-github/searching-issues-and-pull-requests

When it comes to versions, GitHub PRs and issues would have "milestones" (~
"fix version"),
and when you browse commits, GitHub shows the list of tags that include
this commit.
For instance, if you open
https://github.com/apache/calcite/commit/2326cf8dc97f473d0efa2b0e4883fef87a40746b
,
you know Guava was updated to 31 since Calcite 1.28.0

>One unpleasant situation which comes up from time to time is when
>discussions develop in both places. However, migrating to GitHub issues
>will not completely solve this since we will still need to navigate back
>and forth between issue and PR

If we allow PR-first (no issue, no JIRA, just PR with the proper
description) contributions, then:
1) It will be in a single place, no need to switch
2) Navigation between issues and comments would still be easier in GitHub
than commenting like "please continue in JIRA"

GitHub allows responding to issues/PRs via mail, and I find it a nice
feature.
I don't use it often, however, I use it from time to time, and I see others
using it.

>Moreover, many Apache projects are using
>JIRA already which makes it easy to perform cross project search

GitHub has an out-of-the-box way to scope search for organizations.

Here's a search for Calcite in the Apache organization:
https://github.com/search?q=org%3Aapache+calcite

At any point in time, you can remove org:apache, and you get GitHub-wide
search.
GitHub hosts more projects than ASF JIRA, so being able to search and link
across them (all, not just ASF projects) would be useful.

>I find JIRA search superior to GitHub both when it comes to approximate
>text search and exact filters

I might be wrong here, however, I think we do not really use JIRA metadata.
In most cases, we use only the ticket title, description, and the fix
version.

There's "affects version" in JIRA, and I do not know if GitHub has a
one-to-one replacement for that.
However, I would be fine if the affected version was in the issue/PR
description.

"component" (~core, linq4j, cassandra, ...) could become a label in GitHub.
AFAIK the limitation is that labels can only be changed by committers or
collaborators.
We can mitigate that by:
a) Adding collaborators (via asf.yaml)
b) Adding committers
c) Adding GitHub Actions that label Issues based on descriptions or even
PRs based on the set of touched files.

GitHub's search features are more than enough for us. We have no fancy
metadata for complicated searches :)

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

2021-11-30 Thread Vladimir Sitnikov

>I think the easiest thing is to focus on the
>"workflow" for each "persona" we have

While I agree the questions you raise are valid, I would say, that moving
to GitHub Issues
does not really change the workflow except in certain small cases.

I doubt there's an exhaustive list of answers for JIRA.
No way I want to bomb you with answers, however, I truly believe replacing
JIRA with GitHub is not really disrupting for Calcite.

On the other hand, moving to GitHub issues would likely bring new
contributors and reviewers.

>Writing down what you believe to be "allowed" and making sure everyone
>else is on the same page is perfect.

https://calcite.apache.org/develop/#contributing is more or less relevant
if you replace JIRA with GitHub Issue.

>what would a contributor
>do on Github to provide a patch for a bug they found?

Like with the current workflow: they file a PR.
They don't have to file a JIRA first. They just file a PR that explains the
problem and suggests the fix.

>What about a contributor reporting an issue?

a) If they are not sure, they can file an issue at GitHub
b) They can ask on a dev@calcite mailing list
c) If they can't understand how to use Calcite APIs, they could file GitHub
Discussion (if we enable it).
It might be useful to distinguish "bugs in Calcite" vs "user
questions regarding the way to use Calcite".
d) If they have a reproducer, they could just file a PR that reproduces the
issue. It would immediately
show up as a build failure, and everybody would be able to see how it
breaks.

>How should a developer structure a "big"
>change to Calcite which might contain multiple commits?

a) They can file PR that contains several commits (like we currently do
sometimes)
b) They can create an issue and reference the relevant PRs/issues in the
description

"tasks lists" is useful here:
https://docs.github.com/en/issues/tracking-your-work-with-issues/about-task-lists

^^ the above answers are pretty much the same as our current workflow
except JIRA is replaced with GitHub issues.

>Where does a
>release manager look to determine if a release is ready (all necessary
>issues are fixed)?

Typically, when planning a release, we ask everybody to figure out which
issues are "blockers" for the release.
With GitHub, we can associate the relevant issues/PRs with the milestone.

Here's how "open items" look in Airflow:
https://github.com/apache/airflow/milestones/Airflow%202.2.3
The release manager opens the milestone, and they see if it is all ready.

Vladimir

Re: calcite-core 1.28.0 now depends on kotlin-stdlib?

>since core/src/kotlin does not exist yet.

I mean core/src/main/kotlin does not exist yet

Vladimir

Re: calcite-core 1.28.0 now depends on kotlin-stdlib?

If it is a transitive dependency, then it is a transitive dependency.
Otherwise it is a bug.
I'm sure calcite-core itself does not use kotlin stdlib since
core/src/kotlin does not exist yet.

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

>All that said, this was a number of years ago

I agree. GitHub makes noticeable progress nowadays.

>I'd want to see that demonstrated before
>changing all of Calcite over.

How do you want that demonstrated?
Apache Airflow does use GitHub Issues, they do ship releases.

Many top-starred Apache repositories use GitHub issues:
https://github.com/orgs/apache/repositories?qstargazers

Do you think Calcite has something unique in the release process that makes
GitHub infeasible?

Apparently, issues and PRs can be assigned to a milestone.
Apparently, committers can close the milestone.

>I'm a little confused in your original message, you have two questions.
>Are you suggesting that Calcite would use both Jira and Github issues?

No way. I think we should move all issue management to GitHub.
I just thought that questions like "should we keep JIRA" or "should we move
to GitHub" might
provoke slightly different discussions with helpful comments even though it
is basically the same question.

>Should PRs always have issues or is a PR sufficient
>for inclusion in a release?

I believe, the current development workflow allows:
* PR without JIRA
* commits without PR
* commits without PR and JIRA (just commits)

It is up to the committer if the change is significant enough to have JIRA
or PR or whatever.

>GH issues created a black hole where anyone could create issues and
>they could just get "lost"

Nothing prevents the issue from being "lost" in Calcite JIRA :-/

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

>I haven't found navigating JIRA + GitHub PRs to be a significant
>problem, but if others think it would help

GitHub has a single search field, so you can search for both PRs and issues
at the same time which is nice as well.

>will my GH comments end up as coming from @michaelmior

If we migrate issues, then they will be created by a "bot" account.
Of course, we can add comment author in the comment itself.
Here's how Coq migrated their issues: https://github.com/coq/coq/issues/2268

We can figure out which accounts do we want to use (e.g. create email ->
account mapping file).

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

Here's Airflow vote thread (~2020):
https://lists.apache.org/thread/s5nk0mgblkzq9jrqgrvp12yn4kmy0o7b
here's the discussion:
https://lists.apache.org/thread/6lfth3o4gzk3s891bcqy045z5gl4fs07

I believe draft PRs could avoid "DISCUS" threads that suck time from
everybody and those DISCUSS threads add nothing to the code :-/
For instance, Julian has recently launched "Refactoring tests".
I think it would be just fine if Julian pushed a draft of the intended
changes, we just added +1 and went forward.

Just to clarify: the whole idea of moving to GitHub issues for Calcite
started when
Apache JMeter team started a discussion regarding moving from Bugzilla to
GitHub Issues.

Currently, I think that JIRA for Calcite is like learned helplessness.
Everybody knows "JIRA is required", however, nobody questions that.

I do not say JIRA is bad. I just say 99% of code changes happen in GitHub,
so unifying issues and PRs would be a big plus.



Airflow team mentions that having a JIRA issue for each PR adds unnecessary
overhead.
I believe in many cases we could be just fine if we have everything right
in the PR itself.

For example, do we really need to create JIRA issues like "[CALCITE-4836]
Upgrade protobuf-java 3.6.1 -> 3.17.1"?
I believe filing a PR is just enough

In many cases, having a proper comment in PR helps *A LOT* for the reviewer.
https://github.com/apache/calcite/pull/2615 is a perfect example.
PR 2615 does have a description that clarifies the reason for the change.
It is just fine without JIRA.

In many cases, I tend to copy descriptions between JIRA and PR because it
does help for the review:
it is much faster to switch between "conversation" and "files changed" tabs
than to load the corresponding JIRA.
Here's an example: https://github.com/apache/calcite/pull/2410

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

>Mark the fix versions of each JIRA issue. It is visible for users to
>find which version contain this functionality or fix this bug.

GitHub has "milestones".
See how it is used in Airflow:
https://github.com/apache/airflow/milestone/45?closed=1

>2. Watch the interested issue so I could receive email once this issue
>has been changed.

With GitHub you can subscribe to the issues or PRs you like, and you get
the notifications.

>3. Relate the current JIRA issue with it's related JIRA.

Here's an example: https://github.com/apache/airflow/issues/19788

Note how the following appears in the #19788 (so cross-links like
Issue-Issue, Issue-PR, PR-PR, and so on work)  issues:
>uranusjr mentioned this issue 6 days ago
>Adjust built-in base_aws methods to avoid Deprecation warnings #19725

Here's how the mention look like:
https://github.com/apache/airflow/pull/19725#issuecomment-977078128

Note: GitHub has a beta program for Projects at the moment which would
improve
issue management way further:
https://docs.github.com/en/issues/trying-out-the-new-projects-experience/about-projects

Vladimir

Re: [DISCUSS] JIRA vs GitHub Issues

2021-11-28 Thread Vladimir Sitnikov

We can copy all the issues, including resolved ones.

We can import issues. GitHub has api for importing without notifications.
Infra team can execute such a script (we need to find it)

Vladimir

[DISCUSS] JIRA vs GitHub Issues

2021-11-28 Thread Vladimir Sitnikov

Hi,

What do you think of moving to GitHub Issues for Calcite?

Currently, my (Calcite) development workflow focuses on pull requests.
That is all contributions I see come via pull requests.

At the same time, issues are hosted in JIRA, which creates friction: PR
merging requires changes to both GitHub and JIRA.
I guess it would be easier if issue management was in GitHub as well.

Then, I believe, co-locating issues and PRs would make external
contributions easier.
The links between Issues and PRs would be easier to navigate.

There's a possibility to enable GitHub discussions as well (see
https://github.com/apache/airflow/discussions ).
Co-locating Issues, and Discussions look promising.

I think it is not that painful, however, GitHub has a limitation of 20
collaborators (non-committers who want to assign, edit, close issues, and
pull requests):
https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-AssigningexternalcollaboratorswiththetriageroleonGitHub

I have no strong opinion, however, I just realized having issues in GitHub
would ease friction.

Any thoughts?

Just to gather opinion:
-1..+1 keep using ASF JIRA for issue management
-1..+1 migrate to GitHub Issues

Here's my vote:
+0.5 keep using ASF JIRA for issue management. It more-or-less works,
however, merging PRs and navigating from PR to issue is far from perfect.
+1 migrate to GitHub Issues. It would simplify navigation between issues
and PRs, and I believe it would reduce friction for external contributors.
In theory, GitHub Issues can be automated with Actions (e.g. labels). I'm
not sure if we need that, however, it might be useful.

Vladimir

Re: [DISCUSS] Apache Calcite Online Meetup January 2022

2021-11-18 Thread Vladimir Sitnikov

>Are the people willing to give a talk around Calcite?

I may present a talk regarding Gradle, and the upcoming migration to the
idiomatic structure:
https://issues.apache.org/jira/browse/CALCITE-4832
https://github.com/apache/calcite/pull/2566
https://docs.gradle.org/7.3/userguide/structuring_software_products.html

>What do people think in terms of timing?

I think the new build structure is more like 30-45 min talk

Vladimir

Re: DISCUSS: merge calcite-avatica and calcite repositories

2021-11-17 Thread Vladimir Sitnikov

Jacques>Based on this, I would suggest we do a small
Jacques>modification around CI on Calcite whereby pre-commit tests do not
include
Jacques>running against Avatica main and instead we have a post-merge CI
job that
Jacques>monitors the status of connecting the two

Suppose PR verification passes.
Suppose I merge the PR.
Then "post-merge CI job fails".
Who resolves the issue?

I'm strongly against moving the burden of such maintenance to committers.
The newly added code should not contain known bugs.

Julian>CI of Calcite master against Avatica master SHOULD pass but not MUST

The issue is that people rarely watch for the status of "allowed failures".
The PR verification either succeeds or fails.
If it succeeds, then you are done.
No one would and no one should open "non-mandatory verifications" each and
every time.

It is like @Disabled tests. Once the test gets ignored, people rarely
revisit it,
even if they promise to heal it in a couple of weeks (see CALCITE-3457).

So the earlier we detect the incompatibility, the easier it is to fix.

Jacques>My high level observations

The survey is biased, so are the outcomes.

There's a question regarding "separate release cycles".
How that question contributes to "should we merge ..."?

Why don't you ask the other questions raised in the thread?

For instance:
* How many Avatica versions should Calcite support?"
* Have you ever used source release artifacts (*.tar.gz) outside of
"release voting"? "never", "Calcite", "Avatica", "other ASF projects"
* Would you prefer voting and testing on Calcite and Avatica separately or
would you prefer to have a single release candidate?
* Is it important to be able to commit Avatica changes while Calcite is
being released?
* Would you prefer having different release processes for Avatica-java and
Calcite, or would you prefer to have a single sequence to release both of
them?
* Is it important to be able to open both Avatica and Calcite in a single
IDE for making changes and running tests? "nice to have", "don't care",
"not important"
* Is it important to be able to file PRs touching Avatica and Calcite at
the same time? "nice to have", "don't care"
* Is it important to have the same CI and build scripts among Avatica and
Calcite?: "same scripts please", "don't care", "different build scripts (or
even build systems) are fine"
* Is aligning test framework code between Avatica and Calcite (e.g. junit,
hamcrest versions) important? "nice to have", "don't care"
* Is it important to have Avatica and Calcite as separate artifacts at
dist.apache.org? "nice to have", "don't care"
* Is it important to be able to separate issue and PR notifications for
Avatica and Calcite?

And so on.

Vladimir

Re: [DISCUSS] Refactoring tests

2021-11-16 Thread Vladimir Sitnikov

Jacques>This sounds like it will mean we will need to make calcite-core
test artifacts available

Test artifacts publication is yet another anti-pattern just like "base test
class".
This change has been discussed:
https://lists.apache.org/thread/fz96p94h016p11g777otqntjxg2oxgh1

If you want to depend on a class from tests, consider moving it to /testkit
module:
https://github.com/apache/calcite/tree/0899e6c157632ba1c5369a942cfe2be15fb4ed9f/testkit

Jacques>We should think about the rules around Kotlin

What happens in calcite-core/tests stays in calcite-core/tests :)

It is reasonable to assume that testkit module would have dependencies,
and testkit would provide API that is usable from Java and other JVM
languages.

In that regard, Kotlin dependency in testkit is not much different from
Quidem or commons-lang3.
Consumers might use Quidem if it fits just like they could use Kotlin if it
fits.

Vladimir

Re: [DISCUSS] Refactoring tests

2021-11-16 Thread Vladimir Sitnikov

Julian,

Thanks for taking time, and working on this.

"Fixture" is a too broad term, and I do not mean we should replace it
completely with AssertJ.

What I mean is that AssertJ allows creating *discoverable* assertions.
Hamcrest assertions are hard to create and hard to discover in IDE.

---

Re test code implementations, we junit5 extensions would likely be useful.
For instance


@ExtendWith(CalciteParserExtensions.class)
@WithLex(BACK_TICKS)
class MyParserTest {
   void invalidSqlFails(SqlParserFixture sql) { // <-- injected by extension
.

In other words, it does not mean fixture (whatever it means) must be 100%
junit5.
What I mean, junit5 extensions could simplify writting calcite tests by
making fixture confuguration declarative, and injecting the configured
object as test method argument.

Vladimir

[jira] [Created] (CALCITE-4889) Double join is created for NOT IN

2021-11-16 Thread Vladimir Sitnikov (Jira)

Vladimir Sitnikov created CALCITE-4889:
--

 Summary: Double join is created for NOT IN
 Key: CALCITE-4889
 URL: https://issues.apache.org/jira/browse/CALCITE-4889
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.28.0
Reporter: Vladimir Sitnikov


The following queries yield several joins in the plan.
I think double joins are excessive here, especially for the first case where 
all the values are non-nullable.

{code}select * from "scott".emp where (empno, deptno) not in ((7369, 20), 
(7499, 30));{code}

{noformat}
select * from "scott".emp where empno not in (null, 7782);

EnumerableCalc(expr#0..12=[{inputs}], expr#13=[0:BIGINT], expr#14=[=($t8, 
$t13)], expr#15=[IS NULL($t12)], expr#16=[>=($t9, $t8)], expr#17=[AND($t15, 
$t16)], expr#18=[OR($t14, $t17)], proj#0..7=[{exprs}], $condition=[$t18])
  EnumerableMergeJoin(condition=[=($10, $11)], joinType=[left])
EnumerableSort(sort0=[$10], dir0=[ASC])
  EnumerableCalc(expr#0..9=[{inputs}], proj#0..9=[{exprs}], EMPNO0=[$t0])
EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
  EnumerableTableScan(table=[[scott, EMP]])
  EnumerableAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)])
EnumerableValues(tuples=[[{ null }, { 7782 }]])
EnumerableSort(sort0=[$0], dir0=[ASC])
  EnumerableCalc(expr#0=[{inputs}], expr#1=[true], proj#0..1=[{exprs}])
EnumerableValues(tuples=[[{ null }, { 7782 }]])
{noformat}

{noformat}
select * from "scott".emp where (empno, deptno) not in ((1, 2), (3, null));

EnumerableCalc(expr#0..14=[{inputs}], expr#15=[0:BIGINT], expr#16=[=($t8, 
$t15)], expr#17=[IS NULL($t14)], expr#18=[>=($t9, $t8)], expr#19=[IS NOT 
NULL($t11)], expr#20=[AND($t17, $t18, $t19)], expr#21=[OR($t16, $t20)], 
proj#0..7=[{exprs}], $condition=[$t21])
  EnumerableMergeJoin(condition=[AND(=($10, $12), =($11, $13))], 
joinType=[left])
EnumerableSort(sort0=[$10], sort1=[$11], dir0=[ASC], dir1=[ASC])
  EnumerableCalc(expr#0..9=[{inputs}], proj#0..9=[{exprs}], EMPNO0=[$t0], 
DEPTNO0=[$t7])
EnumerableNestedLoopJoin(condition=[true], joinType=[inner])
  EnumerableTableScan(table=[[scott, EMP]])
  EnumerableAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0, 
$1)])
EnumerableValues(tuples=[[{ 1, 2 }, { 3, null }]])
EnumerableSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
  EnumerableCalc(expr#0..1=[{inputs}], expr#2=[true], proj#0..2=[{exprs}])
EnumerableValues(tuples=[[{ 1, 2 }, { 3, null }]])
{noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Re: DISCUSS: merge calcite-avatica and calcite repositories

2021-11-12 Thread Vladimir Sitnikov

Josh>I am still upset from the porting of Avatica to Gradle

Please migrate Avatica to whatever build system you like.

I disagree with you a lot, however, I stop commenting as I do not see it
might change your mind.
I'm not touching Avatica anymore.

me>I am going to delete/ignore the offending tests in Calcite in 24 hours

Done. I commented out the test.
I hope we no longer add tests to Caclite that verify low-level Avatica
behaviour.

Vladimir

Re: DISCUSS: merge calcite-avatica and calcite repositories

s/John/Josh/

sorry for the typo.

Vladimir

Re: DISCUSS: merge calcite-avatica and calcite repositories

Josh>because I genuinely do not see how this approach
Josh>creates more code that needs to be written than if the projects were in
Josh>the same repository.

Compare
https://github.com/apache/calcite-avatica/blob/4cf769f8ee32bb3520604e4f3284e6f233f90276/build.gradle.kts
and
https://github.com/apache/calcite/blob/6d51d277b158ff7073f29ada4a96a7a74c0b46fc/build.gradle.kts

Those are duplicate files, and I had to create them twice.
If the code was co-located in a single repository, then a single copy of
the file would be enough.

John>(I am to assume) is a personal motivation for yourself.

Everybody who contributes to the build scripts, dependencies, etc, would
want to avoid duplication of their work.

John>Avatica 1.17.0 was released Jun 2020, 1.18.0 in May 2021. Calcite had 3
John>releases of its own between these.

In other words, Avatica could have three releases **for free**
Of course, you might argue there's "non-trivial procedure for releasing
Calcite",
however, I would claim it does not have to be that way.

"lots of manual steps for the release" add zero to the quality of the
release.
Of course, we might disagree here as well, however, if releasing Calcite is
painful,
then "forking Avatica into a different repository" is a wrong solution.

John>I can still build Avatica with Gradle 6.8.1.

Go ahead and try building Avatica with Java 17.
You will fail quite fast. Have a fun day.

Vladimir

Re: DISCUSS: merge calcite-avatica and calcite repositories

Josh>I have no objections to combining these two repositories together.

Why don't we just merge the repositories and move on then?

Josh>If we identify these problems, we can think
Josh>through (from the beginning) if combining repositories is the only
Josh>solution or just one possible solution (and weigh the pros and cons for
Josh>each).

I am sure I did explain the pain points. See points 1..5 (the very
beginning of the mail) in the very first message in this thread.
If that is not enough, I can't really explain better.

CI configuration is duplicated across repositories.
JUnit versions maintenance overheads are duplicated.
Build configuration is duplicated (e.g. code style rules, boilerplate build
scripts, etc)
Cross-repository PRs are non-trivial.
And so on.

If you have suggestions on how to bump JUnit and CI configuration in both
repositories at once,
you are welcome, however, the only alternative "solution" I expect is to
ignore the problem
and treat it as "not a problem".

I acknowledge merging repositories will make some use cases harder.
However, I am sure points 1..5 (see the very first mail) have no solution
besides repository merge.

It is clear that having the repositories separate has no real benefits.
Community is the same, committers are the same, language is the same,
releases are co-located.

There are theoretical reasons like "different repositories allow different
committers in theory" or
"different repositories allow different release schedules" or "because
people perceive that they can use component A without using component B"
However, I am sure those "reasons" have no weight. They are theoretical,
and some of them turned out to be false.
For instance, the release schedule is co-located across several years.
Even at the time of the Avatica split, somebody said that Solr community
did **merge** repositories after a while
exactly because the committers were effectively the same.

Of course, somebody might **believe** that "because people perceive that
they can use component A without using component B"
is way more important than a non-important duplication of maintenance of
CI, dependencies, build scripts across both repositories.

You know what? I am fine to accept people might have those beliefs.
If so, I just let them do the maintenance.

Of course, you might claim that your ability to watch "avatica PRs only" is
far more important.
Is that really the key point of having separate repositories? Really?
GitHub has "code owners" and "labeling" features, so you can watch the
relevant parts of the repository.

Josh>These repositories are separate because they have separate release
Josh>schedules

The release dates for Calcite and Avatica are co-located (see my mail
above).
The feature of "separate release schedules" is not really used in practice.

Having the same version number for Calcite and Avatica would event make it
easier for clients to pick the compatible versions.

Josh> The net amount of code you have to write doesn't
Josh> change with how it works now compared to how you are suggesting it
Josh> should work.

This is false. For instance, "CI configuration, JUnit versions, build
scripts" are **duplicated** across repositories,
so if the repositories are different, then the amount of work is increased.

Here's one more data point: Calcite is built with Gradle 7.3, and Avatica
is stuck with Gradle 6.8.1
Of course, the duplication of work does not exist.

You say "the net amount of code I have to write doesn't change".
You are right. I won't contribute to calcite-avatica repository anymore, so
there's no duplication of work for me.

Josh>If there are breaking changes in Avatica, they would need to be
Josh>accounted for when Calcite is bumped to the next version of Avatica.

calcite-avatica/pull/161 would be way easier if both avatica and calcite
were in a single repository.
There would be no need to think of version compatibility, etc.

Josh> Having a separate repository makes it
Josh>extremely easy for me to watch Avatica commits/pull-requests which I am
Josh>capable of reviewing vs. Calcite pull-requests which I am not
Josh>comfortable reviewing.

Have you considered mail filters and/or code owners
https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners
?
There's an option to automatically label PRs according to the modified
files via GitHub Action.

Josh>rather than starting from the root problem "How can we
Josh>make co-dependent changes easier?"

"co-dependent" changes are not the only problem.
It was just a trigger.

Vladimir

Re: DISCUSS: merge calcite-avatica and calcite repositories

Ok, I did not care when Avatica was split into its own repository, and I
assumed Avatica team would maintain the code.
That turned out to be not quite true, and it turns out there are
non-trivial overheads coming from the split.

Of course, everybody can disagree, so if you disagree, you don't need to
explain that. We just disagree and that is it.

Now I do not care how Avatica would evolve further.
It might be even good because Vladimir S won't bother Avatica with mails,
PRs, and other random items.


What I care much more about is that Calcite tests are broken, the CI
reports failure for every commit, and that is really insane.
Calcite CI must not fail on every build.

I am going to delete/ignore the offending tests in Calcite in 24 hours, and
I let the others deal with the case (fix tests, fix Quidem, merge Avatica,
or whatever).

I would no longer review calcite-avatica PRs.



Julian>I was release manager for Calcite and Avatica this time around and
it took me about a week

Julian, as the repositories are merged, there will be a **single** release
process that releases both Calcite and Avatica in a single go.
It would reduce the friction, and it would reduce cases where the release
procedures and steps are "slightly different".

It looks like you assume the release steps would never change, however, in
practice, the drift of Java, Docker, OS, and so on versions would
affect the release steps anyway.

That is why having Avatica and Calcite in a single repository would
simplify the release procedure.

Julian> And now we’re going to change the process again?

I propose to *drop* the release process for "calcite-avatica-java". The
release of "calcite" would release both things (in a single
calcite-x.y.z.tar.gz file),
so the release steps for Calcite would be intact.

Technically, that changes the process.
In practice, I find that you overreact. The merge would make the release
easier.

On top of that, it was you who voted for splitting Calcite and Avatica, so
it was your conscious decision to accept that "release management would get
harder after the split"

Julian>Any supposed “time savings” due to the merged repositories will be
eaten up by dozens of small issues that will crop up long after the merge

The words are not justified. There will always be some random small issues,
even in the case we keep repositories separate.
On the other hand, there are true issues when repositories are different:
* We have two different release processes, so RM have to know the
differences
* we have slightly different dependency versions, so things like "bump
junit" have to be duplicated
* CI configuration is duplicated across the repositories, and it does have
non-zero maintenance overhead
* cross-repository refactorings are harder to do (like the fix for the
exception message)

Julian, I did put four issues that are real, and they induce noticeable
maintenance churn. Java and JUnit will keep updating and you can't prevent
that.
Please highlight the problems exactly and please stop saying "time savings
will be eaten by dozens".
That is not a healthy discussion.

>Also, we will lose the history of the Avatica source code. All of the
commit hashes for bugs fixed long ago will change

We can make calcite-avatica repository read-only, so all commits will be
there.

Vladimir

Re: DISCUSS: merge calcite-avatica and calcite repositories

2021-11-09 Thread Vladimir Sitnikov

Michael> is not proposing to change the
Michael>structure of modules within both projects, merely to have the code
for
Michael>both in a single repository.

I propose to integrate them into a single build, and keep the set of the
published jars.
However, the modules and dependency structure could be kept as is.

We might want to rename folders like
calcite-core
calcite-linq4j
..
avatica-core
avatica-server

However, I am not sure it is that important to discuss folder names now.
The idea is that as you "open Calcite in IDE, you see both Avatica and
Calcite modules"

Michael>Is there any reason we couldn't have a separate release schedule if
Michael>both projects are in the same repository?

A different schedule means Calcite must support at least two different
Avatica versions.
In other words, if we allow clients to update Avatica at their will, then
we should allow them building Calcite with different Avatica versions,
which implies Calcite test code should succeed for multiple different
Avatica vesions.

That makes it harder to write tests: we have to execute tests with two
different Avatica releases (or even more than two).

There are at least two sources for complexity:

a) We have to write tests that tolerate multiple versions. For instance,
"if (avatica.18+) {...}" and so on.
That is not really trivial, especially taking into account some of the
tests are created with non-yet-popular
technologies like Quidem where you can't really google solutions. So the
"trivial" task of "making a test to expect two possible outcomes"
becomes less trivial as you try to pass the version from GitHub Action to
Gradle to JUnit to Quidem to no-one-knows-which class.
If we support one Avatica version only, that is not needed. We just patch
the test in Avatica and Calcite and that is it.
Single repo avoids "Gradle vs Quidem" dance.

b) If we claim that we support 5 different Guava versions, 3 different JDK
versions, 2 different Avatica versions,
then we have to execute 5*3*2 = 30 combinations of the tests.
That is not really a full matrix, however, things get way easier if we
support one Avatica version only.
The amount of tests we need to do during a proper release is much less, and
it is easier to commit
changes that touch Avatica and Calcite at the same time.


Vladimir

Re: DISCUSS: merge calcite-avatica and calcite repositories

Francis>I remembered joining this list a few years back then

Francis>- Calcite - SQL engine used to build the SQL part of your database.

The things is "apache/calcite" is tons of features as in "thing that
converts SQL to MongoDB" or "thing that executes SQL queries over InnoDB
data files",
or "thing that validates SQL", or "linq4j".
There are lots of features, and there are lots of artifacts.
The set of modules is not perfect, and there are cases when somebody needs
"just optimizer",
and there are valid reasons to hate "Enumerable" or "dynamic bytecode
generation".

However, it looks like having Avatica in the same repository does not hurt
much as long as the other features do not impact Avatica features much.

Francis>Each should be able to stand up on its own and have its own
independent release schedule

Having a different release schedule means we must support at least two
different Avatica versions in Calcite.
That adds overhead on Calcite development as we have to make test code to
account for different Avatica.

Francis>I've seen PRs that need to touch both repositories for
Francis>a change and it seems like a code smell to me

Even though testing the "exact exception message" at Calcite side looks
weird, it is important to test
how the exception would look like for the end-user, so there must be an
integration test that ensures
Avatica produces proper exceptions, and Calcite does not make them bad
(e.g. by ignoring the message).

Francis>own independent release schedule

For now, Calcite Avatica is stuck with Gradle 6 since it uses "dependencies
between test source sets".
In other words, test classes in calcite-avatica/server depend on test
classes in calcite-avatica/core.
That is an anti-pattern, and the resolution is to move the common test
classes into a proper artifact with its metadata and dependencies.

This is something I fixed in Calcite in
https://issues.apache.org/jira/browse/CALCITE-4821

Here's the question: who would do the relevant work for calcite-avatica?

Vladimir

Re: DISCUSS: merge calcite-avatica and calcite repositories

I know the below is too wordy, however, English is not my native language,
so I tend to overexplain and duplicate thoughts.

Julian>To allow separate communities

This is something I do not understand.
Let me be explicit: I am OK to maintain bits of avatica code when Calcite
fixes overlap with Avatica for some reason (e.g. throwing exceptions).
I am not OK with duplicating the same work just because someone else wants
to have a separate repository.

I was maintaining and releasing multiple (four!) repositories for pgjdbc
when it was using Maven
(Maven was not able to release jre6, jre7, and jre8 artifacts from within a
single repository), and it was a big relief when the code was recollected
into a single repository again.

That is why I am advocating for merging avatica code back.

Julian>Splitting code into modules makes it easier to continue to splitting

I agree it is worth having calcite-core and avatica as separate jar
artifacts.
It is even worth splitting "enumerable" out of calcite-core into its own
module.
However, I do not see how splitting the repositories helps. I do not expect
somebody to use calcite-avatica as a git submodule or something like that.

Julian>to allow separate release schedules

Here are the recent releases.
There's often a pattern that "Calcite release follows Avatica release".
It is even mentioned in the mails where someone suggests releasing Avatica,
then releasing Calcite".

A notable exception was rel/avatica-1.16.0 which was not connected to the
release of Calcite.
The rest Avatica releases seem to be quite close to Calcite releases, so I
do not see what do we gain from "separate release schedules".
It looks like co-locating the release would be easier for both maintenance
overheads, testing, and consumption.


2016-03-16 rel/calcite-avatica-1.7.0

2016-03-17 calcite-1.7.0

2016-03-15 rel/calcite-avatica-1.7.1

2016-06-01 rel/calcite-avatica-1.8.0

2016-06-08 calcite-1.8.0


2016-09-17 calcite-1.9.0


2016-10-07 calcite-1.10.0

2016-10-27 rel/calcite-avatica-1.9.0


2017-01-03 calcite-1.11.0

2017-03-20 calcite-1.12.0

2017-05-23 rel/avatica-1.10.0

2017-06-22 calcite-1.13.0


2017-09-27 calcite-1.14.0

2017-12-05 calcite-1.15.0

2018-03-06 rel/avatica-1.11.0

2018-03-14 calcite-1.16.0

2018-06-21 rel/avatica-1.12.0

2018-07-16 calcite-1.17.0

2018-11-29 rel/avatica-1.13.0

2018-12-18 calcite-1.18.0

2019-03-20 calcite-1.19.0

2019-04-25 rel/avatica-1.14.0

2019-05-09 rel/avatica-1.15.0

2019-06-19 calcite-1.20.0


2019-09-06 calcite-1.21.0

2019-12-19 rel/avatica-1.16.0


2020-05-23 calcite-1.23.0

2020-06-22 rel/avatica-1.17.0

2020-07-23 calcite-1.24.0
2020-08-22 calcite-1.25.0

2020-10-06 calcite-1.26.0


2021-05-18 rel/avatica-1.18.0

2021-06-03 calcite-1.27.0

2021-10-04 rel/avatica-1.19.0

2021-10-19 calcite-1.28.0

Julian>The last of these was particularly on my mind when we split Avatica from
Calcite.
Julian>I was pleased to see, for example, Apache Phoenix using Avatica
successfully
(including building an ODBC driver)

Does it really help Phonenix that we split repositories?
I am sure they should not really depend on the repositories.
How would their life be harder if we release the same set of artifacts and
jars from a single source tree?

Julian>If Avatica had remained part of Calcite, both written in Java and
using the same build system and release process,
Julian>it’s less likely that Avatica-go would have happened

Does Avatica-go use calcite-avatica repository?
So far it looks like Avatica-go uses
https://github.com/apache/calcite-avatica-go, and I do not suggest merging
it into calcite repository yet.
I do not see how merging calcite-avatica and calcite repositories would
block the evolution or emergence of avatica-go, avatica-rust, etc.

Julian>release process

I identified https://issues.apache.org/jira/browse/CALCITE-4881 "Calcite
release tags should have rel/... prefix as per the ASF requirements"
It looks like an unintentional breakage in
https://github.com/apache/calcite/commit/2e30293af7373b6c5fbcc5fa6505b49df2fba000
caused
by the split.

Vladimir

[jira] [Created] (CALCITE-4881) Calcite release tags should have rel/... prefix as per the ASF requirements

2021-11-08 Thread Vladimir Sitnikov (Jira)

Vladimir Sitnikov created CALCITE-4881:
--

 Summary: Calcite release tags should have rel/... prefix as per 
the ASF requirements
 Key: CALCITE-4881
 URL: https://issues.apache.org/jira/browse/CALCITE-4881
 Project: Calcite
  Issue Type: Improvement
Reporter: Vladimir Sitnikov


As per "Git status update" mail from Infra, release commits should be tagged 
with rel/... tags which are protected at the Infra side so they can't be 
overwritten, removed, etc.

See sample: ACCUMULO-4106



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Re: DISCUSS: merge calcite-avatica and calcite repositories

Currently CI for avatica verifies building with calcite/master, and vice
versa.
Otherwise we won't notice integration issues before we release.

Julian suggested that we should support one avatica version only. I agree,
especially of that means we merge the repositories back.

>  I think it is a mistake to build against master

We expect certain level of backward compatibility, so there are reasons to
build with latest commits.

However, if the versions are always aligned in practice, then having a
single repo seems reasonable.

One more solution is to disable/remove the offending test in Calcite.

Merging the repos still make sense for me, hovewer, I might miss some use
cases.

Vladimir

DISCUSS: merge calcite-avatica and calcite repositories