[jira] [Created] (CALCITE-6112) Use inedible release tags
Vladimir Sitnikov created CALCITE-6112: -- Summary: Use inedible release tags Key: CALCITE-6112 URL: https://issues.apache.org/jira/browse/CALCITE-6112 Project: Calcite Issue Type: Improvement Reporter: Vladimir Sitnikov The ASF has recommended using inedible Git tags since 2016: https://lists.apache.org/thread/szbtzk0048ppx1zvzljbrq7by2mt1zvs It turns out that has broken in Calcite since 2020: https://github.com/apache/calcite/commit/2e30293af7373b6c5fbcc5fa6505b49df2fba000 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6019) Excessive cast when UDF receives nullable JavaType
Vladimir Sitnikov created CALCITE-6019: -- Summary: Excessive cast when UDF receives nullable JavaType Key: CALCITE-6019 URL: https://issues.apache.org/jira/browse/CALCITE-6019 Project: Calcite Issue Type: Bug Components: core Reporter: Vladimir Sitnikov See CALCITE-6018, however, CALCITE-6018 is about the "ability to declare nullable, non-optional parameter" while this issue is about the excessive NOT NULL cast. Just in case, adding {{@Strict}} to {{retainedSize}} does not make CAST go away. It causes weird "cast(... NOT NULL)" even though my {{retainedSize}} function processes null just fine. {noformat} expr#4=[TO_HEAP_REFERENCE($t3)], expr#5=[CAST($t4):JavaType(class java.lang.Object) NOT NULL], expr#6=[retainedSize($t5)], {noformat} {{TO_HEAP_REFERENCE}} returns nullable {{HeapReference}} (Java class): [https://github.com/vlsi/mat-calcite-plugin/blob/4d4aa2284eeec69bc51da0c2e769ded06ef9ab97/MatCalcitePlugin/src/com/github/vlsi/mat/calcite/schema/objects/HeapOperatorTable.java#L27-L28] {{retainedSize}} receives a single (nullable) {{Object}} parameter, and it is declared as [https://github.com/vlsi/mat-calcite-plugin/blob/4d4aa2284eeec69bc51da0c2e769ded06ef9ab97/MatCalcitePlugin/src/com/github/vlsi/mat/calcite/functions/HeapFunctions.java#L129] I tried adding {{@Parameter(optional=true)}} to {{{}retainedSize{}}}, and it does not eliminate {{NOT NULL}} part of the cast. My expectation is that NOT NULL must not be there as both return type and the parameter type are nullable. It might be cast should not be there as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6018) Support nullable parameters in UDFs
Vladimir Sitnikov created CALCITE-6018: -- Summary: Support nullable parameters in UDFs Key: CALCITE-6018 URL: https://issues.apache.org/jira/browse/CALCITE-6018 Project: Calcite Issue Type: Improvement Components: core Reporter: Vladimir Sitnikov Currently Calcite treats almost all the parameters as non-nullable. There's {{@Parameter(optional=true)}}, however, it mixes "optionality" vs "nullability". It causes weird "cast(... NOT NULL)" even though my function processes null just fine. See example in CALCITE-6012. {noformat} expr#4=[TO_HEAP_REFERENCE($t3)], expr#5=[CAST($t4):JavaType(class java.lang.Object) NOT NULL], expr#6=[retainedSize($t5)], {noformat} {{TO_HEAP_REFERENCE}} returns nullable {{HeapReference}} (Java class): https://github.com/vlsi/mat-calcite-plugin/blob/4d4aa2284eeec69bc51da0c2e769ded06ef9ab97/MatCalcitePlugin/src/com/github/vlsi/mat/calcite/schema/objects/HeapOperatorTable.java#L27-L28 {{retainedSize}} is declared as https://github.com/vlsi/mat-calcite-plugin/blob/4d4aa2284eeec69bc51da0c2e769ded06ef9ab97/MatCalcitePlugin/src/com/github/vlsi/mat/calcite/functions/HeapFunctions.java#L129 I tried adding {{@Parameter(optional=true)}} to {{retainedSize}}, and it does not remove the cast. It does not eliminate {{NOT NULL}} part of the cast. --- In any case, I think it is not quite right to mix nullaness and optionally concepts, and Calcite should inter nullness from nullability annotations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6012) CAST($t4):JavaType(class java.lang.Object) NOT NULL causes NPE since 1.27
Vladimir Sitnikov created CALCITE-6012: -- Summary: CAST($t4):JavaType(class java.lang.Object) NOT NULL causes NPE since 1.27 Key: CALCITE-6012 URL: https://issues.apache.org/jira/browse/CALCITE-6012 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.27.0 Reporter: Vladimir Sitnikov The commit that introduced the regression is https://github.com/apache/calcite/commit/e819b4611e883c54708a75f6856300462c92b8ae Here's UDF declaration: https://github.com/vlsi/mat-calcite-plugin/blob/4d4aa2284eeec69bc51da0c2e769ded06ef9ab97/MatCalcitePlugin/src/com/github/vlsi/mat/calcite/schema/objects/HeapOperatorTable.java#L24-L36 Problematic stacktrace is below. Frankly speaking, it is strange to see scaleIntervalToNumber in the stacktrace as the types have nothing to do with intervals. {noformat} Caused by: java.lang.IllegalStateException: Unable to implement EnumerableAggregate(group=[{}], retained_size=[SUM($0)]): rowcount = 8.0, cumulative cost = {169.1038146973 rows, 721.0 cpu, 0.0 io}, id = 52 EnumerableCalc(expr#0=[{inputs}], expr#1=[0], expr#2=[GET_SNAPSHOT($t1)], expr#3=[GET_IOBJECT($t2, $t0)], expr#4=[TO_HEAP_REFERENCE($t3)], expr#5=[CAST($t4):JavaType(class java.lang.Object) NOT NULL], expr#6=[retainedSize($t5)], $f0=[$t6]): rowcount = 80.0, cumulative cost = {160.0 rows, 721.0 cpu, 0.0 io}, id = 54 EnumerableTableScan(table=[[HEAP, java, util, $ids$:HashMap]]): rowcount = 80.0, cumulative cost = {80.0 rows, 81.0 cpu, 0.0 io}, id = 36 at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:111) at org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:112) at org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1124) at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:311) at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:208) at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:642) at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:508) at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:478) at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:231) at org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:213) ... 64 more Suppressed: java.lang.NullPointerException at org.apache.calcite.adapter.enumerable.RexToLixTranslator.scaleIntervalToNumber(RexToLixTranslator.java:936) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.translateCast(RexToLixTranslator.java:592) at org.apache.calcite.adapter.enumerable.RexImpTable$CastImplementor.implementSafe(RexImpTable.java:2481) at org.apache.calcite.adapter.enumerable.RexImpTable$AbstractRexCallImplementor.genValueStatement(RexImpTable.java:2937) at org.apache.calcite.adapter.enumerable.RexImpTable$AbstractRexCallImplementor.implement(RexImpTable.java:2902) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitCall(RexToLixTranslator.java:1134) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitCall(RexToLixTranslator.java:90) at org.apache.calcite.rex.RexCall.accept(RexCall.java:175) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitLocalRef(RexToLixTranslator.java:1018) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitLocalRef(RexToLixTranslator.java:90) at org.apache.calcite.rex.RexLocalRef.accept(RexLocalRef.java:75) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.implementCallOperand(RexToLixTranslator.java:1143) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitCall(RexToLixTranslator.java:1130) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitCall(RexToLixTranslator.java:90) at org.apache.calcite.rex.RexCall.accept(RexCall.java:175) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitLocalRef(RexToLixTranslator.java:1018) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitLocalRef(RexToLixTranslator.java:90) at org.apache.calcite.rex.RexLocalRef.accept(RexLocalRef.java:75) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate(RexToLixTranslator.java:237) at org.apache.calcite.adapter.enumerable.RexToLixTranslator.translate(RexToLixTranslator.java:231
[jira] [Created] (CALCITE-6010) RelRule.Config requires too much ceremony for trivial cases
Vladimir Sitnikov created CALCITE-6010: -- Summary: RelRule.Config requires too much ceremony for trivial cases Key: CALCITE-6010 URL: https://issues.apache.org/jira/browse/CALCITE-6010 Project: Calcite Issue Type: Improvement Components: core Reporter: Vladimir Sitnikov It takes a lot to implement RelRule.Config even for trivial cases when no extra configuration options needed. Sample: https://github.com/apache/calcite/blob/d9dd3ac8a9f695e111a0a5e77f45b61b90f4b5b6/core/src/main/java/org/apache/calcite/rel/rules/TableScanRule.java#L72C35-L78 It requires users: 1) Use immutables or implement Config manually. Adding immutables processor adds compile-time overhead, and implementing the interface manually is not trivial 2) Implement CustomRule(Config) constructor 3) Implement Config.toRule() by calling CustomRule(Config) --- I suggest: 1) Drop method Config.toRule(), and suggest users to call new ...Rule(config) directly. 2) Provide default Config implementation along with Calcite. For instance DefaultConfig.EMPTY, DefaultConfigBuilder... 3) Use composition for custom configurations, in other words, let custom rules have their own attributes, and one of the attributes could be default config. For instance: {{data class AggregateExpandDistinctAggregatesRule.Config(config RelRule.Config, usingGroupingSets Boolean)}} It would make it easier for users to implement config objects, and it would reduce the code size (generated bytecode and native image size) as the current Calcite approach duplicates the same "Config" methods like withOperandSupplier across all the Config implementations -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5034) Disable Gradle build cache or use an alternative accout for it
Vladimir Sitnikov created CALCITE-5034: -- Summary: Disable Gradle build cache or use an alternative accout for it Key: CALCITE-5034 URL: https://issues.apache.org/jira/browse/CALCITE-5034 Project: Calcite Issue Type: Task Components: build Reporter: Vladimir Sitnikov Attachments: bucket-utilization-cache.csv Calcite uses build cache since CALCITE-4140. I'm paying for the service (~$6/month), and there's a risk bank will block further payments (I'm from Russia :-/ ), so I think it is time to switch to another account or disable the build-cache. I've chosen Wasabi S3 since they have a free egress policy, so the costs are predictable. Cloudflare R2 is not yet ready: https://blog.cloudflare.com/introducing-r2-object-storage/ Rough stats so far Total storage=15 GiB Upload=0.1-0.2 GiB/day Download=1-2 GiB/day -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: [DISCUSS] Trolls and community
>[2] http://ceki.blogspot.com/2010/05/forces-and-vulnerabilites-of-apache.html ^^ It is a must-read indeed. I wish I read it much earlier than a week ago. Julian, as I see you did not reply to [1]. It implies you ignore my suggestions in that thread, or you think they are worse than what you suggest. Jokes aside, I see no way how "manual updates of the latest tested java in all the markdown files" (Julian's idea) could win over "single place to declare the version and let the machine do the substitution in all the .md files" (my idea). Just in case, Julian never asked me if I would volunteer implementing the autoreplacer. Julian, I was scared you quit, and my last hope to heal the community was to organize a call to discuss Ceki's story and try to avoid it in Calcite. I quit as I see there's an impedance mismatch. Thank you for inviting me, that was a nice experience. I learned a lot during these years. I would like to thank Stamatis a lot as he has done a lot for improving the community via meetups, nominating committers, and helping others to get out of the conflicts. The thing I dislike in Calcite community the most is that **very** often people start questioning (or even reject) ideas, they provide no alternatives, and they do not seem to listen. I agree it is wise to give every idea -100 points at the beginning, and then decide if the benefits make the net score positive. However, the community seem to ignore benefits. I absolutely do not want to hear risks like "oh, adding a new language is a serious committment, so we should not add X" or "jira is fine, no changes please". Those are generic risks, and I believe they add no value to the conversations. What scares me A LOT is that I see a very similar pattern in JMeter community (I'm a member of PMC there), however, the number of active contributors is way less than in Calcite, so the friction is way less. Do we really need to discuss the upgrade of junit4 to junit5? Do we really need to discuss adding fuzzing tests (e.g. like in my suggestion to use randomized CI matrix)? Hey, fuzzing is one of the key testing approaches nowadays. Yet Calcite community rejects it and they fear that random failures are bad. Come on. You can do better. Do we even need to discuss that the release notes should be autogenerated (see release drafter)? In my opinion, there's nothing to discuss and nothing to object, and the only question can be "who volunteers" doing that. I often implemented those types of changes, yet the discussions were far from welcoming. Of course, the experience was not bad every day. For instance, "Maven -> Gradle" migration got a lot of positive feedback immediately. https://issues.apache.org/jira/browse/CALCITE-2905 https://lists.apache.org/thread/jw3ovlv5tp3vjcp0o5ztcv8yrzd2okl9 ^^ this was a joy However, as I presented the first prototype, I got a really surprising response: "This is not an improvement because it adds a new Kotlin language in a form of .kts files" https://lists.apache.org/thread/plb3rjypqrwo34qr3o2fdh2qqofx04y9 ^^ this is not fun. I know what I am doing. I am sure the scripts I added are readable and understandable. I know I use up-to-date approaches. Please, what is the reason to mention a generic risk of "adding a new language"? - Later I rolled @nullable annotations via Checkerframework https://issues.apache.org/jira/browse/CALCITE-4199 I suggest everybody take a break, think for a couple of seconds and decide if @nullable annotations in Calcite code turned out to be helpful or unhelpful (for both developing Calcite, and using Calcite). >From my point of view, it was super frustrating to receive feedback like in https://lists.apache.org/thread/tyxqydxwbt6lkokgobjok083nxqjrhlx >Vladimir has been trying to fix things that aren’t broken My bad I did not try contacting Julian directly to align approaches to commits, reviews and broken/nonbroken things. Of course, if you consider @nullability annotations harmful and/or useless, all of them can be removed in a matter of seconds (remove annotations; make requireNonNull methods trivial and call "inline method in IDEA"; optimize imports; etc) Of course, it would be better to gradually migrate to Kotlin (it does not require lengthy checkerframework verifier), however I just assumed the idea would be killed no matter what. For instance, Kotlin has the official style guide, so it makes many checkstyle verifications obsolete == much less failures like "you missed a space after comma". Recently I suggested "migrating to GitHub Issues". https://lists.apache.org/thread/m2h13v2p7kowglj73qr4sqn1c3pm8tlq Technically speaking, the results in the DISCUSS thread were not that bad +1 Vladimir Sitnikov +0.5 Zhe Hu +0.5 Jing Zhang +1 Chunwei Lei +0 Michael Mior -0 Josh Elser (no vote) Stamatis Zampetakis -0.47 Jacques Nadeau (no v
Re: Why not make the "test" namespace public, or move to a separate Maven module that main pulls in as a "testOnly" dependency?
Gavin, Please check sample schemata in https://github.com/apache/calcite/tree/7a7b37fc1f4bd2121643539b0fd8e9c4f0f8ed09/testkit/src/main/java/org/apache/calcite/test/schemata If you find useful schemas, don't hesitate filing a PR to add samples to /testkit/ Vladimir
Re: Release notes preparation via release-drafter
>I'm afraid this is not accurate. Nothing is accurate, so what do you suggest? Vladimir
Re: Release notes preparation via release-drafter
>This is an accurate label ? I just want to confirm. Please rephrase. I don't understand it. Vladimir
Re: Release notes preparation via release-drafter
>This will request to add another Label to every PR Release drafter can automatically add labels based on the filenames. PRs without labels are still acceptable, so having labels is not a requirement. Vladimir
Release notes preparation via release-drafter
Hi, I suggest we start using Release Drafter [1] for the preparation of the release notes. I'm going to submit and merge a PR soon, however, I would appreciate it if somebody else would do that (see [2] for a minimal commit) I am sure it would significantly increase the quality of the release notes and it would make releases easier to prepare. Jenkins team incepted release-drafter, and they use it a lot. Here's a sample configuration for Jenkins projects: https://github.com/jenkinsci/.github/blob/f1a34987f2919f039282a13b8741e3c463d18bc6/.github/release-drafter.yml Here are the results: https://github.com/jenkinsci/configuration-as-code-plugin/releases [1] https://github.com/release-drafter/release-drafter [2] https://github.com/vlsi/ksar/commit/ec210ffb7d0f91631908d01976c8bb4d5f7c4e0d Vladimir
Re: [DISCUSS] Java 16 and 17 support
Julian, Please refrain from insisting that others should follow the manual procedures you create. If you want to have certain manual steps for your own safety, that is fine. However, don't expect the others will be following all the manual steps you invent. I suggest we gather a meetup to discuss exactly that. I believe it hurts the community A LOT when people object without providing better alternatives. I believe it hurts the community when people try solving **non-existing** issues (see [3]). I believe it hurts the community when people keep saying "why change, it works for me" and ignore all the justifications. I believe it hurts the community when people object over a triviality. Please, take it seriously. I do not really want to see the story of "key committers leaving the project as it happened in log4j 1.x, Mina, etc" Do you want to see "latest tested Java at the Calcite website"? Nice. Why don't you make the version appear automatically on all the pages that need it? If you lack time, why don't you ask somebody to volunteer on that? What is the point of asking people doing manual steps and blaming me and Rui for inaccurate following the steps you created? We followed your suggestions, and we are here: I forgot editing howto, and Rui missed it as well. Could you please understand adding manual steps NEVER really solves the "documentation being stale" issue? Julian>you were not prepared to accept simple solutions Your suggestion regarding howto.md is not a solution, so I am not prepared to accept non-solutions. I suggested multiple trivial (!) proposals to keep the documentation **always** up to date. You can ignore them, yet, please don't create obscure manual procedures for others to follow. A solution in [1]: Replace "Prerequisite is ... Gradle (version 7.2) on your path" with "Prerequisite is ... Gradle (see version in gradle/wrapper/gradle-wrapper.properties) on your path" ^^ single line change yielding always up-to-date documentation, and it works for ALL Calcite versions, including the past ones. Note: there's no harm in keeping gradle-wrapper.properties in the source release, so A solution in [2]: Replace "Prerequisite is ... Gradle (version 7.2) on your path" with "Prerequisite is ... Gradle {% gradle_version_from_wrapper_properties %}" The value should be replaced automatically when building the site. It might look slightly better on a website, however, it would be slightly misleading. If someone tries building a slightly old Calcite version, they would need to use a different Java/Gradle version. --- In the ideal world, the thing should be buildable automatically, or there should be a small README that explains which tools do you need to be installed. I do not really expect people to use howto.md for building from the source archive. [1] https://issues.apache.org/jira/browse/CALCITE-4575?focusedCommentId=17339990=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17339990 [2] https://lists.apache.org/thread/mdxdosofclqbp9p5osflwtor264757m4 [3] https://issues.apache.org/jira/browse/CALCITE-4575?focusedCommentId=17340320=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17340320 Vladimir
Re: [DISCUSS] Java 16 and 17 support
>For Gradle upgrades, there’s only one place to modify. Juilan, Are you willing to keep the place updated? If so, there's no need to raise mail threads, please just update it. Vladimir
Re: [DISCUSS] Java 16 and 17 support
Julian, I know it might be tough, yet: 1) I would like you to believe me I just forgot howto and the other places hard-code Java versions. Of course, "we support Java 17" is a notable change, and it does deserve to be in the release notes, however, the typical current process is to "file JIRA for notable changes", so I did just that, and created CALCITE-4829 Bump Gradle to 7.2 and test with Java 17 at GitHub Actions. I agree it is unclear how the release manager could deduce they should add "Calcite X.Y supports Java 17 now" to the release notes. However, I am sure that could have happened with ANY change. The release manager often is not in a position to judge which changes are notable for the release. 2) I previously suggested we should automate the replacement of "the maximum supported Java". I believe the current failure highlights that once again. We can never expect contributors to remember all the places Java versions can be mentioned. - Julian>then the following tasks need to be done: Julian>Update the Gradle version in howto.md... I am afraid it does not solve the root cause, so we should do something different. New Java and Gradle versions would appear, and we would keep forgetting to update all the places. I suggest we add a variable like max_tested_java_version somewhere (gradle.properties? somewhere else?) Then we use the variable for building the website and we would check CI does test with the given version. ^^^ this suggestion will stop those discussions once and forever. That is basically the same thing I suggested in CALCITE-4575. >Does Calcite now support Java 16 and 17? 17 is available, so there's no point in mentioning "Java 16 is supported". Current text: >Building from a source distribution >Prerequisite is Java (JDK 8, 9, 10, 11, 12, 13, 14 or 15) and Gradle (version 7.2) on your path I would suggest: >Building from a source distribution >Prerequisite is Java 8 to ${max_tested_java_version} and Gradle (version ${gradle_version_from_wrapper}) on your path Julian>Change the subject of Julian>https://issues.apache.org/jira/browse/CALCITE-4547 to "Support Java 16 Julian>and 17", -0.97 for mentioning "support Java 16" We do not support that version, we do not test with Java 16, and we do not want to add Java 16 tests as it would significantly increase test times. We just hope it would work. Of course, most likely Calcite would work with Java 16 since Calcite works with 11 and 17. However, please, Julian, please, let us stop mentioning that we support irrelevant versions? None of Java vendors provide LTS for 16. Vladimir
Re: Build failed in Jenkins: Calcite » Calcite-snapshots #13
There are issues, and it does fail quite often. It is really frustrating. I can disable the test if you think it would be better. Based on past experience, we almost never revisit disabled tests, and it would look as if "server" passes tests, however, it corrupts data. Vladimir
Re: Failing task :linq4j:jandexMain when executing CheckerFramework
>First some annoying :core:checkstyleTest that interferes with files generated by Intellij IDEA, with hundreds of error messages similar to the following: We exclude generated files from the style checks via the following patterns: https://github.com/apache/calcite/blob/a6293a37dd4166e6fe69ae0c5ef7a89e6677b6ab/build.gradle.kts#L259-L276 Please add the relevant pattern and file the PR. However, it is really strange that you get the generated files into generated/sources/version/generated and javacc/javaCCTest/generated_tests folders. I don't understand why IDEA would generate files into those folders. It does not sound like your ImmutableMyProjectFilterRuleConfig is connected with javaCCTest. I would try to understand why do you get files into /javaCCTest/ and /version/ folders. A side note: Kotlin does have the official coding style, so there's a formatter that yields nice results and that is consistent with IDEs. So moving to Kotlin makes many Checkstyle rules obsolete. No joking. >Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0 It is very likely you are using an outdated version of javac which produced invalid bytecode. Please upgrade javac (e.g. to the recent patchset). Note: GitHub CI uses recent patchet updates, so you might fail to reproduce the issue. We do know javac versions 1.8 before u202 are extremely buggy, so we fail the build for those versions. It might be time to add the same verification for 11, and require 11.0.5 or later (in case of 11). >Is there an easy way to fix the problem? The easy fix is to bump the javac version. The funny coincidence is that Kotlin solves this issue as well: null safety is integrated into the regular compilation, so there's no need for a separate (and time-consuming!) checker framework compilation. Vladimir
Re: Build failed in Jenkins: Calcite » Calcite-snapshots #13
The failure is known as https://issues.apache.org/jira/browse/CALCITE-4824 Does anybody use calcite-server? Does anybody have cycles to maintain/fix calcite-server? Can we just remove it? Vladimir
Re: Build failed in Jenkins: Calcite » Calcite-snapshots #13
Julian>We don’t need any more noise on the dev list. Julian at all, please consider ASF Jenkins failures as true failures rather than "infrastructure noise". The messages could probably be improved, however, I have not explored them. Jenkins won't send mails in case everything is ok. Jacques>if these are real failures or infra problems I did very that Calcite-snapshots job can pass. I did configure the job to remove ALL non-git files before each build, I configured the job to run with --no-build-cache. Technically speaking, the "workspace cleanup and --no-build-cache" are not needed, however, they imply that failures are likely caused by the bug in Calcite code rather than "infrastructure". - In this case, :server:test failed, see "tests tab": https://ci-builds.apache.org/blue/organizations/jenkins/Calcite%2FCalcite-snapshots/detail/Calcite-snapshots/13/tests?start=0 It looks like there's a concurrency issue in "server" (Calcite or Avatica) > Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Number of INSERT target columns (3) does not equal number of source items (2) > at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source) > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:505) > at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:599) > ... 66 more 44,49c191,195 < +---+---+ < | I | J | < +---+---+ < | 1 | 2 | < +---+---+ < (1 row) --- > +---+---+---+ > | I | J | K | > +---+---+---+ > +---+---+---+ Vladimir
Re: RelNode semantic evolution vs adapter implementations
>some direct or indirect changes (metadata, rules, validator, etc) >may cause changes to plans around the given RelNode in the downstream >project Do you think changes like "metadata, rules, validator" can cause "wrong results" issues? What could be the issues caused by "metadata, rules, validator" changes? I know ObjectWeb ASM bytecode manipulation library uses the concepts of versions for quite some time, and it seems to work for them. As you subclass, say, ClassVisitor, you have to pass the API version which your visitor supports. Then ASM fails in the runtime if the visitor happens to be applied with a too recent bytecode. >and gets the source code of the test that he includes into his project and CI Source code and bytecode parsing could be an alternative option (for other cases), however, it is not clear how to tell if the rule in question has been updated to support the new Project/Filter contract. Vladimir
Re: RelNode semantic evolution vs adapter implementations
>practice I fear that >it will introduce a lot of red tape and false negatives I am afraid "wrong results and corrupted data in database" is way worse than "recompiling in case of Calcite update". >I don't recall >any feedback (positive or otherwise) about how we describe breaking >changes CALCITE-3183 was breaking in 1.21, and it was NOT listed in release notes at all. Everybody makes mistakes, so I would avoid using "release notes" as the only way to convey the breaking changes. In the ideal world, there should be a compilation failure (or runtime failure) in case of semantic change. >Maybe we can improve how we report breaking changes I'm afraid people would miss the notifications. They can upgrade across several calcite versions, so they might miss "one release inside of 1.21...1.28 has a significant breaking change" If we increased the major version number, then they would anticipate the change. However, the thing is that "the change in Project" does not necessarily break all consumers, and it breaks only those consumers that inspect Project contents and/or write new methods, etc. >CALCITE-3183 Trimming method for Filter In that case, the new field was added to LogicalFilter, so custom implementations of Filter nodes (e.g. MongoFilter) had no chance to notice the change. >CALCITE-4913 Correlated variables in a select list In the current PR 2623, the extra field is added to Project.java, however, old constructor is kept intact, so even classes that extend Project don't notice the semantics difference. My takeaway is that: 1) It looks like we should refrain from adding fields to "Logical*" rels without adding the corresponding fields to base classes. 2) It looks like we want to change the constructor signature, so the user's code breaks if they extended Project. As an alternative option is to keep old constructors and mark them as @Deprecated, however, that deprecation would mean "you get wrong results unless you support the new constructor argument somehow". 3) Unfortunately, the planner rules do not care which signature the constructors have, so the planner rules won't notice the semantic difference. So I would suggest we add per-relnode "version" (or "feature set"), so the planner rule could declare "I know what Filter up to v5 means". We do not change node semantics often, however, when we do we want to avoid accidental "wrong results". Vladimir
Re: Automatic maven snapshot releases
>Where is -Pasf defined/documented? https://calcite.apache.org/docs/howto.html#making-a-snapshot Vladimir
RelNode semantic evolution vs adapter implementations
Hi, It turns out Calcite's nodes can change semantics without notification for the end-users. Here are the release notes for Calcite 1.21, and it says **nothing** like "Ensure you handle Filter#variablesSet in case you implement Filter or in case you transform LogicalFilter in your rules" https://calcite.apache.org/news/2019/09/11/release-1.21.0/ On top of that, the in-core adapters fail to handle that properly. For example, MongoFilter discards Filter#variablesSet. Can we please stop changing the semantics of RelNodes or can we have a better way to detect the changes in the client code? What if we add a "version" property to the corresponding RelNodes, and we increment it every time the semantic changes? Then client APIs could be coded like "ok, I'm prepared to handle Project v4, and Filter v5" (e.g. make "version" required when registering a rule), and there will be a runtime error in case Calcite generates Filter v6 in runtime. --- Sample case: CALCITE-3183 Trimming method for Filter rel uses wrong traitSet CALCITE-4913 Correlated variables in a select list are not deduplicated The case there is that "correlation variables" are added to the Logical* nodes (e.g. LogicalFilter, LogicalProject). Unfortunately, that change is hard to notice: there's no compilation failure, and there's no runtime error. The old client code just discards "correlation variables", and I guess it would result in wrong results or something like that. Silent wrong results is a really sad outcome from the database. CALCITE-4913 / PR#2623 adds Project#variablesSet property as well, and I guess it would in the same hidden semantic loss. Vladimir
Re: Automatic maven snapshot releases
Jacques, which credentials are you going to supply to publish the snapshots to the ASF repository from GitHub? While I agree with moving towards GitHub actions, I would like to minimize the use of ASF credentials in non-ASF environments. Luckily, ASF infra supplies the relevant credentials for publishing to ASF Nexus, so a mere "./gradlew -Pasf publish" at ASF Jenkins is enough to publish the snapshots. Stamatis>In the past we used to have Jenkins jobs doing this The job was there, so we don't really need to discuss or vote on that. I've created the corresponding Jenkins job here: https://ci-builds.apache.org/job/Calcite/job/Calcite-snapshots/ It pulls GitHub every 5 minutes. I am not sure if GitHub webhook integration would trigger the build (it might need a config from INFRA), however, it is possible as well. Vladimir
Re: Enable "squash and merge" button in GitHub UI
I wonder if we can add GitHub action that would react to committers comment /squash by squashing the commits and pushing --force-with-lease to the fork branch. If pushing to the fork is not possible in GHA, then something like /merge that squashes, rebases and pushes to apache/calcite would be good as well. That is slightly less convenient than a UI button, however, it would be way better than spending time on trivial squashes. I think that approach would keep the author/committer info, and it would be easily accessible Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
Moving the reply to DISCUSS thread. Ruben>My vote might be biased because I have used Jira for many years (and I have Ruben>never used Github for issue tracking). It is really sad to hear. You authored 87 PRs for Calcite! You did use GitHub for tracking already. https://github.com/apache/calcite/pulls?q=is%3Apr+author%3Arubenada+is%3Aclosed Here's a good example: https://github.com/apache/calcite/pull/2329 PR description says the intention of the feature. There are overall and line-targeted comments. Creating GitHub Issue is not that different from creating a PR. Ruben>I have the impression that some of the problems described discussion are Ruben>not per se Jira-related, but they appear because we misuse the current Ruben>tools (and it would remain more or less the same if we switched from Jira Ruben>to GitHub Issues). Ruben, I claim that we split information for no reason. What is the real meaning behind having 50% of the review comments in JIRA, and 50% of the comments in PR? What is the real meaning behind putting "common description in JIRA, and putting NO description in the PR"? Does it help to have different markup languages for JIRA and GitHub? My main motivation is not "replace JIRA with Issues". That replacement alone has few benefits. The key improvement to the current process is creating PRs **with description**, and keeping the relevant discussion in the PR itself. I really see no point in PR comments like "let's continue in JIRA". If a comment targets a specific PR, then why don't we just use PR's comment fields for that? Ruben>high level discussion (e.g. feature design) should Ruben>happen in Jira The vast majority of the changes are small. Adding Jira barrier for every change is a productivity killer (for both contributors and for the reviewers). I don't mean just "switch between JIRA and PR". I mean that comments are duplicated at different sites. The description of the issue and the code changes are separated. The notifications are duplicated (gitbox duplicates GitHub notifications). The markup format is different. The search is not coherent (there's no way to find issues and PRs at the same time). GitHub blame does not show the full picture (it shows PRs, however, opening the JIRA is a separate step). PR (and GitHub Issue) can be associated to a version, so it would be much easier to relate the PR with its release version. I think the list can go and go. Ruben>IMO every change must have a dedicated Jira ticket What are the benefits of "a dedicated Jira ticket"? How "a dedicated Jira ticket" is different from "a dedicated PR"? Ruben>high level discussion (e.g. feature design) should Ruben>happen in Jira, low-level details (e.g. line by line code review) in the Ruben>PR; Both can be in PR, so what are the benefits from splitting "low-level" vs "high-level"? There is a lot of cases when people discuss low-level details in JIRA. It would be so much better, if we aligned the review approach. Jira and GitHub are just tools, and if we don't agree to review high-level stuff first, then no tool helps. That implies, GitHub is just fine for high-level reviews. Ruben>like customized dashboards, or linked issues ("issue A is blocked by / Ruben>caused by / duplicated by / ... issue B"). GitHub allows links, and links have no properties like "caused by", "duplicated by". However, basic links are just enough for Calcite. If you have a "customized dashboard" that is useful for many contributors, please show it. The ASF is a set of volunteers, so it is not right to enforce usage of JIRA when only a single person needs "a customized dashboard". Vladimir
Re: [VOTE] Move to GitHub PR-centric development workflow, phase out JIRA
>I think JIRA works well. I would appreciate it if you could clarify. What I suggest would work better than JIRA in virtually all the cases. So why stick with JIRA? Vladimir
[VOTE] Move to GitHub PR-centric development workflow, phase out JIRA
Hi, Let us have a vote for PR-centric workflow for Calcite. [ ] +1, let's move to GitHub PRs/Issues for change management. [ ] -1, keep using JIRA because, Here's my vote: +1 (binding), let's move to GitHub PRs/Issues for change management. Note: dev@calcite keeps as is. I do not suggest replacing it. I suggest the adjustment of development workflow would reduce maintenance costs, make contributions easier, and make reviews easier. Note: please don't consider this as "replacing JIRA with GitHub issues". I understand that a mere replacement of system X with system Y often brings nothing. See the relevant discussion thread: https://lists.apache.org/thread/m2h13v2p7kowglj73qr4sqn1c3pm8tlq Current state We already allow committing changes to Git without PR, and without JIRA ticket. We already allow submitting PRs without JIRA. I do not mean we encourage that, however, there's no strict rule like "every commit must have JIRA". The pattern I see is: * contributors spend time figuring out if there's a bug in their system or if there's a bug in Calcite * then they create JIRA with all the details * they submit a PR with the relevant change right away * they copy the same-looking information multiple times: git commit, JIRA description, PR description (it is often left blank though) The discussion often separates between JIRA and PR. PR is super-convenient for line-based comments, on the other hand, "bigger comments" are put in JIRA (for unknown to me reasons). It is extremely sad when JIRA comments refer to specific bits of the code. We even have a label "discussion-in-jira" that means "there's a discussion JIRA, deal with it somehow". Suggestion Make PR-first a common approach for most of the changes. Small changes like adding tests, fixing documentation could come as pull requests without creating the corresponding JIRA ticket. https://github.com/apache/calcite/pull/2628 (Add test for 'a IS NOT NULL...) All the comments from https://issues.apache.org/jira/browse/CALCITE-4917 could be put to the PR2628 itself. There's no need for a separate "ticket" or "issue" since PR itself could explain the reason for the change quite well. I suggest we put "high level" comments into the PR itself. The same works for bigger features. A bigger feature can appear as a draft PR, then we shape it in comments, and merge it. A complex feature can appear as a GitHub issue that references other issues or PRs. Sometimes we do not have a solution, so we might create GitHub Issue or send dev@calcite mail for that. In practice, the suggestion does not change much, however, I suggest removing the duplication of work (no need to copy the information across PRs and JIRAs), it makes merge simpler (no need to update JIRA), it makes searching issues easier (there would be a single field to search for issues). Release management Both GitHub PR and GitHub Issue can be associated with a milestone (~version), so we could use it to track which release includes what. Issue history We could keep old tickets in JIRA and use GitHub for new issues only. That is the most trivial approach, Airflow team did that, and we might be ok with that. An alternative option is to copy issues (including resolved ones). I guess attachments would be hard to clone, however, copying issues and comments should not be a problem. Other I might miss certain bits, however, I suggest figuring out the solutions later. I am sure GitHub covers all the cases we need, and we have dev@calcite as a fallback. Vladimir
Re: Enable "squash and merge" button in GitHub UI
We have many committers, and it might make sense to keep track of who commits what. Squash and merge behaviour changed in 2020 :( https://github.community/t/authorship-of-merge-commits-made-by-github-apps-changed/2971/33 Vladimir
Re: Enable "squash and merge" button in GitHub UI
>I've never experienced this with other (non-Apache) projects. https://github.com/JetBrains/kotlin-wrappers/pull/969 git show --format=full a3ff898f4f4e36d5a8bba9c1cbbefa924a9f5386 => commit a3ff898f4f4e36d5a8bba9c1cbbefa924a9f5386 Author: Victor Turansky Commit: GitHub :( Vladimir
Re: Enable "squash and merge" button in GitHub UI
Thanks for the pointers. The committer info is lost indeed. I would add the comments to asf.yaml so we can re-iterate later. Vladimir
Re: [DISCUSS] Writing rules in dependent projects without depending on Immutables
>Has anyone else run into this problem? I've no idea how hard it would be to solve, however, it looks like something worth doing. It would be sad to require everyone to add annotation processors to their builds. Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
Julian, it looks like you have not been using GitHub for the past 5 years, and you miss a lot of improvements. GitHub allows a superset of what we do in JIRA, so I really can't understand why do you suggest enforcing JIRA. None of what you say sounds like a "good point to keep using JIRA", and none of what you say sounds like "a pitfall of using GitHub". >As to the backlog of PRs I could have just merged 2628, however: 1) We spend time on discussions in JIRA 2) After PR merge, someone would have to "resolve the JIRA" That consumes the reviewer's time. Julian>GitHub also makes it easy to write code, submit it as a PR, and that PR becomes the issue. That is a great GitHub feature. I do not see how do you treat it as something bad. It really makes no sense to create JIRA for adding tests, updating documentation. Julian>Lastly, another quibble about GitHub: that it allows people to squash, rebase and amend commits. The line-by-line comments quickly go out of date. squash, rebase, and amendments make JIRA comments go out of date at the same rate. However, in GitHub there's an explicit "resolve discussion", "reopen discussion" features, so there's a clear way to tell when the discussion is closed. JIRA has no conversation threads. GitHub has threads (code-related). Julian>GitHub makes it easy to review line by line, but you can then miss the important stuff. Do you mean we should stop code reviews? If we review code, then using both "JIRA and GitHub" makes it way easier to miss stuff. However, I agree we should review high-level ideas first, and put line-by-line comments only when high-level direction is OK. GitHub UI does have regular comments that are not attached to the diff lines, so you can comment there just like you add JIRA comments. Julian>And I like making holistic reviews What stops you from adding an overall review in GitHub? Julian>Then discussion can happen before coding starts. https://github.com/apache/calcite/pull/2628 is a perfect example here. The PR is trivial, and it adds a couple of tests. The PR looks good to me, except for method names. However, I truly do not know how to make test names better in Java, so I just ignore it. In Kotlin, one could use method name with spaces, so it would be better, however, I just ignored it. Then Julian comments "The test case names are not very useful". This should really belong to line-by-line diff rather than JIRA ticket. Now everyone should navigate between JIRA and PR to figure out the meaning. I see Julian suggests adding all the tests into a single method, however, I think we should just merge the thing and get over it. I am sure the discussion in CALCITE-4917 is not worth the time we spend there. We are discussing 2 lines of code there. Exactly two lines of test code. It would be OK to discuss method vs class when it comes to 100 lines. Even adding tests to JdbcTest does not have those levels of discussions. Vladimir
Re: Feedback on a generic return type version of RelShuttle?
Steven>Look at the examples in the wikipedia page you link to Let me please copy Wikipedia for you. I agree it is not that easy to spot, however, please pay attention to Car#accept method. class Car { ... @Override public void accept(CarElementVisitor visitor) { for (CarElement element : elements) { element.accept(visitor); } visitor.visit(this); } } The notable difference is that Wikipedia suggests that Car is in charge of enumerating Car elements. Of course, trivial car elements just call visitor.visit(this). However, compound elements should call the visitor and do re-composition or whatever. CarElementPrintVisitor in Wikipedia does **not** enumerate Car's elements. There's no CarElementPrintVisitor#visitElements In other words, if we want Visitor Pattern, we need to move RelShuttleImpl#visitChildren(RelNode rel) into the **default** implementation of RelNode#accept(RelShuttle) However, as I said, if we consider moving towards the proper visitor pattern, we might consider the approach listed in "From Object Algebras to Finally Tagless Interpreters" article above. Vladimir
Re: Enable "squash and merge" button in GitHub UI
I wonder if that has already been fixed. What do you think if I temporary enable the button, try it with https://github.com/apache/calcite/pull/2628 and then either keep the button or add a comment to asf.yaml? Vladimir
Re: Feedback on a generic return type version of RelShuttle?
Steven>Agree with James, and that's not even including implementations in other Steven>codebases that use calcite (e.g. there are dozens of implementations in Steven>Dremio's codebase). Can you please show RelShuttle in Dremio codebase? I see no RelShuttle implementations: https://github.com/dremio/dremio-oss/search?q=relshuttle Note: RelShuttleImpl does not count since RelShuttleImpl does not really need RelNode cooperation. RelShuttleImpl just enumerates children of the relations, and it does not rely on accept(..) calls. It can do the very same thing without calling accept(this), so we can remove RelNode#accept(RelNode) methods, and we can still get the same RelShuttleImpl. It can have a series of if (... instanceof) checks or a Map#get(getClass()) or even https://github.com/forax/exotic#visitor---javadoc Vladimir
Re: Feedback on a generic return type version of RelShuttle?
Steven, James, >there are 22 implementations of RelShuttleImpl(excluding tests) 14 of them are RelHomogeneousShuttle subclasses. RelHomogeneousShuttle redirects all customized visit methods into a single visit(RelNode) method. That means Calcite has only 6 usages (!) of the "visitor" feature, and the feature basically forces ALL RelNodes implement an obscure RelNode#accept. The method is ill-defined. Here's its javadoc: "Accepts a visit from a shuttle" It makes no sense at all. How should I implement this? @return A copy of this node incorporating changes made by the shuttle to * this node's children None of Calcite's RelNode implementations walk over children or something when implementing accept(RelShuttle). The only non-test, non-homogeneous subclasses are: ToLogicalConverter RelNodesExprsHandler SqlToRelConverter#convertSelectImpl # "attach hint to the first Hintable rel" RelOptMaterialization#tryUseStar ViewTable#expandView I do not know if those classes require any of the "visitor" features, however, it is clear that the half-visitor is complicating things more often than it helps. Vladimir
Enable "squash and merge" button in GitHub UI
Hi, I see "squash and merge" is not enabled in GitHub. I am going to enable it soon, so we can merge PRs as single-commit changes, and we no longer ask contributors to "squash changes" manually. I've no idea why the button was disabled. See https://github.com/apache/calcite/pull/2629 Vladimir
Re: Feedback on a generic return type version of RelShuttle?
I've no issues with making the existing visitor more generic, however, the choice of accept methods seems arbitrary :-/ Would it help if we move or prefer using algebras rather than visitors? https://oleksandrmanzyuk.wordpress.com/2014/06/18/from-object-algebras-to-finally-tagless-interpreters-2/ Just to clarify: 1) I would like to refrain from breaking compatibility, especially if we add an abstract method that is used in 1% of Calcite calls. If you still want to break backward compatibility, it would be great if you could show at least one use case for the change. Otherwise, it would end up a breakage without even having a use case for it :-( No kidding. I think RelShuttle is not really useful, and we could just drop all RelNode#accept methods and lose virtually no features. Adding generics on top of RelShuttle is not going to heal that. 2) The current RelShuttle does not seem to add much value. In other words, all implementations of accept(RelShuttle) method are trivial. The example from Wikipedia suggests: https://en.wikipedia.org/wiki/Visitor_pattern#What_solution_does_the_Visitor_design_pattern_describe? Wikipedia>Define a separate (visitor) object that implements an operation to be performed on elements of an object structure Wikipedia>*Clients* *traverse *the object structure and *call a dispatching operation accept* (visitor) on an element I think the naming or the implementation in Calcite is wrong. We should either stop naming such things as RelShuttle and RexVisitor "visitors" or we should incline to the canonical approach. Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
>A bigger problem around our process that I see is not enough reviews/triage for the contributions we do receive. We have 224 open prs and I'd estimate a median open time of 1 year+ I am sure JIRA is slowing us down here. If we accept we do not need JIRA, PRs could become the way to go for the reviews. Jacques, I see you've just created https://github.com/apache/calcite/pull/2625 (generic visitor), and you followed the exact pattern I suggest: you filed a PR without JIRA, and you put all the clarifications into the description. Congratulations. It confirms JIRA adds no value to our workflow. Your https://github.com/apache/calcite/pull/2624 (introduce handlerbased...) is less fun. Why did you create JIRA? ;) It could have been a JIRA-less PR as well. As you put description to the PR itself, it becomes easier to review. That is what I mean. We rarely reject contribution or ideas, and, I believe, we could reduce time-to-merge by lowering the barrier and allowing PR-first contributions. Let me put it as follows: in many cases committers and experienced contributors know they are doing good. For example, I see you are moving towards AOT. That is fine, and I do not really need all those JIRA's. It would be so much easier to review and follow if you created a single Issue that references all your AOT PRs. It is not really fun to create tickets for moving to JUnit5. We could just do it instead of spending time on JIRA comments. >I'd estimate a median open time of 1 year+ I suggested moving "core test framework classes" from core/src/test/ into its own test module long ago. My mistake was I sent an email instead of just sending a PR. The response to my mail was strongly negative, however, recently I did the split, and it works great. >This seems to solve a problem that doesn't exist It does exist. My current mail highlights it right away: you could submit JIRA-less PRs, everybody would have a single page for reviews, and so on. It is really sad to hear "does not exist". JIRA causes excessive mails: "jira created", "pr created". In case we skip JIRA completely, it would be less notifications for everybody. Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
Stamatis>Quite interesting information so far One more data point: LLVM is migrating issues from their Bugzilla to GitHub Issues right now (52K issues!) See https://llvm.discourse.group/t/bugzilla-migration/4848 Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
>I know people/companies who have built tools/plugins >around JIRA for getting useful information such as which internal forks >have this particular fix, etc That is true. However, I do not know what should we do about it. Of course, any change brings hidden costs, so if the benefits are unclear, there's not much point in making changes. On the other hand, the more I discover, the more I am sure GitHub Issues would be great for Calcite. I started the discussion like "it looks we are fine, but we can try GitHub", and now I am quite enthusiastic that moving to GitHub would bring multiple improvements in various aspects. >I turn to JIRA >or GitHub and it turns out that I am able to find the information easier >with JIRA. GitHub issues are searchable, including comments. We currently "enforce" JIRA tickets, so PRs lack descriptions and explanations, so it is expected you can't find them. Try using JIRA to figure out when we bumped Guava to 31. There's no JIRA for that. On the other hand, GitHub search returns PRs, code files, and commits that match the search. https://github.com/apache/calcite/search?q=guava+31 Then, if you know Git SHA (e.g. from "git blame"), you can find PRs related to that SHA. For example https://github.com/apache/calcite/search?q=890eb61ef486e2192110cefe4cac5aa6f150=issues Frankly speaking, I never thought of trying to search via commit SHA, but it is a definitely cool trick I learned when reading https://docs.github.com/en/search-github/searching-on-github/searching-issues-and-pull-requests When it comes to versions, GitHub PRs and issues would have "milestones" (~ "fix version"), and when you browse commits, GitHub shows the list of tags that include this commit. For instance, if you open https://github.com/apache/calcite/commit/2326cf8dc97f473d0efa2b0e4883fef87a40746b , you know Guava was updated to 31 since Calcite 1.28.0 >One unpleasant situation which comes up from time to time is when >discussions develop in both places. However, migrating to GitHub issues >will not completely solve this since we will still need to navigate back >and forth between issue and PR If we allow PR-first (no issue, no JIRA, just PR with the proper description) contributions, then: 1) It will be in a single place, no need to switch 2) Navigation between issues and comments would still be easier in GitHub than commenting like "please continue in JIRA" GitHub allows responding to issues/PRs via mail, and I find it a nice feature. I don't use it often, however, I use it from time to time, and I see others using it. >Moreover, many Apache projects are using >JIRA already which makes it easy to perform cross project search GitHub has an out-of-the-box way to scope search for organizations. Here's a search for Calcite in the Apache organization: https://github.com/search?q=org%3Aapache+calcite At any point in time, you can remove org:apache, and you get GitHub-wide search. GitHub hosts more projects than ASF JIRA, so being able to search and link across them (all, not just ASF projects) would be useful. >I find JIRA search superior to GitHub both when it comes to approximate >text search and exact filters I might be wrong here, however, I think we do not really use JIRA metadata. In most cases, we use only the ticket title, description, and the fix version. There's "affects version" in JIRA, and I do not know if GitHub has a one-to-one replacement for that. However, I would be fine if the affected version was in the issue/PR description. "component" (~core, linq4j, cassandra, ...) could become a label in GitHub. AFAIK the limitation is that labels can only be changed by committers or collaborators. We can mitigate that by: a) Adding collaborators (via asf.yaml) b) Adding committers c) Adding GitHub Actions that label Issues based on descriptions or even PRs based on the set of touched files. GitHub's search features are more than enough for us. We have no fancy metadata for complicated searches :) Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
>I think the easiest thing is to focus on the >"workflow" for each "persona" we have While I agree the questions you raise are valid, I would say, that moving to GitHub Issues does not really change the workflow except in certain small cases. I doubt there's an exhaustive list of answers for JIRA. No way I want to bomb you with answers, however, I truly believe replacing JIRA with GitHub is not really disrupting for Calcite. On the other hand, moving to GitHub issues would likely bring new contributors and reviewers. >Writing down what you believe to be "allowed" and making sure everyone >else is on the same page is perfect. https://calcite.apache.org/develop/#contributing is more or less relevant if you replace JIRA with GitHub Issue. >what would a contributor >do on Github to provide a patch for a bug they found? Like with the current workflow: they file a PR. They don't have to file a JIRA first. They just file a PR that explains the problem and suggests the fix. >What about a contributor reporting an issue? a) If they are not sure, they can file an issue at GitHub b) They can ask on a dev@calcite mailing list c) If they can't understand how to use Calcite APIs, they could file GitHub Discussion (if we enable it). It might be useful to distinguish "bugs in Calcite" vs "user questions regarding the way to use Calcite". d) If they have a reproducer, they could just file a PR that reproduces the issue. It would immediately show up as a build failure, and everybody would be able to see how it breaks. >How should a developer structure a "big" >change to Calcite which might contain multiple commits? a) They can file PR that contains several commits (like we currently do sometimes) b) They can create an issue and reference the relevant PRs/issues in the description "tasks lists" is useful here: https://docs.github.com/en/issues/tracking-your-work-with-issues/about-task-lists ^^ the above answers are pretty much the same as our current workflow except JIRA is replaced with GitHub issues. >Where does a >release manager look to determine if a release is ready (all necessary >issues are fixed)? Typically, when planning a release, we ask everybody to figure out which issues are "blockers" for the release. With GitHub, we can associate the relevant issues/PRs with the milestone. Here's how "open items" look in Airflow: https://github.com/apache/airflow/milestones/Airflow%202.2.3 The release manager opens the milestone, and they see if it is all ready. Vladimir
Re: calcite-core 1.28.0 now depends on kotlin-stdlib?
>since core/src/kotlin does not exist yet. I mean core/src/main/kotlin does not exist yet Vladimir
Re: calcite-core 1.28.0 now depends on kotlin-stdlib?
If it is a transitive dependency, then it is a transitive dependency. Otherwise it is a bug. I'm sure calcite-core itself does not use kotlin stdlib since core/src/kotlin does not exist yet. Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
>All that said, this was a number of years ago I agree. GitHub makes noticeable progress nowadays. >I'd want to see that demonstrated before >changing all of Calcite over. How do you want that demonstrated? Apache Airflow does use GitHub Issues, they do ship releases. Many top-starred Apache repositories use GitHub issues: https://github.com/orgs/apache/repositories?qstargazers Do you think Calcite has something unique in the release process that makes GitHub infeasible? Apparently, issues and PRs can be assigned to a milestone. Apparently, committers can close the milestone. >I'm a little confused in your original message, you have two questions. >Are you suggesting that Calcite would use both Jira and Github issues? No way. I think we should move all issue management to GitHub. I just thought that questions like "should we keep JIRA" or "should we move to GitHub" might provoke slightly different discussions with helpful comments even though it is basically the same question. >Should PRs always have issues or is a PR sufficient >for inclusion in a release? I believe, the current development workflow allows: * PR without JIRA * commits without PR * commits without PR and JIRA (just commits) It is up to the committer if the change is significant enough to have JIRA or PR or whatever. >GH issues created a black hole where anyone could create issues and >they could just get "lost" Nothing prevents the issue from being "lost" in Calcite JIRA :-/ Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
>I haven't found navigating JIRA + GitHub PRs to be a significant >problem, but if others think it would help GitHub has a single search field, so you can search for both PRs and issues at the same time which is nice as well. >will my GH comments end up as coming from @michaelmior If we migrate issues, then they will be created by a "bot" account. Of course, we can add comment author in the comment itself. Here's how Coq migrated their issues: https://github.com/coq/coq/issues/2268 We can figure out which accounts do we want to use (e.g. create email -> account mapping file). Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
Here's Airflow vote thread (~2020): https://lists.apache.org/thread/s5nk0mgblkzq9jrqgrvp12yn4kmy0o7b here's the discussion: https://lists.apache.org/thread/6lfth3o4gzk3s891bcqy045z5gl4fs07 I believe draft PRs could avoid "DISCUS" threads that suck time from everybody and those DISCUSS threads add nothing to the code :-/ For instance, Julian has recently launched "Refactoring tests". I think it would be just fine if Julian pushed a draft of the intended changes, we just added +1 and went forward. Just to clarify: the whole idea of moving to GitHub issues for Calcite started when Apache JMeter team started a discussion regarding moving from Bugzilla to GitHub Issues. Currently, I think that JIRA for Calcite is like learned helplessness. Everybody knows "JIRA is required", however, nobody questions that. I do not say JIRA is bad. I just say 99% of code changes happen in GitHub, so unifying issues and PRs would be a big plus. Airflow team mentions that having a JIRA issue for each PR adds unnecessary overhead. I believe in many cases we could be just fine if we have everything right in the PR itself. For example, do we really need to create JIRA issues like "[CALCITE-4836] Upgrade protobuf-java 3.6.1 -> 3.17.1"? I believe filing a PR is just enough In many cases, having a proper comment in PR helps *A LOT* for the reviewer. https://github.com/apache/calcite/pull/2615 is a perfect example. PR 2615 does have a description that clarifies the reason for the change. It is just fine without JIRA. In many cases, I tend to copy descriptions between JIRA and PR because it does help for the review: it is much faster to switch between "conversation" and "files changed" tabs than to load the corresponding JIRA. Here's an example: https://github.com/apache/calcite/pull/2410 Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
>Mark the fix versions of each JIRA issue. It is visible for users to >find which version contain this functionality or fix this bug. GitHub has "milestones". See how it is used in Airflow: https://github.com/apache/airflow/milestone/45?closed=1 >2. Watch the interested issue so I could receive email once this issue >has been changed. With GitHub you can subscribe to the issues or PRs you like, and you get the notifications. >3. Relate the current JIRA issue with it's related JIRA. Here's an example: https://github.com/apache/airflow/issues/19788 Note how the following appears in the #19788 (so cross-links like Issue-Issue, Issue-PR, PR-PR, and so on work) issues: >uranusjr mentioned this issue 6 days ago >Adjust built-in base_aws methods to avoid Deprecation warnings #19725 Here's how the mention look like: https://github.com/apache/airflow/pull/19725#issuecomment-977078128 Note: GitHub has a beta program for Projects at the moment which would improve issue management way further: https://docs.github.com/en/issues/trying-out-the-new-projects-experience/about-projects Vladimir
Re: [DISCUSS] JIRA vs GitHub Issues
We can copy all the issues, including resolved ones. We can import issues. GitHub has api for importing without notifications. Infra team can execute such a script (we need to find it) Vladimir
[DISCUSS] JIRA vs GitHub Issues
Hi, What do you think of moving to GitHub Issues for Calcite? Currently, my (Calcite) development workflow focuses on pull requests. That is all contributions I see come via pull requests. At the same time, issues are hosted in JIRA, which creates friction: PR merging requires changes to both GitHub and JIRA. I guess it would be easier if issue management was in GitHub as well. Then, I believe, co-locating issues and PRs would make external contributions easier. The links between Issues and PRs would be easier to navigate. There's a possibility to enable GitHub discussions as well (see https://github.com/apache/airflow/discussions ). Co-locating Issues, and Discussions look promising. I think it is not that painful, however, GitHub has a limitation of 20 collaborators (non-committers who want to assign, edit, close issues, and pull requests): https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-AssigningexternalcollaboratorswiththetriageroleonGitHub I have no strong opinion, however, I just realized having issues in GitHub would ease friction. Any thoughts? Just to gather opinion: -1..+1 keep using ASF JIRA for issue management -1..+1 migrate to GitHub Issues Here's my vote: +0.5 keep using ASF JIRA for issue management. It more-or-less works, however, merging PRs and navigating from PR to issue is far from perfect. +1 migrate to GitHub Issues. It would simplify navigation between issues and PRs, and I believe it would reduce friction for external contributors. In theory, GitHub Issues can be automated with Actions (e.g. labels). I'm not sure if we need that, however, it might be useful. Vladimir
Re: [DISCUSS] Apache Calcite Online Meetup January 2022
>Are the people willing to give a talk around Calcite? I may present a talk regarding Gradle, and the upcoming migration to the idiomatic structure: https://issues.apache.org/jira/browse/CALCITE-4832 https://github.com/apache/calcite/pull/2566 https://docs.gradle.org/7.3/userguide/structuring_software_products.html >What do people think in terms of timing? I think the new build structure is more like 30-45 min talk Vladimir
Re: DISCUSS: merge calcite-avatica and calcite repositories
Jacques>Based on this, I would suggest we do a small Jacques>modification around CI on Calcite whereby pre-commit tests do not include Jacques>running against Avatica main and instead we have a post-merge CI job that Jacques>monitors the status of connecting the two Suppose PR verification passes. Suppose I merge the PR. Then "post-merge CI job fails". Who resolves the issue? I'm strongly against moving the burden of such maintenance to committers. The newly added code should not contain known bugs. Julian>CI of Calcite master against Avatica master SHOULD pass but not MUST The issue is that people rarely watch for the status of "allowed failures". The PR verification either succeeds or fails. If it succeeds, then you are done. No one would and no one should open "non-mandatory verifications" each and every time. It is like @Disabled tests. Once the test gets ignored, people rarely revisit it, even if they promise to heal it in a couple of weeks (see CALCITE-3457). So the earlier we detect the incompatibility, the easier it is to fix. Jacques>My high level observations The survey is biased, so are the outcomes. There's a question regarding "separate release cycles". How that question contributes to "should we merge ..."? Why don't you ask the other questions raised in the thread? For instance: * How many Avatica versions should Calcite support?" * Have you ever used source release artifacts (*.tar.gz) outside of "release voting"? "never", "Calcite", "Avatica", "other ASF projects" * Would you prefer voting and testing on Calcite and Avatica separately or would you prefer to have a single release candidate? * Is it important to be able to commit Avatica changes while Calcite is being released? * Would you prefer having different release processes for Avatica-java and Calcite, or would you prefer to have a single sequence to release both of them? * Is it important to be able to open both Avatica and Calcite in a single IDE for making changes and running tests? "nice to have", "don't care", "not important" * Is it important to be able to file PRs touching Avatica and Calcite at the same time? "nice to have", "don't care" * Is it important to have the same CI and build scripts among Avatica and Calcite?: "same scripts please", "don't care", "different build scripts (or even build systems) are fine" * Is aligning test framework code between Avatica and Calcite (e.g. junit, hamcrest versions) important? "nice to have", "don't care" * Is it important to have Avatica and Calcite as separate artifacts at dist.apache.org? "nice to have", "don't care" * Is it important to be able to separate issue and PR notifications for Avatica and Calcite? And so on. Vladimir
Re: [DISCUSS] Refactoring tests
Jacques>This sounds like it will mean we will need to make calcite-core test artifacts available Test artifacts publication is yet another anti-pattern just like "base test class". This change has been discussed: https://lists.apache.org/thread/fz96p94h016p11g777otqntjxg2oxgh1 If you want to depend on a class from tests, consider moving it to /testkit module: https://github.com/apache/calcite/tree/0899e6c157632ba1c5369a942cfe2be15fb4ed9f/testkit Jacques>We should think about the rules around Kotlin What happens in calcite-core/tests stays in calcite-core/tests :) It is reasonable to assume that testkit module would have dependencies, and testkit would provide API that is usable from Java and other JVM languages. In that regard, Kotlin dependency in testkit is not much different from Quidem or commons-lang3. Consumers might use Quidem if it fits just like they could use Kotlin if it fits. Vladimir
Re: [DISCUSS] Refactoring tests
Julian, Thanks for taking time, and working on this. "Fixture" is a too broad term, and I do not mean we should replace it completely with AssertJ. What I mean is that AssertJ allows creating *discoverable* assertions. Hamcrest assertions are hard to create and hard to discover in IDE. --- Re test code implementations, we junit5 extensions would likely be useful. For instance @ExtendWith(CalciteParserExtensions.class) @WithLex(BACK_TICKS) class MyParserTest { void invalidSqlFails(SqlParserFixture sql) { // <-- injected by extension . In other words, it does not mean fixture (whatever it means) must be 100% junit5. What I mean, junit5 extensions could simplify writting calcite tests by making fixture confuguration declarative, and injecting the configured object as test method argument. Vladimir
[jira] [Created] (CALCITE-4889) Double join is created for NOT IN
Vladimir Sitnikov created CALCITE-4889: -- Summary: Double join is created for NOT IN Key: CALCITE-4889 URL: https://issues.apache.org/jira/browse/CALCITE-4889 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.28.0 Reporter: Vladimir Sitnikov The following queries yield several joins in the plan. I think double joins are excessive here, especially for the first case where all the values are non-nullable. {code}select * from "scott".emp where (empno, deptno) not in ((7369, 20), (7499, 30));{code} {noformat} select * from "scott".emp where empno not in (null, 7782); EnumerableCalc(expr#0..12=[{inputs}], expr#13=[0:BIGINT], expr#14=[=($t8, $t13)], expr#15=[IS NULL($t12)], expr#16=[>=($t9, $t8)], expr#17=[AND($t15, $t16)], expr#18=[OR($t14, $t17)], proj#0..7=[{exprs}], $condition=[$t18]) EnumerableMergeJoin(condition=[=($10, $11)], joinType=[left]) EnumerableSort(sort0=[$10], dir0=[ASC]) EnumerableCalc(expr#0..9=[{inputs}], proj#0..9=[{exprs}], EMPNO0=[$t0]) EnumerableNestedLoopJoin(condition=[true], joinType=[inner]) EnumerableTableScan(table=[[scott, EMP]]) EnumerableAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0)]) EnumerableValues(tuples=[[{ null }, { 7782 }]]) EnumerableSort(sort0=[$0], dir0=[ASC]) EnumerableCalc(expr#0=[{inputs}], expr#1=[true], proj#0..1=[{exprs}]) EnumerableValues(tuples=[[{ null }, { 7782 }]]) {noformat} {noformat} select * from "scott".emp where (empno, deptno) not in ((1, 2), (3, null)); EnumerableCalc(expr#0..14=[{inputs}], expr#15=[0:BIGINT], expr#16=[=($t8, $t15)], expr#17=[IS NULL($t14)], expr#18=[>=($t9, $t8)], expr#19=[IS NOT NULL($t11)], expr#20=[AND($t17, $t18, $t19)], expr#21=[OR($t16, $t20)], proj#0..7=[{exprs}], $condition=[$t21]) EnumerableMergeJoin(condition=[AND(=($10, $12), =($11, $13))], joinType=[left]) EnumerableSort(sort0=[$10], sort1=[$11], dir0=[ASC], dir1=[ASC]) EnumerableCalc(expr#0..9=[{inputs}], proj#0..9=[{exprs}], EMPNO0=[$t0], DEPTNO0=[$t7]) EnumerableNestedLoopJoin(condition=[true], joinType=[inner]) EnumerableTableScan(table=[[scott, EMP]]) EnumerableAggregate(group=[{}], agg#0=[COUNT()], agg#1=[COUNT($0, $1)]) EnumerableValues(tuples=[[{ 1, 2 }, { 3, null }]]) EnumerableSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) EnumerableCalc(expr#0..1=[{inputs}], expr#2=[true], proj#0..2=[{exprs}]) EnumerableValues(tuples=[[{ 1, 2 }, { 3, null }]]) {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: DISCUSS: merge calcite-avatica and calcite repositories
Josh>I am still upset from the porting of Avatica to Gradle Please migrate Avatica to whatever build system you like. I disagree with you a lot, however, I stop commenting as I do not see it might change your mind. I'm not touching Avatica anymore. me>I am going to delete/ignore the offending tests in Calcite in 24 hours Done. I commented out the test. I hope we no longer add tests to Caclite that verify low-level Avatica behaviour. Vladimir
Re: DISCUSS: merge calcite-avatica and calcite repositories
s/John/Josh/ sorry for the typo. Vladimir
Re: DISCUSS: merge calcite-avatica and calcite repositories
Josh>because I genuinely do not see how this approach Josh>creates more code that needs to be written than if the projects were in Josh>the same repository. Compare https://github.com/apache/calcite-avatica/blob/4cf769f8ee32bb3520604e4f3284e6f233f90276/build.gradle.kts and https://github.com/apache/calcite/blob/6d51d277b158ff7073f29ada4a96a7a74c0b46fc/build.gradle.kts Those are duplicate files, and I had to create them twice. If the code was co-located in a single repository, then a single copy of the file would be enough. John>(I am to assume) is a personal motivation for yourself. Everybody who contributes to the build scripts, dependencies, etc, would want to avoid duplication of their work. John>Avatica 1.17.0 was released Jun 2020, 1.18.0 in May 2021. Calcite had 3 John>releases of its own between these. In other words, Avatica could have three releases **for free** Of course, you might argue there's "non-trivial procedure for releasing Calcite", however, I would claim it does not have to be that way. "lots of manual steps for the release" add zero to the quality of the release. Of course, we might disagree here as well, however, if releasing Calcite is painful, then "forking Avatica into a different repository" is a wrong solution. John>I can still build Avatica with Gradle 6.8.1. Go ahead and try building Avatica with Java 17. You will fail quite fast. Have a fun day. Vladimir
Re: DISCUSS: merge calcite-avatica and calcite repositories
Josh>I have no objections to combining these two repositories together. Why don't we just merge the repositories and move on then? Josh>If we identify these problems, we can think Josh>through (from the beginning) if combining repositories is the only Josh>solution or just one possible solution (and weigh the pros and cons for Josh>each). I am sure I did explain the pain points. See points 1..5 (the very beginning of the mail) in the very first message in this thread. If that is not enough, I can't really explain better. CI configuration is duplicated across repositories. JUnit versions maintenance overheads are duplicated. Build configuration is duplicated (e.g. code style rules, boilerplate build scripts, etc) Cross-repository PRs are non-trivial. And so on. If you have suggestions on how to bump JUnit and CI configuration in both repositories at once, you are welcome, however, the only alternative "solution" I expect is to ignore the problem and treat it as "not a problem". I acknowledge merging repositories will make some use cases harder. However, I am sure points 1..5 (see the very first mail) have no solution besides repository merge. It is clear that having the repositories separate has no real benefits. Community is the same, committers are the same, language is the same, releases are co-located. There are theoretical reasons like "different repositories allow different committers in theory" or "different repositories allow different release schedules" or "because people perceive that they can use component A without using component B" However, I am sure those "reasons" have no weight. They are theoretical, and some of them turned out to be false. For instance, the release schedule is co-located across several years. Even at the time of the Avatica split, somebody said that Solr community did **merge** repositories after a while exactly because the committers were effectively the same. Of course, somebody might **believe** that "because people perceive that they can use component A without using component B" is way more important than a non-important duplication of maintenance of CI, dependencies, build scripts across both repositories. You know what? I am fine to accept people might have those beliefs. If so, I just let them do the maintenance. Of course, you might claim that your ability to watch "avatica PRs only" is far more important. Is that really the key point of having separate repositories? Really? GitHub has "code owners" and "labeling" features, so you can watch the relevant parts of the repository. Josh>These repositories are separate because they have separate release Josh>schedules The release dates for Calcite and Avatica are co-located (see my mail above). The feature of "separate release schedules" is not really used in practice. Having the same version number for Calcite and Avatica would event make it easier for clients to pick the compatible versions. Josh> The net amount of code you have to write doesn't Josh> change with how it works now compared to how you are suggesting it Josh> should work. This is false. For instance, "CI configuration, JUnit versions, build scripts" are **duplicated** across repositories, so if the repositories are different, then the amount of work is increased. Here's one more data point: Calcite is built with Gradle 7.3, and Avatica is stuck with Gradle 6.8.1 Of course, the duplication of work does not exist. You say "the net amount of code I have to write doesn't change". You are right. I won't contribute to calcite-avatica repository anymore, so there's no duplication of work for me. Josh>If there are breaking changes in Avatica, they would need to be Josh>accounted for when Calcite is bumped to the next version of Avatica. calcite-avatica/pull/161 would be way easier if both avatica and calcite were in a single repository. There would be no need to think of version compatibility, etc. Josh> Having a separate repository makes it Josh>extremely easy for me to watch Avatica commits/pull-requests which I am Josh>capable of reviewing vs. Calcite pull-requests which I am not Josh>comfortable reviewing. Have you considered mail filters and/or code owners https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners ? There's an option to automatically label PRs according to the modified files via GitHub Action. Josh>rather than starting from the root problem "How can we Josh>make co-dependent changes easier?" "co-dependent" changes are not the only problem. It was just a trigger. Vladimir
Re: DISCUSS: merge calcite-avatica and calcite repositories
Ok, I did not care when Avatica was split into its own repository, and I assumed Avatica team would maintain the code. That turned out to be not quite true, and it turns out there are non-trivial overheads coming from the split. Of course, everybody can disagree, so if you disagree, you don't need to explain that. We just disagree and that is it. Now I do not care how Avatica would evolve further. It might be even good because Vladimir S won't bother Avatica with mails, PRs, and other random items. What I care much more about is that Calcite tests are broken, the CI reports failure for every commit, and that is really insane. Calcite CI must not fail on every build. I am going to delete/ignore the offending tests in Calcite in 24 hours, and I let the others deal with the case (fix tests, fix Quidem, merge Avatica, or whatever). I would no longer review calcite-avatica PRs. Julian>I was release manager for Calcite and Avatica this time around and it took me about a week Julian, as the repositories are merged, there will be a **single** release process that releases both Calcite and Avatica in a single go. It would reduce the friction, and it would reduce cases where the release procedures and steps are "slightly different". It looks like you assume the release steps would never change, however, in practice, the drift of Java, Docker, OS, and so on versions would affect the release steps anyway. That is why having Avatica and Calcite in a single repository would simplify the release procedure. Julian> And now we’re going to change the process again? I propose to *drop* the release process for "calcite-avatica-java". The release of "calcite" would release both things (in a single calcite-x.y.z.tar.gz file), so the release steps for Calcite would be intact. Technically, that changes the process. In practice, I find that you overreact. The merge would make the release easier. On top of that, it was you who voted for splitting Calcite and Avatica, so it was your conscious decision to accept that "release management would get harder after the split" Julian>Any supposed “time savings” due to the merged repositories will be eaten up by dozens of small issues that will crop up long after the merge The words are not justified. There will always be some random small issues, even in the case we keep repositories separate. On the other hand, there are true issues when repositories are different: * We have two different release processes, so RM have to know the differences * we have slightly different dependency versions, so things like "bump junit" have to be duplicated * CI configuration is duplicated across the repositories, and it does have non-zero maintenance overhead * cross-repository refactorings are harder to do (like the fix for the exception message) Julian, I did put four issues that are real, and they induce noticeable maintenance churn. Java and JUnit will keep updating and you can't prevent that. Please highlight the problems exactly and please stop saying "time savings will be eaten by dozens". That is not a healthy discussion. >Also, we will lose the history of the Avatica source code. All of the commit hashes for bugs fixed long ago will change We can make calcite-avatica repository read-only, so all commits will be there. Vladimir
Re: DISCUSS: merge calcite-avatica and calcite repositories
Michael> is not proposing to change the Michael>structure of modules within both projects, merely to have the code for Michael>both in a single repository. I propose to integrate them into a single build, and keep the set of the published jars. However, the modules and dependency structure could be kept as is. We might want to rename folders like calcite-core calcite-linq4j .. avatica-core avatica-server However, I am not sure it is that important to discuss folder names now. The idea is that as you "open Calcite in IDE, you see both Avatica and Calcite modules" Michael>Is there any reason we couldn't have a separate release schedule if Michael>both projects are in the same repository? A different schedule means Calcite must support at least two different Avatica versions. In other words, if we allow clients to update Avatica at their will, then we should allow them building Calcite with different Avatica versions, which implies Calcite test code should succeed for multiple different Avatica vesions. That makes it harder to write tests: we have to execute tests with two different Avatica releases (or even more than two). There are at least two sources for complexity: a) We have to write tests that tolerate multiple versions. For instance, "if (avatica.18+) {...}" and so on. That is not really trivial, especially taking into account some of the tests are created with non-yet-popular technologies like Quidem where you can't really google solutions. So the "trivial" task of "making a test to expect two possible outcomes" becomes less trivial as you try to pass the version from GitHub Action to Gradle to JUnit to Quidem to no-one-knows-which class. If we support one Avatica version only, that is not needed. We just patch the test in Avatica and Calcite and that is it. Single repo avoids "Gradle vs Quidem" dance. b) If we claim that we support 5 different Guava versions, 3 different JDK versions, 2 different Avatica versions, then we have to execute 5*3*2 = 30 combinations of the tests. That is not really a full matrix, however, things get way easier if we support one Avatica version only. The amount of tests we need to do during a proper release is much less, and it is easier to commit changes that touch Avatica and Calcite at the same time. Vladimir
Re: DISCUSS: merge calcite-avatica and calcite repositories
Francis>I remembered joining this list a few years back then Francis>- Calcite - SQL engine used to build the SQL part of your database. The things is "apache/calcite" is tons of features as in "thing that converts SQL to MongoDB" or "thing that executes SQL queries over InnoDB data files", or "thing that validates SQL", or "linq4j". There are lots of features, and there are lots of artifacts. The set of modules is not perfect, and there are cases when somebody needs "just optimizer", and there are valid reasons to hate "Enumerable" or "dynamic bytecode generation". However, it looks like having Avatica in the same repository does not hurt much as long as the other features do not impact Avatica features much. Francis>Each should be able to stand up on its own and have its own independent release schedule Having a different release schedule means we must support at least two different Avatica versions in Calcite. That adds overhead on Calcite development as we have to make test code to account for different Avatica. Francis>I've seen PRs that need to touch both repositories for Francis>a change and it seems like a code smell to me Even though testing the "exact exception message" at Calcite side looks weird, it is important to test how the exception would look like for the end-user, so there must be an integration test that ensures Avatica produces proper exceptions, and Calcite does not make them bad (e.g. by ignoring the message). Francis>own independent release schedule For now, Calcite Avatica is stuck with Gradle 6 since it uses "dependencies between test source sets". In other words, test classes in calcite-avatica/server depend on test classes in calcite-avatica/core. That is an anti-pattern, and the resolution is to move the common test classes into a proper artifact with its metadata and dependencies. This is something I fixed in Calcite in https://issues.apache.org/jira/browse/CALCITE-4821 Here's the question: who would do the relevant work for calcite-avatica? Vladimir
Re: DISCUSS: merge calcite-avatica and calcite repositories
I know the below is too wordy, however, English is not my native language, so I tend to overexplain and duplicate thoughts. Julian>To allow separate communities This is something I do not understand. Let me be explicit: I am OK to maintain bits of avatica code when Calcite fixes overlap with Avatica for some reason (e.g. throwing exceptions). I am not OK with duplicating the same work just because someone else wants to have a separate repository. I was maintaining and releasing multiple (four!) repositories for pgjdbc when it was using Maven (Maven was not able to release jre6, jre7, and jre8 artifacts from within a single repository), and it was a big relief when the code was recollected into a single repository again. That is why I am advocating for merging avatica code back. Julian>Splitting code into modules makes it easier to continue to splitting I agree it is worth having calcite-core and avatica as separate jar artifacts. It is even worth splitting "enumerable" out of calcite-core into its own module. However, I do not see how splitting the repositories helps. I do not expect somebody to use calcite-avatica as a git submodule or something like that. Julian>to allow separate release schedules Here are the recent releases. There's often a pattern that "Calcite release follows Avatica release". It is even mentioned in the mails where someone suggests releasing Avatica, then releasing Calcite". A notable exception was rel/avatica-1.16.0 which was not connected to the release of Calcite. The rest Avatica releases seem to be quite close to Calcite releases, so I do not see what do we gain from "separate release schedules". It looks like co-locating the release would be easier for both maintenance overheads, testing, and consumption. 2016-03-16 rel/calcite-avatica-1.7.0 2016-03-17 calcite-1.7.0 2016-03-15 rel/calcite-avatica-1.7.1 2016-06-01 rel/calcite-avatica-1.8.0 2016-06-08 calcite-1.8.0 2016-09-17 calcite-1.9.0 2016-10-07 calcite-1.10.0 2016-10-27 rel/calcite-avatica-1.9.0 2017-01-03 calcite-1.11.0 2017-03-20 calcite-1.12.0 2017-05-23 rel/avatica-1.10.0 2017-06-22 calcite-1.13.0 2017-09-27 calcite-1.14.0 2017-12-05 calcite-1.15.0 2018-03-06 rel/avatica-1.11.0 2018-03-14 calcite-1.16.0 2018-06-21 rel/avatica-1.12.0 2018-07-16 calcite-1.17.0 2018-11-29 rel/avatica-1.13.0 2018-12-18 calcite-1.18.0 2019-03-20 calcite-1.19.0 2019-04-25 rel/avatica-1.14.0 2019-05-09 rel/avatica-1.15.0 2019-06-19 calcite-1.20.0 2019-09-06 calcite-1.21.0 2019-12-19 rel/avatica-1.16.0 2020-05-23 calcite-1.23.0 2020-06-22 rel/avatica-1.17.0 2020-07-23 calcite-1.24.0 2020-08-22 calcite-1.25.0 2020-10-06 calcite-1.26.0 2021-05-18 rel/avatica-1.18.0 2021-06-03 calcite-1.27.0 2021-10-04 rel/avatica-1.19.0 2021-10-19 calcite-1.28.0 Julian>The last of these was particularly on my mind when we split Avatica from Calcite. Julian>I was pleased to see, for example, Apache Phoenix using Avatica successfully (including building an ODBC driver) Does it really help Phonenix that we split repositories? I am sure they should not really depend on the repositories. How would their life be harder if we release the same set of artifacts and jars from a single source tree? Julian>If Avatica had remained part of Calcite, both written in Java and using the same build system and release process, Julian>it’s less likely that Avatica-go would have happened Does Avatica-go use calcite-avatica repository? So far it looks like Avatica-go uses https://github.com/apache/calcite-avatica-go, and I do not suggest merging it into calcite repository yet. I do not see how merging calcite-avatica and calcite repositories would block the evolution or emergence of avatica-go, avatica-rust, etc. Julian>release process I identified https://issues.apache.org/jira/browse/CALCITE-4881 "Calcite release tags should have rel/... prefix as per the ASF requirements" It looks like an unintentional breakage in https://github.com/apache/calcite/commit/2e30293af7373b6c5fbcc5fa6505b49df2fba000 caused by the split. Vladimir
[jira] [Created] (CALCITE-4881) Calcite release tags should have rel/... prefix as per the ASF requirements
Vladimir Sitnikov created CALCITE-4881: -- Summary: Calcite release tags should have rel/... prefix as per the ASF requirements Key: CALCITE-4881 URL: https://issues.apache.org/jira/browse/CALCITE-4881 Project: Calcite Issue Type: Improvement Reporter: Vladimir Sitnikov As per "Git status update" mail from Infra, release commits should be tagged with rel/... tags which are protected at the Infra side so they can't be overwritten, removed, etc. See sample: ACCUMULO-4106 -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: DISCUSS: merge calcite-avatica and calcite repositories
Currently CI for avatica verifies building with calcite/master, and vice versa. Otherwise we won't notice integration issues before we release. Julian suggested that we should support one avatica version only. I agree, especially of that means we merge the repositories back. > I think it is a mistake to build against master We expect certain level of backward compatibility, so there are reasons to build with latest commits. However, if the versions are always aligned in practice, then having a single repo seems reasonable. One more solution is to disable/remove the offending test in Calcite. Merging the repos still make sense for me, hovewer, I might miss some use cases. Vladimir
DISCUSS: merge calcite-avatica and calcite repositories
Hi, Currently, we have calcite-avatica and calcite in different repositories. Frankly speaking, I do not know what it brings, however, it does create points of friction: 1) If a feature touches Avatica and Calcite, then PRs are hard to create and maintain. We just can't create a single PR across both repositories 2) If we support a single Avatica version only in Calcite, then the point of having different repositories is even mooter. 3) CI configuration is basically duplicated: every time we want to add a new JDK (once every 6 times), we have to do it twice 4) There are common dependencies: JUnit, hamcrest, etc, etc. We basically have to do the same thing multiple times when upgrading versions in avatica and calcite 5) Adding @Nullable annotations to Calcite was more complicated than I wanted because Avatica is stored in a different repository. I basically had to create a bunch of astub files instead of just putting the relevant @nullable annotations on top of Avatica classes: https://github.com/apache/calcite/tree/f1db79fb876ac9ba3c405283e99bb0438e4e97be/src/main/config/checkerframework/avatica Recently there was a PR that improves error messages in Avatica: https://github.com/apache/calcite-avatica/pull/161 I am sure the PR is a great improvement, however, it fails CI in both cases: a) Current Avatica fails when it runs integration tests against Calcite (because Calcite expects old, low-detail exception messages) b) Current Calcite fails to build with "latest Avatica" because, well, Avatica produces "too good" exception messages It surfaces a true problem: we have too tight code integration between "different" systems, and it probably makes sense to have both libraries in a single repository. An alternative option is to make sure Calcite "supports" at least two Avatica versions: "previous version + one new". However, the current tests in Calcite expect a specific error message, so it can't support two alternative messges. Well, the tests are in .iq format which could probably support multiple messages, however, I have absolutely no idea how to implement that. Facts so far: * Avatica has fewer commits than Calcite, so having a separate calcite-avatica repository does not help for segregating PR/issue/commit queue * Calcite seems to support one specific Avatica version only, so it makes sense to just keep them in a single repository * calcite-avatica-go seems to reside in its own repository, so I do not see why do we split Java implementations across calcite and calcite-avatica repository * There is non-trivial maintenance overhead (see 1..5 above). Frankly speaking, I was trying my best to **avoid** maintaining calcite-avatica. Somebody wanted to go into a separate repository, so, I let them do what they want there. However, there are cases when I have to spend extra time because calcite-avatica is a separate repository (PR161, @Nullable are the recent samples) * It looks like I broke the build by merging PR#161. That is why I am trying to roll the thing forward and bring this discussion. An alternative option is I revert the merge and wait for somebody else to pick up the task. So my questions are: Q1) Does having calcite-avatica as a separate repository do anybody any good? Q2) Does anybody object to merging calcite-avatica and calcite into a single calcite repository? Vladimir
Re: [VOTE] Release Apache Calcite 1.28.0 (release candidate 0)
I was able to reproduce the same release artifact (d61c4935d7d3b664..) locally, so +1 for releasing it. Steps for macOS: git worktree add ../calcite-1.28.0 dec167ac18272c0cd8be477d6b162d7a31a62114 cd ../calcite-1.28.0 JAVA_HOME=$(/usr/libexec/java_home -v 1.8) ./gradlew distTar -Prelease openssl dgst -sha512 release/build/distributions/apache-calcite-1.28.0-src.tar.gz -> d61c4935d7d3b66425ff6e0c5e6e506b56f734d562f0fc6e69dc1d484da1d92d7d922930fd7b545abe0dc7b0178f6847ff4239ad52c52c8fa155668aa3d3fc38 Vladimir
Re: Can`t build master of calcite.
>Does no other build it locally for such a time ?:) Evgeny, it is very likely you hit the bug because you use a stale version of OpenJDK. What Stamatis merged is not a fix, but it is a workaround. In any case, if you use an outdated Java version, it is up to you to figure out why the build does not work. Vladimir
[jira] [Created] (CALCITE-4832) Refactor Gradle to build-logic convention plugins
Vladimir Sitnikov created CALCITE-4832: -- Summary: Refactor Gradle to build-logic convention plugins Key: CALCITE-4832 URL: https://issues.apache.org/jira/browse/CALCITE-4832 Project: Calcite Issue Type: Improvement Components: build Reporter: Vladimir Sitnikov Assignee: Vladimir Sitnikov See https://docs.gradle.org/current/userguide/structuring_software_products.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Can`t build master of calcite.
What is the exact Java version you are using? It looks like https://bugs.openjdk.java.net/browse/JDK-8032211 , and it might be that newer OpenJDK include the fix. Vladimir
[jira] [Created] (CALCITE-4829) Bump Gradle to 7.2 and test with Java 17 at GitHub Actions
Vladimir Sitnikov created CALCITE-4829: -- Summary: Bump Gradle to 7.2 and test with Java 17 at GitHub Actions Key: CALCITE-4829 URL: https://issues.apache.org/jira/browse/CALCITE-4829 Project: Calcite Issue Type: Improvement Components: build, core Reporter: Vladimir Sitnikov -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Avatica dry-run
I'm not sure which steps are you following, however, the same sequence seems to work fine: Commands: https://github.com/vlsi/vlsi-release-plugins/blob/83c85c5faa4c7cd1fe0173b75c1cba5e60c3f209/.github/workflows/release-test.yml#L33-L60 Logs: https://github.com/vlsi/vlsi-release-plugins/runs/3794304055?check_suite_focus=true Vladimir пн, 4 окт. 2021 г. в 20:15, Julian Hyde : > As release manager for the upcoming Avatica 1.19, I just tried to use > the docker-based dry-run. I got the following failure: > > Build calcite-avatica FAILURE reason: > Execution failed for task ':initializeNexusStagingRepository': > java.io.UncheckedIOException: java.net.ConnectException: > Failed to connect to /127.0.0.1:8080 > at > de.marcphilipp.gradle.nexus.internal.NexusClient.findStagingProfileId(NexusClient.kt:84) > at > de.marcphilipp.gradle.nexus.InitializeNexusStagingRepository.determineStagingProfileId(InitializeNexusStagingRepository.kt:113) > at > de.marcphilipp.gradle.nexus.InitializeNexusStagingRepository.access$determineStagingProfileId(InitializeNexusStagingRepository.kt:38) > > I don't want to spend a morning debugging Docker port-mappings so I'm > moving on. > > Julian >
[jira] [Created] (CALCITE-4824) ServerUnParserTest#testCreateTableVirtualColumn fails
Vladimir Sitnikov created CALCITE-4824: -- Summary: ServerUnParserTest#testCreateTableVirtualColumn fails Key: CALCITE-4824 URL: https://issues.apache.org/jira/browse/CALCITE-4824 Project: Calcite Issue Type: Improvement Components: server Reporter: Vladimir Sitnikov It looks like {{org.apache.calcite.server.ServerDdlExecutor#INSTANCE}} results in non-synchronized HashMap instances in ReflectUtil. Sample failure: {noformat} ServerUnParserTest > testCreateTableVirtualColumn() STANDARD_ERROR log4j:WARN No appenders could be found for logger (org.apache.calcite.sql.parser). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. WARNING 2.8sec, 442 completed, 0 failed, 5 skipped, org.apache.calcite.test.ServerParserTest WARNING 3.8sec, 442 completed, 0 failed, 6 skipped, org.apache.calcite.test.ServerUnParserTest 2.2sec, org.apache.calcite.test.ServerTest > testVirtualColumn() org.apache.calcite.test.ServerQuidemTest > test(String)[3], [3] sql\table.iq failure marker FAILURE 3.6sec, org.apache.calcite.test.ServerQuidemTest > test(String)[3], [3] sql\table.iq org.opentest4j.AssertionFailedError: Files differ: D:\a\calcite\calcite\server\build\resources\test\sql\surefire\sql\table.iq D:\a\calcite\calcite\server\build\resources\test\sql\table.iq 23c23,155 < (0 rows modified) --- > java.sql.SQLException: Error while executing SQL "create table t (i int, j int not null)": Method not found: execute([class org.apache.calcite.sql.SqlNode, interface org.apache.calcite.jdbc.CalcitePrepare$Context]) > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163) > at org.apache.calcite.avatica.AvaticaStatement.executeLargeUpdate(AvaticaStatement.java:246) > at org.apache.calcite.avatica.AvaticaStatement.executeUpdate(AvaticaStatement.java:240) > at net.hydromatic.quidem.Quidem.update(Quidem.java:279) > at net.hydromatic.quidem.Quidem.access$2900(Quidem.java:54) > at net.hydromatic.quidem.Quidem$ContextImpl.update(Quidem.java:1752) > at net.hydromatic.quidem.Quidem$UpdateCommand.execute(Quidem.java:1152) > at net.hydromatic.quidem.Quidem$CompositeCommand.execute(Quidem.java:1548) > at net.hydromatic.quidem.Quidem.execute(Quidem.java:216) > at org.apache.calcite.test.QuidemTest.checkRun(QuidemTest.java:156) > at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95) > at org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService$ExclusiveTask.compute(ForkJoinPoolHierarchicalTestExecutorService.java:185) > at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189) > at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) > at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) > at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) > at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) > Caused by: java.lang.IllegalArgumentException: Method not found: execute([class org.apache.calcite.sql.SqlNode, interface org.apache.calcite.jdbc.CalcitePrepare$Context]) > at org.apache.calcite.util.ReflectUtil$2.lookupMethod(ReflectUtil.java:563) > at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:527) > at org.apache.calcite.server.DdlExecutorImpl.executeDdl(DdlExecutorImpl.java:41) > at org.apache.calcite.prepare.CalcitePrepareImpl.executeDdl(CalcitePrepareImpl.java:369) > at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:634) > at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:513) > at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:483) > at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:249) > at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:623) > at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) > at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) > ... 116 more {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Move CalciteAssert and similar "test framework" classes to its own testlib module
As suggested above, I've prepared a PR that moves CalciteAssert, Matchers, and a couple of test schemata to the new :testkit module. calcite-*-tests.jar will no longer be published. PR: https://github.com/apache/calcite/pull/2558 The tests pass, and the change unlocks migration to Gradle 7.2 and Java 17 (the migration is included in PR#2558). I'm going to merge the PR soon. Vladimir
[jira] [Created] (CALCITE-4823) JEP 411: AccessControlException, AccessController in java.security has been deprecated and marked for removal
Vladimir Sitnikov created CALCITE-4823: -- Summary: JEP 411: AccessControlException, AccessController in java.security has been deprecated and marked for removal Key: CALCITE-4823 URL: https://issues.apache.org/jira/browse/CALCITE-4823 Project: Calcite Issue Type: Improvement Components: core Reporter: Vladimir Sitnikov Calcite uses java.security APIs that have been marked for removal. Let's minimize the usages of java.security for now. {noformat} /Users/runner/work/calcite/calcite/core/src/main/java/org/apache/calcite/util/SaffronProperties.java:28: warning: [removal] AccessControlException in java.security has been deprecated and marked for removal import java.security.AccessControlException; ^ /Users/runner/work/calcite/calcite/core/src/main/java/org/apache/calcite/config/CalciteSystemProperty.java:27: warning: [removal] AccessControlException in java.security has been deprecated and marked for removal import java.security.AccessControlException; ^ error: warnings found and -Werror specified /Users/runner/work/calcite/calcite/core/src/main/java/org/apache/calcite/runtime/Resources.java:991: warning: [removal] AccessController in java.security has been deprecated and marked for removal return java.security.AccessController.doPrivileged( ^ /Users/runner/work/calcite/calcite/core/src/main/java/org/apache/calcite/util/SaffronProperties.java:137: warning: [removal] AccessControlException in java.security has been deprecated and marked for removal } catch (AccessControlException e) { ^ /Users/runner/work/calcite/calcite/core/src/main/java/org/apache/calcite/config/CalciteSystemProperty.java:439: warning: [removal] AccessControlException in java.security has been deprecated and marked for removal } catch (AccessControlException e) { ^ Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. 1 error 5 warnings :core:compileJava failure marker {noformat} What do we do with that? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-4821) Move utility test classes into calcite-testkit and unpublish -test.jar
Vladimir Sitnikov created CALCITE-4821: -- Summary: Move utility test classes into calcite-testkit and unpublish -test.jar Key: CALCITE-4821 URL: https://issues.apache.org/jira/browse/CALCITE-4821 Project: Calcite Issue Type: Improvement Components: core Reporter: Vladimir Sitnikov -test.jar make it hard to support build configuration, and it blocks migration to Gradle 7. As we move the commonly used classes to its own -testkit module, it would make it easier to reuse test helpers, it would make easier to recognize the Test API vs {{@Test}} code, and it would make certain classes smaller. E.g. CatchAllSchema is currently located in ReflectiveSchemaTest, and if we move it to a top-level class, then ReflectiveSchemaTest would be smaller. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Add extra check for Avatica to run tests with different timezone
>I don’t think we should do randomized CI Then please incorporate: a) Testing with "-XX:+UnlockExperimentalVMOptions -XX:hashCode=2" to ensure the code does not rely on Object#hashCode uniqueness b) different OSes c) Different JDK vendors, versions and JIT implementations (e.g. OpenJ9, Microsoft, Coretto) d) interference of a, b, c, and Guava versions Vladimir
Re: [DISCUSS] Add extra check for Avatica to run tests with different timezone
Here's a randomized matrix for Avatica: https://github.com/apache/calcite-avatica/pull/156 https://github.com/apache/calcite-avatica/actions/runs/1298777387 Vladimir
Re: [DISCUSS] Add extra check for Avatica to run tests with different timezone
I suggest to settle on the number of CI jobs we can afford to keep the CI duration sane, and we generate a random sample. The number of CI jobs might be different for PR builds and for branch builds. We have many different interesting timezones, many Guava versions, only 3 OS, many JDKs for each OS, so builing diagonal is not really trivial. Vladimir
Re: [DISCUSS] Add extra check for Avatica to run tests with different timezone
>CI should be deterministic That is an interesting view. One of the approaches that I have not considered earlier is "seed matrix randomization from PR index and the branch name". That would make the set of jobs consistent for each execution of the same PR. However, based on my testng/testng experience, it is just fine if the set of CI jobs varies. The CI job description does clarify what was special (e.g. timezone, JVM, OS, etc). >And also tests with different Guava versions Do you suggest test all OS, all Guava versions, all timezones in CI? That is hard to maintain and it takes CI time >CI should be deterministic. So that if it breaks, the person to fix it is clear CI resources are limited, and we can't infinitely scale the build matrix. I do not really see why Guava versions are more important than operating systems. The users should prefer recent Guava versions, and the idea of supporting (and testing!) ancient Guava versions is doubtful. On the other hand, we might want to verify different cases like -XX:hashCode=2 that makes every Object#hashCode() to return 1 always. Vladimir
Re: [DISCUSS] Add extra check for Avatica to run tests with different timezone
>Of course it would be fine to just change the timezone for one of more >existing tests which are targeting different JVM versions Alessandro, team, what do you think of https://github.com/vlsi/github-actions-random-matrix ? I'm kind of hesitant to introduce more and more /vlsi/ dependencies even though I believe they are the best tools available :) random-matix is a 200 line script that generates build matrix by selecting random values for the axes (e.g. jvm version, os, guava version, etc). Vladimir
Re: [DISCUSS] Remove contributors name from commit summary
Stamatis>I don't think someone is spending time going over the list of contributors I agree here. It should be automated in order to be workable. The contributors link you suggest is nice. Vladimir
Re: [DISCUSS] Remove contributors name from commit summary
I incline to "The following people contributed to this release" For instance, I like the way Gradle release notes are written: https://docs.gradle.org/7.2/release-notes.html The trick is that the author name does not convey the URL, so either we use just names or we keep a contributor database. Vladimir
[jira] [Created] (CALCITE-4715) Create CITATION.cff for BibTex, APA citation UI in GitHub
Vladimir Sitnikov created CALCITE-4715: -- Summary: Create CITATION.cff for BibTex, APA citation UI in GitHub Key: CALCITE-4715 URL: https://issues.apache.org/jira/browse/CALCITE-4715 Project: Calcite Issue Type: Improvement Reporter: Vladimir Sitnikov See https://twitter.com/natfriedman/status/1420122675813441540 -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Changing the Parser
https://issues.apache.org/jira/browse/CALCITE-4701 seems relevant. I do not know if "parser extension" build system plugins already exist, however, I think it would be nice to have one in Calcite. Vladimir
[jira] [Created] (CALCITE-4701) Implement and publish Gradle plugin for extending the parser
Vladimir Sitnikov created CALCITE-4701: -- Summary: Implement and publish Gradle plugin for extending the parser Key: CALCITE-4701 URL: https://issues.apache.org/jira/browse/CALCITE-4701 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.27.0 Reporter: Vladimir Sitnikov Currently, calcite-core.jar contains the relevant parser template files, however, the usage of the templates is not clear from the client perspective (especially when you are new to Calcite). It would be nice if Calcite published a plugin that simplifies building a parser extension. We could reuse the same plugin in calcite-babel module. Currently babel references $rootDir/core/... directrly rather than fetching the parser templates from core jar. It might make sense to ship parser templates in its own jar file, however, I have no strong opinion here. Parser.jj consumes <1% of the compressed jar size(225K compresses to 44K), so it is not a big deal, yet having a meaningful artifact name for the parser templates looks like the right thing to do. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-4679) Search/sarg simplification leaves is not null(literal) unsimplified
Vladimir Sitnikov created CALCITE-4679: -- Summary: Search/sarg simplification leaves is not null(literal) unsimplified Key: CALCITE-4679 URL: https://issues.apache.org/jira/browse/CALCITE-4679 Project: Calcite Issue Type: Improvement Components: core Reporter: Vladimir Sitnikov Case: {{SEARCH(100500, Sarg[=])}} (Sarg[=] is "all values match, unknown as unknown"). While the issue does not look severe, it results in rex fuzzer false positives, so it might mask true bugs. Expected: {{true}} Actual: {noformat} unknown as unknown: OR(IS NOT NULL(0), null) unknown as false: IS NOT NULL(0) unknown as true: true <-- finally got the right answer {noformat} Test: {code:java} checkSimplify( rexBuilder.makeCall( SqlStdOperatorTable.SEARCH, literal(BigDecimal.ZERO), rexBuilder.makeSearchArgumentLiteral( Sarg.of(RexUnknownAs.UNKNOWN, ImmutableRangeSet.of(Range.all())), tInt())), "true" ); {code} --- fuzzer: {code:java} @Test void singleFuzzyTest() { Random r = new Random(); r.setSeed(-8889103384303613092L); RexFuzzer fuzzer = new RexFuzzer(rexBuilder, typeFactory); generateRexAndCheckTrueFalse(fuzzer, r); }{code} {noformat} AssertionFailedError: SEARCH(100500, Sarg[=]) rexBuilder.makeCall(SqlStdOperatorTable.SEARCH, literal(100500), literal(Sarg[=])) isAlwaysTrue, so it should simplify to TRUE unknownAsFalse ==> expected: but was: at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:69) at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:188) at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1146) at org.apache.calcite.test.fuzzer.RexProgramFuzzyTest.checkUnknownAs(RexProgramFuzzyTest.java:251) at org.apache.calcite.test.fuzzer.RexProgramFuzzyTest.checkUnknownAsAndShrink(RexProgramFuzzyTest.java:202) at org.apache.calcite.test.fuzzer.RexProgramFuzzyTest.checkUnknownAs(RexProgramFuzzyTest.java:165) at org.apache.calcite.test.fuzzer.RexProgramFuzzyTest.generateRexAndCheckTrueFalse(RexProgramFuzzyTest.java:454) at org.apache.calcite.test.fuzzer.RexProgramFuzzyTest.singleFuzzyTest(RexProgramFuzzyTest.java:463) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-4678) AssertionError: result mismatch when simplifying case+search+isdistinctfrom+isnottrue
Vladimir Sitnikov created CALCITE-4678: -- Summary: AssertionError: result mismatch when simplifying case+search+isdistinctfrom+isnottrue Key: CALCITE-4678 URL: https://issues.apache.org/jira/browse/CALCITE-4678 Project: Calcite Issue Type: Improvement Components: core Reporter: Vladimir Sitnikov Frankly speaking, I do not know what triggers the issue. so the issue summary is abstract for now. I do not know the expected outcome, however, I expect that Calcite should not fail with AssertionErrors. Test case: {code:java} checkSimplifyAs( isNotTrue( case_( rexBuilder.makeCall( SqlStdOperatorTable.SEARCH, vInt(1), rexBuilder.makeSearchArgumentLiteral( Sarg.of( RexUnknownAs.TRUE, ImmutableRangeSet.of(Range.lessThan(BigDecimal.ZERO))), tInt())), trueLiteral, le(trueLiteral, isDistinctFrom(literal(0), literal(0), RexUnknownAs.TRUE, is("no idea what is expected")); {code} Error: {noformat} result mismatch (unknown as TRUE): when applied to {?0.int1=NULL}, IS NOT TRUE(CASE(SEARCH(?0.int1, Sarg[(-∞..0); NULL AS TRUE]), true, <=(true, IS DISTINCT FROM(0, 0 yielded false; AND(IS NOT DISTINCT FROM(0, 0), >=(?0.int1, 0)) yielded true java.lang.AssertionError: result mismatch (unknown as TRUE): when applied to {?0.int1=NULL}, IS NOT TRUE(CASE(SEARCH(?0.int1, Sarg[(-∞..0); NULL AS TRUE]), true, <=(true, IS DISTINCT FROM(0, 0 yielded false; AND(IS NOT DISTINCT FROM(0, 0), >=(?0.int1, 0)) yielded true at org.apache.calcite.rex.RexSimplify.verify(RexSimplify.java:2098) at org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:250) at org.apache.calcite.rex.RexProgramTestBase.checkSimplifyAs(RexProgramTestBase.java:152){noformat} The relevant fuzzer-driven case is {code:java} @Test void singleFuzzyTest() { Random r = new Random(); r.setSeed(6192825841324574146L); RexFuzzer fuzzer = new RexFuzzer(rexBuilder, typeFactory); generateRexAndCheckTrueFalse(fuzzer, r); }{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [ANNOUNCE] New committer: Vladimir Ozerov
Vladimir, welcome! >consistent signal from the "field" - lack of documentation blocks I wonder if logging "missing sections" would help. The catch here is if you work with Calcite for some time the missing docs become a blind spot. For instance, once I heard that if one wants to add a test case, then it is not clear which class/approach is appropriate. If you think about it, we do have multiple ways to create tests at different levels, however, we do not suggest entry points or something like that. It might probably help if there was a layered diagram with pointers to the test class samples. Vladimir
Re: why do We use inner class for CallCopyingArgHandler within SqlShuttle rather static nested class as ArgHandlerImpl within SqlBasicVisitor.
Julian>I’m not a fan of dependency injection. But it requires mutable fields. On contrary, DI does not require mutable fields, and field injection is considered an anti-pattern in production code. For example, see https://github.com/google/guice/wiki/Injections#constructor-injection I do not see how DI could help with SqlShuttle though. Vladimir
[jira] [Created] (CALCITE-4659) Introduce AssertJ and gradually migrate to its usage instead of Hamcrest
Vladimir Sitnikov created CALCITE-4659: -- Summary: Introduce AssertJ and gradually migrate to its usage instead of Hamcrest Key: CALCITE-4659 URL: https://issues.apache.org/jira/browse/CALCITE-4659 Project: Calcite Issue Type: Improvement Reporter: Vladimir Sitnikov AssertJ has already been discussed in CALCITE-2457, yet it was forgotten. Note: I do not suggest replacing all the assertions with AssertJ, especially "compare with golden output" ones. The current issues with {{assertThat}}-based tests: 1) {{asertTrue}}, {{assertEquals(Object, Object)}} are prone to mistakes so one can get "expected vs actual wrong". 2) Hamcrest matches are hard to discover. In other words, one cant use autocomplete menu in IDE to see which other matches are available. 3) Hamcrest matches are hard to extend/implement. For instance, we have very limited amount of Hamcrest-based matches, and we do have a lot of Calcite-specific assertions. AssertJ solves the issue of API discoverability (one writes {{assertThat(...).containsExactly("skippedTest")}}), and AssertJ has a well-known path for extensibility. We can implement Calcite-specific assertions with AssertJ style, so the tests would be more consistent. Refaster templates (see CALCITE-4658) might help to perform automatic conversion from JUnit-style assertions to AssertJ. There are templates to cleanup AssertJ assertions: https://github.com/palantir/assertj-automation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-4658) Introduce refaster for automatic reformat of the code
Vladimir Sitnikov created CALCITE-4658: -- Summary: Introduce refaster for automatic reformat of the code Key: CALCITE-4658 URL: https://issues.apache.org/jira/browse/CALCITE-4658 Project: Calcite Issue Type: Improvement Reporter: Vladimir Sitnikov Sometimes we have API usage rules which require manual actions to correct. For instance, we "forbid" the use of {{Lists#newArrayList()}}, and suggest replacing it with {{new ArrayList()}}. It would be better if the build script could perform such replacements automatically. Refaster is a smart refactoring library for Java: https://errorprone.info/docs/refaster, and it could help us to keep the code up to date. Of course, forbidden-apis checks might still be there (e.g. for those who prefer manual replacements), however, adding automatic code cleanups would be great. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-4657) Replace cancel-workflow-runs with GitHub in-core cancel-in-progress feature
Vladimir Sitnikov created CALCITE-4657: -- Summary: Replace cancel-workflow-runs with GitHub in-core cancel-in-progress feature Key: CALCITE-4657 URL: https://issues.apache.org/jira/browse/CALCITE-4657 Project: Calcite Issue Type: Improvement Reporter: Vladimir Sitnikov See https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#concurrency https://github.com/potiuk/cancel-workflow-runs is deprecated -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [HELP] SqlParserImpl#jj_scan_token hangs?
Ruben, The next time the case reproduces, please collect the core dump. At least you would be able to analyze it later. It might be a VM issue (e.g. failing compilation), or it might be a concurrency issue (e.g. data race in the parser code), or something completely different. You might find [1] fun to read and you might like to follow step-by-step should you have a JVM stuck condition again :) PS. core dump might contain private keys, passwords, so be careful if sharing the dump with someone. PPS. It might be great to pull [2] somehow, however, the status of javacc maintenance is, well, very very sad. They barely maintain javacc, and PRs do not seem to be welcome, so having javacc-related issues is sad :( [1] https://medium.com/@vladimirsitniko/analyzing-a-stuck-hotspot-c2-compilation-85e0ca230744 [2] https://github.com/javacc/javacc/pull/153 Vladimir
Re: [VOTE] Release apache-calcite-1.27.0 (release candidate 0)
Stamatis, thanks for preparing the release. My vote is [x] -1 Do not release this package because... 1) apache-calcite-1.27.0-src.tar.gz contains compiled code, and it is forbidden to release compiled code in source packages. site/js/html5shiv.min.js site/js/respond.min.js 2) apache-calcite-1.27.0-src.tar.gz contains binary files, which should not appear in source releases: site/favicon.ico site/img/cake.jpg, feather.png, and other binary blobs (9 total) site/_sass/_style.scss -- contains obscure blobs like PD94bWwgdmVyc2lvbj0iMS4wIiA/Pgo8c3ZnIHhtbG5 3) /LISENCE file in git mentions <>, however, there's no licenses file in git. It causes confusion for those who look at git sources, so they don't understand if those are optional or mandatory features or whatever. One of the options to resolve 1&2 is to remove .min.js, .ico, .png, and .scss files from the release completely. Vladimir
Re: Exposing multiple values of a trait from the operator
>VolcanoPlanner flattens the composite trait into the default trait value in RelSet.add -> RelTraitSet.simplify Vladimir, have you tried removing that RelTraitSet.simplify? I remember I have run into that multiple times already, and I suggested removing that "simplify". For example, https://issues.apache.org/jira/browse/CALCITE-2593?focusedCommentId=16750377=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16750377 Vladimir
Re: [RESULT][VOTE] Release apache-calcite-avatica-1.18.0 (release candidate 1)
Here's a clarification from Roy: https://issues.apache.org/jira/browse/LEGAL-570?focusedCommentId=17349841=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17349841 >We do not distribute source packages that contain compiled code for execution by >the recipient *unless that compilation is a human-readable script* That pretty much forbids releasing minified files as a part of the source release. Vladimir вт, 18 мая 2021 г. в 14:14, Francis Chuang : > Thanks to everyone who has tested the release candidate and given their > comments and votes. > > The tally is as follows. > > 3 binding +1s: > Francis Chuang > Julian Hyde > Stamatis Zampetakis > > 1 binding -1: > Vladimir Sitnikov > > 1 non-binding +1s: > Alessandro Solimando > > Therefore, I am delighted to announce that the proposal to release > Apache Calcite Avatica 1.18.0 has passed. > > Thanks everyone. We’ll now roll the release out to the mirrors. > > Francis >
[jira] [Created] (CALCITE-4618) Refine dependency declarations to better account for type annotations
Vladimir Sitnikov created CALCITE-4618: -- Summary: Refine dependency declarations to better account for type annotations Key: CALCITE-4618 URL: https://issues.apache.org/jira/browse/CALCITE-4618 Project: Calcite Issue Type: Improvement Affects Versions: 1.26.0 Reporter: Vladimir Sitnikov -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-4612) Fuzzer for RelBuilder
Vladimir Sitnikov created CALCITE-4612: -- Summary: Fuzzer for RelBuilder Key: CALCITE-4612 URL: https://issues.apache.org/jira/browse/CALCITE-4612 Project: Calcite Issue Type: Test Components: core Reporter: Vladimir Sitnikov A generator of random rels would be useful for testing: a) Generate rel, write it down as SQL (ensures "Rel to SQL" does not fail), parse it back (ensures parser works OK), plans the tree (ensures the rules do not throw exceptions) b) Generate rel, serialize with one of the adapters (e.g. serialize to ElasticSearch). c) Generate rel, and execute it with https://github.com/sqlancer/sqlancer (or alike) approach to verify the results of the query. It might surface **correctness** bugs in optimizer, not just NPEs -- This message was sent by Atlassian Jira (v8.3.4#803005)