Re: Gradle prepareVote task does not run unit tests

2020-03-29 Thread Vladimir Sitnikov
Feel free to use ./gradlew test prepareVote I see no reason why preparing a release candidate should execute tests. PS. Do you miss those days when maven-release-plugins executed the tests three times? :) Vladimir

Re: planning error

2020-03-12 Thread Vladimir Sitnikov
Hi, The problem is ProjectLimeRel should have **lime** rel as input, however now it has LogicalFilter as an input. The issue is caused by the following line: https://github.com/tglanz/limestone/blob/master/core/src/main/java/org/tglanz/limestone/rules/LogicalProjectConverterRule.java#L29 You

Re: [Feedback][Release] Some feedback when releasing v1.22.0

2020-03-06 Thread Vladimir Sitnikov
>The RexNode normalization and project names remove from digest did change a lot of plan from the Apache Flink side Hey Danny, I see it is dissatisfying, however, it is really sad you have never revealed which plans required changes and why. >The java doc doesn’t distinguish main API and test

Re: [VOTE] Release apache-calcite-1.22.0 (release candidate 3)

2020-03-03 Thread Vladimir Sitnikov
Stamatis>It is worth mentioning that the build passed with the 3rd attempt. In the Stamatis>first attempts the build was stuck while performing style checking. Do you have threaddumps? Have you filed JIRA for the locking? Vladimir

Re: [VOTE] Release apache-calcite-1.22.0 (release candidate 3)

2020-03-02 Thread Vladimir Sitnikov
Danny>I have created a build for Apache Calcite 1.22.0, release Danny>candidate 3. Great. Checksums, pgp match. tests pass (modulo known issues like OsAdapterTest), mat-calcite-tests pass So +1 (binding) There's an issue that -Prc=.. parameter is always required for building a release version.

Re: [VOTE] Release apache-calcite-1.22.0 (release candidate 3)

2020-03-02 Thread Vladimir Sitnikov
Julian>Can we keep it consistent please? It's good to find bugs like this, but it's depressing to only be finding them in RC3. Frankly speaking, I find it too much repetition to have calcite- in a version name. We do not have Avatica in the repository, so why should we have long versions like

Re: svn commit: r38276 - /dev/calcite/apache-calcite-1.23.0-rc1/

2020-03-02 Thread Vladimir Sitnikov
>./gradlew removeStaleArtifacts -Pasf >How can i use this cmd to remove only one rc files ? It is configured in staleRemovalFilters block in https://github.com/apache/calcite/blob/a549342294062eba9aa3196e7e6d4bda36fa8291/build.gradle.kts#L123 However, currently, there's no way to override filters

Re: [VOTE] Release apache-calcite-1.22.0 (release candidate 2)

2020-03-01 Thread Vladimir Sitnikov
Danny>Actually I do not know how to config the authentication for the eclipse plugin that the Gradle task uses, is there any doc/instructions that I can reference ? I have never tried (I always release via command line like ./gradlew prepareVote -Prc=2 -Pasf <-- it is just enough), however, the

Re: [VOTE] Release apache-calcite-1.22.0 (release candidate 2)

2020-02-29 Thread Vladimir Sitnikov
>1. I did checked every tar/zip checksums before/after the release, would write id down in the mail next time ~ Ok, it looks like I did not express it properly. Danny>The artifacts to be voted on are located here: Danny> https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.22.0-rc2/

Re: [VOTE] Release apache-calcite-1.22.0 (release candidate 2)

2020-02-29 Thread Vladimir Sitnikov
Danny, thanks for putting things together, however, I guess the vote mail requires clarifications before the votes can be cast :-/ Danny>The hashes of the artifacts are as follows: dist.apache.org contains two archives, however, the vote mail lists just one of them. We had the very same case

Re: Problems on RC with HerdDB: was: [VOTE] Release apache-calcite-1.22.0 (release candidate 0)

2020-02-25 Thread Vladimir Sitnikov
>Does any ring bell ? Is it related to [CALCITE-3672] Support implicit type coercion for insert and update ? https://issues.apache.org/jira/browse/CALCITE-3672 https://github.com/apache/calcite/pull/1720 >I am now trying to install IntelliJ, but it won't be so immediate AFAIK it should be

Re: [VOTE] Release apache-calcite-1.22.0 (release candidate 1)

2020-02-25 Thread Vladimir Sitnikov
I have already surfaced the case in 1.20.0 release: https://lists.apache.org/thread.html/33694a2e754ff63e49e5fd05d52be1f72773c15f4a66adf766223b86%40%3Cdev.calcite.apache.org%3E Technically speaking, Calcite release artifacts violate ASF licensing policy. Then it is up to the release manager to

Re: [DISCUSS] Commit messages, again

2020-02-21 Thread Vladimir Sitnikov
>With the size and complexity of Calcite project, ... nobody can really understand anything, so there's no point of waiting for +1 :) Julian, Xiening you both are right. I would add a small note that "easy to understand" titles significantly help for reviewers. Here's a recent case:

Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest

2020-02-17 Thread Vladimir Sitnikov
> And state that it is going to be removed before say 1.24 (about 6 months from now). I'm ok with keeping the property working as long as it does not impact features/bugfixes development provided the property does not make the code hard to read. Vladimir

Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest

2020-02-17 Thread Vladimir Sitnikov
>adding an option I'm ok with adding an option provided: 1) It is "remove column names" by default 2) It is supported on a best effort basis. In other words, it is NOT to be used in production systems on a day-to-day basis 3) We do not add an extensive test suite for that property (see #1, #2) 4)

Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest

2020-02-16 Thread Vladimir Sitnikov
>it is hard for me to figure out the reason of the change It is written in the JIRA description and in the commit message: it improves planning performance by reducing the planning space. For Calcite it reduces slow tests from 64 min to 40 min. Before (3862sec):

A case for having pom.xml files: GitHub dependency graph

2020-02-16 Thread Vladimir Sitnikov
Hi, GitHub has a dependency graph feature (see [1]) which can show dependency information right at the GitHub page. However, the only way they support Java dependencies is via pom.xml files (see "Supported package ecosystems" in [1]). Sample output:

Re: LatticeTest#testLatticeStarTable() fails in CI after CALCITE-3774

2020-02-14 Thread Vladimir Sitnikov
Have you seen https://github.com/apache/calcite/pull/1806 ? It might be relevant Vladimir

Re: [ DISCUSS ] Revert change: CALCITE-3713 Remove column names from Project#digest

2020-02-13 Thread Vladimir Sitnikov
The subject of the mail does not match the body. Please double check and ensure you mention a single change. PS if you use explain_digest attributes, it means you expect it might change due to implementation details. Vladimir

[jira] [Created] (CALCITE-3786) Add Digest (HashStrategy?) interface to enable efficient hashCode/equals for RexNode, RelNode

2020-02-12 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3786: -- Summary: Add Digest (HashStrategy?) interface to enable efficient hashCode/equals for RexNode, RelNode Key: CALCITE-3786 URL: https://issues.apache.org/jira/browse

Re: Is there any way to view the physical SQLs executed by Calcite JDBC?

2020-02-11 Thread Vladimir Sitnikov
>get the SQLs without set the breakpoint +1 Explain should include relevant downstream SQL. Yang, can you please file a JIRA ticket? Vladimir

Test results and stacktraces in Gradle output

2020-02-10 Thread Vladimir Sitnikov
Hi, I've recently committed a change that improves test results and stacktraces in the build results. You can find more information here in [1] In a nutshell, it adds coloring to the output, it makes stacktraces shorter, and it adds GitHub Actions error markers, so you don't even need to scroll

Re: Failed to run example for JDBC

2020-02-08 Thread Vladimir Sitnikov
>I uploaded the attachments to this link: >https://drive.google.com/open?id=1fH-qlXS59BYCj9JiYieTHVDk3GnioxX- Oh. It means you miss dependencies (jar files) required to compile and execute the code. I suggest you import Calcite source code (e.g. as explained here:

Re: [DISCUSS] Towards Calcite 1.22

2020-02-08 Thread Vladimir Sitnikov
>Are current -SNAPSHOT packages on repository.apache.org up to date ? The snapshots were not up to date because Calcite-Snapshots Jenkins job was using beam Jenkins node somehow. I guess that was caused by misconfiguration of beam nodes. I've triggered the job manually, and it works now:

Re: Gradle documentation update

2020-02-06 Thread Vladimir Sitnikov
>If you notice anything weird let me know. Thank you. --- I recently found that ./sqlline is non-trivial, especially when a non-expert is using it. What do you think if we make ./sqlline print sample commands right after it starts? We have pre-defined data sets. What if ./sqlline printed a

[jira] [Created] (CALCITE-3767) AssertionError when SqlBinaryStringLiteral appears in tablesample substitute

2020-02-04 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3767: -- Summary: AssertionError when SqlBinaryStringLiteral appears in tablesample substitute Key: CALCITE-3767 URL: https://issues.apache.org/jira/browse/CALCITE-3767

Re: Calcite-Master - Build # 1588 - Failure

2020-02-01 Thread Vladimir Sitnikov
Alessandro> What do you usually do in such cases (transient failures)? There are multiple ways (in no particular order): A) Ignore the failure (as it is not related to your changes) B) Ignore the failure and add a comment on the PR/JIRA like "CI fails with timeout in

Re: Calcite-Master - Build # 1588 - Failure

2020-02-01 Thread Vladimir Sitnikov
>My PR is this one: https://github.com/apache/calcite/pull/1774 Ok. I see. The error in your PR is exactly the same. I guess it was caused by https://github.com/apache/calcite/commit/bcac62e3dad6137511d4451135daa2d1762ec6ad#diff-c25007f5834d66d28e0b6644edf21325L55-R69 In other words, `master`

Re: Calcite-Master - Build # 1588 - Failure

2020-02-01 Thread Vladimir Sitnikov
Hi, Calcite never uses Jenkins for PR validation. >I had a CI failure on my PR (on code unrelated to my PR which concerns only CassandraAdapter), Can you provide the link to your PR? >Status: Failure >Check console output at https://builds.apache.org/job/Calcite-Master/1588/ to view the

Re: Autostyle errors on Windows

2020-01-30 Thread Vladimir Sitnikov
>I made another commit so that .kt and .kts files are treated as text Frankly speaking, we have * text=auto, which means git treats a file as text if its first 8000 bytes do not contain a 00 byte. In other words, kt files were already treated as text files. >gradlew, sqlline, sqlsh That
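
The heuristic described in this message can be sketched roughly as follows. This is only an illustration of the rule as stated above (no 00 byte in the first 8000 bytes), not Git's actual implementation; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class TextAutoSketch {
  // Number of leading bytes inspected, as stated in the message above.
  static final int PROBE_BYTES = 8000;

  /** Returns true when no 0x00 byte occurs within the first PROBE_BYTES bytes. */
  static boolean looksLikeText(byte[] content) {
    int limit = Math.min(content.length, PROBE_BYTES);
    for (int i = 0; i < limit; i++) {
      if (content[i] == 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    byte[] kotlinScript = "plugins { java }\n".getBytes(StandardCharsets.UTF_8);
    byte[] withNulByte = {0x50, 0x4b, 0x00, 0x04};
    System.out.println(looksLikeText(kotlinScript));
    System.out.println(looksLikeText(withNulByte));
  }
}
```

Under this rule a .kt or .kts source file would already count as text, which is the point made in the message.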

Re: Autostyle errors on Windows

2020-01-29 Thread Vladimir Sitnikov
>yet autostyle is complaining What is the exact error message? >think it is a bug in our autostyle setup The idea behind that is it should discover most settings from Git configuration (e.g. git global config options), .gitattributes, .editorconfig, and so on. > I've tried things like "gradlew

Re: Autostyle errors on Windows

2020-01-28 Thread Vladimir Sitnikov
I guess it expects EOL to match Git configuration. >Autostyle seems to expect \n at the ends of lines and get \r\n, or vice versa I guess it should show what is in the file, and what it expects. Vladimir

Re: Calcite-Master - Build # 1577 - Failure

2020-01-20 Thread Vladimir Sitnikov
>The Apache Jenkins build system has built Calcite-Master (build #1577) >Status: Failure Filed it as https://issues.apache.org/jira/browse/CALCITE-3750 PigRelBuilderStyleTest fails with ConcurrentModificationException Vladimir

[jira] [Created] (CALCITE-3750) PigRelBuilderStyleTest fails with ConcurrentModificationException

2020-01-20 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3750: -- Summary: PigRelBuilderStyleTest fails with ConcurrentModificationException Key: CALCITE-3750 URL: https://issues.apache.org/jira/browse/CALCITE-3750

[jira] [Created] (CALCITE-3742) Update Gradle: 6.0.1 -> 6.1

2020-01-16 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3742: -- Summary: Update Gradle: 6.0.1 -> 6.1 Key: CALCITE-3742 URL: https://issues.apache.org/jira/browse/CALCITE-3742 Project: Calcite Issue T

Re: [DISCUSS] Randomize VolcanoRule execution order for better test coverage

2020-01-11 Thread Vladimir Sitnikov
Michael>I'm not sure I see how the property Michael>for passing tests with different plans would work E.g. @Tag("skipWhenRandomizedRules") and CalciteAssert#explainContains could become a no-op when a special property is passed. Of course, the randomization should be for test purposes, not for

[jira] [Created] (CALCITE-3725) RelMetadataTest fails with NPE due to unsafe RelMetadataQuery.instance call

2020-01-11 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3725: -- Summary: RelMetadataTest fails with NPE due to unsafe RelMetadataQuery.instance call Key: CALCITE-3725 URL: https://issues.apache.org/jira/browse/CALCITE-3725

[DISCUSS] Randomize VolcanoRule execution order for better test coverage

2020-01-11 Thread Vladimir Sitnikov
Hi, I've run into a RuleMatchImportanceComparator issue (see https://issues.apache.org/jira/browse/CALCITE-2356 ). As a fun experiment, I've replaced the comparator with Random#nextBoolean(), and it identified a bug:

[jira] [Created] (CALCITE-3722) Add Hook#PLAN_BEFORE_IMPLEMENTATION to capture the plan after optimization

2020-01-10 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3722: -- Summary: Add Hook#PLAN_BEFORE_IMPLEMENTATION to capture the plan after optimization Key: CALCITE-3722 URL: https://issues.apache.org/jira/browse/CALCITE-3722

Re: Calcite-Master - Build # 1554 - Failure

2020-01-09 Thread Vladimir Sitnikov
>Is it related with Spark version upgrade ? No idea. >OOM: unable to create new native thread It means there were too many processes launched in the operating system. It could be either Calcite test suite spawning lots of threads or it could be related to background activity (e.g. other jobs

Re: Calcite-Master - Build # 1554 - Failure

2020-01-09 Thread Vladimir Sitnikov
In case you wondered, the exception there is OOM in JVM 1.8: java.lang.OutOfMemoryError: unable to create new native thread at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:717) at

Re: [DISCUSS] CALCITE-3656, 3657, 1842: cost improvements, cost units

2020-01-09 Thread Vladimir Sitnikov
Michael>If we want to calibrate A part of the question is "What should Aggregate#computeSelfCost return?" A) There's an option to make that method abstract so every sub-class defines its own cost implementation. It might be sad, and it might look like a NLogN duplication all over the place. B)

Re: [DISCUSS] propagateCostImprovements vs incremental bestCost maintenance vs metadata

2020-01-08 Thread Vladimir Sitnikov
>In theory, the cardinality and uniqueness of a RelSubset should never changed per definition of equivalent set I agree. It is like in theory there is no difference between theory and practice :) What if we have select empid from (select empid from emps where empid>0) where empid<0 ? The

[DISCUSS] propagateCostImprovements vs incremental bestCost maintenance vs metadata

2020-01-08 Thread Vladimir Sitnikov
Hi, As far as I understand, the incremental best/bestCost maintenance at RelSubset level does not really work. That issue is triggered a lot in MaterializationTests due to https://issues.apache.org/jira/browse/CALCITE-3682 (MaterializationService#defineMaterialization loses information on unique

Re: [DISCUSS] Avatica 1.16.0 dockerfiles broken. Release 1.17.0?

2020-01-07 Thread Vladimir Sitnikov
AFAIK it is up to the release manager. Vladimir

Re: Gradle documentation update

2020-01-07 Thread Vladimir Sitnikov
>Can you cherry pick the appropriate commits for the migration >into the site branch? What do you mean by that? Can we just revert the invalid commits and be done with it? I already suggested that the site update should be simplified, and now we have clear evidence that the current process of

Re: Gradle documentation update

2020-01-07 Thread Vladimir Sitnikov
Francis>There is Francis> https://github.com/apache/calcite-site/commit/81960613e7750a9191280719352ae941a7d6a22d , Francis>but there doesn't appear to be any changes to the build instructions for Francis>gradle. Here you go:

Re: [DISCUSS] RelOptTableImpl#toRel vs EnumerableTableScan

2020-01-07 Thread Vladimir Sitnikov
So far I have PR https://github.com/apache/calcite/pull/1721 that adds verification to EnumerableTableScan constructor. Unfortunately, we have a lot of test cases that assume the relation would be EnumerableTableScan even in the case test verifies logical plan only :( That is why I added extra

[DISCUSS] getCumulativeCost vs Planner#getCost(rel, mq)

2020-01-07 Thread Vladimir Sitnikov
Hi, I've found there are two ways to compute the cumulative cost, and they happen to be quite different. VolcanoPlanner#getCost seems to be the right one (see [1]) It properly accounts for RelSubset and uses its bestCost On the other hand, mq.getCumulativeCost(rel) is quite poor (see [2]). It

Re: Gradle documentation update

2020-01-07 Thread Vladimir Sitnikov
It looks like Francis recently pushed "Update website for Avatica 1.16" which looks like 1,674 changed files with 167,295 additions and 628,539 deletions. https://github.com/apache/calcite-site/commit/3d92029cc7718e2b66d888dc9021047879145465 Francis, can you please double-check? Vladimir

Re: Gradle documentation update

2020-01-07 Thread Vladimir Sitnikov
Can you clarify an url which has stale information? Vladimir

[jira] [Created] (CALCITE-3713) Remove column names from Project#digest

2020-01-06 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3713: -- Summary: Remove column names from Project#digest Key: CALCITE-3713 URL: https://issues.apache.org/jira/browse/CALCITE-3713 Project: Calcite

[jira] [Created] (CALCITE-3712) Optimize lossless casts in RexSimplify: CAST(CAST(intExpr as BIGINT) as INT) => intExpr

2020-01-06 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3712: -- Summary: Optimize lossless casts in RexSimplify: CAST(CAST(intExpr as BIGINT) as INT) => intExpr Key: CALCITE-3712 URL: https://issues.apache.org/jira/browse/CALC
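
Why the nested cast in this summary is lossless can be checked with plain Java arithmetic. The sketch below is an analogy using Java's int and long, not Calcite's RexSimplify code; the class and method names are hypothetical.

```java
final class LosslessCastSketch {
  /** Widens an int to long and narrows it back, mirroring the nested casts. */
  static int roundTrip(int value) {
    long widened = value;   // corresponds to CAST(intExpr AS BIGINT)
    return (int) widened;   // corresponds to CAST(... AS INT)
  }

  public static void main(String[] args) {
    int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 42, Integer.MAX_VALUE};
    for (int sample : samples) {
      if (roundTrip(sample) != sample) {
        throw new AssertionError("round trip changed " + sample);
      }
    }
    System.out.println("all samples survived the round trip");
  }
}
```

Every int value fits in a long, so the widening step loses nothing and the narrowing step restores the original value.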

Re: CALCITE-2223: testJoinMaterialization5 vs infinite match of ProjectMergeRule

2020-01-06 Thread Vladimir Sitnikov
Oh, there's org.apache.calcite.rex.RexUtil#isLosslessCast which is used in MaterializedView*Rule, so I'll try adding that to RexSimplify Vladimir

CALCITE-2223: testJoinMaterialization5 vs infinite match of ProjectMergeRule

2020-01-06 Thread Vladimir Sitnikov
Hi, I'm analyzing testJoinMaterialization5 which times out after my recent Calcite fixes. I'm inclined to think that the bug is with MaterializedViewJoinRule(Filter). The case is as follows: materialized view: select cast("empid" as BIGINT) from "emps" join "depts" using ("deptno") input SQL: select

Re: [DISCUSS] Revert [CALCITE-1842] Sort.computeSelfCost() calls makeCost() with arguments in wrong order

2020-01-05 Thread Vladimir Sitnikov
Stamatis>HepPlanner does not perform cost-based decisions so for most use-cases I It seems to have some cost-based decisions, so providing ill-defined cost seems wrong. Well, let's see how new VolcanoCost would work, and then we could discuss Hep. Stamatis>Although it is nice to account for CPU

[jira] [Created] (CALCITE-3710) MaterializationService#defineMaterialization should inherit connection properties

2020-01-05 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3710: -- Summary: MaterializationService#defineMaterialization should inherit connection properties Key: CALCITE-3710 URL: https://issues.apache.org/jira/browse/CALCITE-3710

[jira] [Created] (CALCITE-3709) Use "rejected row count" for RelOptCost#getRows

2020-01-05 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3709: -- Summary: Use "rejected row count" for RelOptCost#getRows Key: CALCITE-3709 URL: https://issues.apache.org/jira/browse/CALCITE-3709 Projec

Re: [DISCUSS] Revert [CALCITE-1842] Sort.computeSelfCost() calls makeCost() with arguments in wrong order

2020-01-05 Thread Vladimir Sitnikov
In the meantime, I've created a PR that updates VolcanoCost: https://github.com/apache/calcite/pull/1722 Vladimir

[jira] [Created] (CALCITE-3708) Make VolcanoCost account cpu and io properties when comparing costs

2020-01-05 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3708: -- Summary: Make VolcanoCost account cpu and io properties when comparing costs Key: CALCITE-3708 URL: https://issues.apache.org/jira/browse/CALCITE-3708

Re: [DISCUSS] Revert [CALCITE-1842] Sort.computeSelfCost() calls makeCost() with arguments in wrong order

2020-01-05 Thread Vladimir Sitnikov
>This of course requires the VolcanoCost to be adapted. What do you think of HepPlanner? It uses RelOptCostImpl.FACTORY by default which explicitly ignores CPU and IO cost factors :(( Regarding cost#rows, there's a problem: cost#rows field does not add well when computing cumulative cost. What

Re: [DISCUSS] Revert [CALCITE-1842] Sort.computeSelfCost() calls makeCost() with arguments in wrong order

2020-01-04 Thread Vladimir Sitnikov
>I think we should try to make our cost estimations more realistic in terms of cpu and io and don't try to put everything in rows as it is the case for various operators. This of course requires the VolcanoCost to be adapted. Well. The more I revise costs, the more I incline to that opinion as

Re: [DISCUSS] CALCITE-3656, 3657, 1842: cost improvements, cost units

2020-01-04 Thread Vladimir Sitnikov
Technically speaking, single-block read time for HDDs is pretty much stable, so the use of seconds might not be that bad. However, with seconds it might be complicated to measure CPU-like activity (e.g. different machines might execute EnumerableJoin at a different rate :( ) What if we benchmark a

[DISCUSS] MaterializationTest#testAggregateMaterializationOnCountDistinctQuery1 is very fragile

2020-01-04 Thread Vladimir Sitnikov
Hi, It looks like testAggregateMaterializationOnCountDistinctQuery1 is invalid. The test creates materialization for select deptno, empid, salary from emps group by deptno, empid, salary Then it issues the SQL: select deptno, count(distinct empid) as c from ( select deptno, empid from emps

[jira] [Created] (CALCITE-3682) MaterializationService#defineMaterialization loses information on unique keys

2020-01-04 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3682: -- Summary: MaterializationService#defineMaterialization loses information on unique keys Key: CALCITE-3682 URL: https://issues.apache.org/jira/browse/CALCITE-3682

Re: [DISCUSS] CALCITE-3656, 3657, 1842: cost improvements, cost units

2020-01-04 Thread Vladimir Sitnikov
Michael>although I would be hesitant to refer to "seconds" Do you have better ideas? If my memory serves me well, PostgreSQL uses seconds as well for cost units. OracleDB is using "singleblock read" for the cost unit. Michael>how long execution will take on any particular system The idea for

Re: [DISCUSS] CALCITE-3661, CALCITE-3665, MaterializationTest vs HR schema statistics

2020-01-04 Thread Vladimir Sitnikov
Jin>In ReflectiveSchema, Statistics of FieldTable is given as UNKNOWN[1][2]. Please check[CALCITE-3661] Derive rowCount statistics for tables in ReflectiveSchema that are based on arrays/collections and [CALCITE-3680] Add ability to express unique constraints in ReflectiveSchema commits in

[jira] [Created] (CALCITE-3681) Refine RelMdColumnUniqueness and RelMdRowCount for Aggregate

2020-01-04 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3681: -- Summary: Refine RelMdColumnUniqueness and RelMdRowCount for Aggregate Key: CALCITE-3681 URL: https://issues.apache.org/jira/browse/CALCITE-3681 Project

[jira] [Created] (CALCITE-3680) Add ability to express unique constraints in ReflectiveSchema

2020-01-04 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3680: -- Summary: Add ability to express unique constraints in ReflectiveSchema Key: CALCITE-3680 URL: https://issues.apache.org/jira/browse/CALCITE-3680 Project

[DISCUSS] CALCITE-3656, 3657, 1842: cost improvements, cost units

2020-01-04 Thread Vladimir Sitnikov
Hi, I've spent some time on stabilizing the costs (see https://github.com/apache/calcite/pull/1702/commits ), and it looks like we might want to have some notion of "cost unit". For instance, we want to express that sorting table with 2 int columns is cheaper than sorting table with 22 int

[jira] [Created] (CALCITE-3677) Add assertion to EnumerableTableScan constructor to validate if the table is suitable for enumerable scan

2020-01-03 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3677: -- Summary: Add assertion to EnumerableTableScan constructor to validate if the table is suitable for enumerable scan Key: CALCITE-3677 URL: https://issues.apache.org

[DISCUSS] CALCITE-3661, CALCITE-3665, MaterializationTest vs HR schema statistics

2020-01-03 Thread Vladimir Sitnikov
Hi, It looks like MaterializationTest heavily relies on inaccurate statistics for hr.emps and hr.depts tables. I was trying to improve statistic estimation for better join planning (see https://github.com/apache/calcite/pull/1712 ), and it looks like better estimates open the eyes of the

[DISCUSS] Stream tables vs hash joins

2020-01-03 Thread Vladimir Sitnikov
Hi, Stream tables do not play very well for hash joins. In other words, if hash join would try to build a lookup table out of a stream, it could just run out of memory. Is there metadata or something like that to identify stream-like inputs so hash join would ensure it does not try to build a

[jira] [Created] (CALCITE-3674) EnumerableMergeJoinRule fails with NPE on nullable join keys

2020-01-03 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3674: -- Summary: EnumerableMergeJoinRule fails with NPE on nullable join keys Key: CALCITE-3674 URL: https://issues.apache.org/jira/browse/CALCITE-3674 Project

[jira] [Created] (CALCITE-3670) Add ability to release resources when connection is closed

2020-01-03 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3670: -- Summary: Add ability to release resources when connection is closed Key: CALCITE-3670 URL: https://issues.apache.org/jira/browse/CALCITE-3670 Project

[jira] [Created] (CALCITE-3666) Refine RelMdColumnUniqueness for Calc

2020-01-02 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3666: -- Summary: Refine RelMdColumnUniqueness for Calc Key: CALCITE-3666 URL: https://issues.apache.org/jira/browse/CALCITE-3666 Project: Calcite Issue

[jira] [Created] (CALCITE-3665) Better estimate join row count when one of the sides is known to be unique

2020-01-02 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3665: -- Summary: Better estimate join row count when one of the sides is known to be unique Key: CALCITE-3665 URL: https://issues.apache.org/jira/browse/CALCITE-3665

[jira] [Created] (CALCITE-3661) Derive rowCount statistics for tables in ReflectiveSchema that are based on arrays/collections

2020-01-01 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3661: -- Summary: Derive rowCount statistics for tables in ReflectiveSchema that are based on arrays/collections Key: CALCITE-3661 URL: https://issues.apache.org/jira/browse

[jira] [Created] (CALCITE-3660) PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t

2020-01-01 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3660: -- Summary: PigRelBuilderStyleTest#testImplWithJoin fails with FrontendException: ERROR 1066: Unable to open iterator for alias t Key: CALCITE-3660 URL: https

[jira] [Created] (CALCITE-3659) Optimize EnumerableMergeJoin: avoid creating Linq4j.product for each matching group

2020-01-01 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3659: -- Summary: Optimize EnumerableMergeJoin: avoid creating Linq4j.product for each matching group Key: CALCITE-3659 URL: https://issues.apache.org/jira/browse/CALCITE-3659

Re: Re: [DISCUSS] CALCITE-2450 reorder predicates to a canonical form

2019-12-31 Thread Vladimir Sitnikov
Stamatis>This is a change that most likely will have impact on many projects I don't see how it will impact projects. Really. Are there projects that use up to date Calcite versions? Are they ready for adding a CI job to test with Calcite master branch? It is very disappointing to hear that it

[jira] [Created] (CALCITE-3657) EnumerableHashJoin should not use NLogN for costing

2019-12-31 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3657: -- Summary: EnumerableHashJoin should not use NLogN for costing Key: CALCITE-3657 URL: https://issues.apache.org/jira/browse/CALCITE-3657 Project: Calcite

[jira] [Created] (CALCITE-3656) EnumerableNestedLoopJoin cost should account for cost of inner restarts

2019-12-31 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3656: -- Summary: EnumerableNestedLoopJoin cost should account for cost of inner restarts Key: CALCITE-3656 URL: https://issues.apache.org/jira/browse/CALCITE-3656

[jira] [Created] (CALCITE-3655) SortJoinTransposeRule must not push sort into Project that contains OVER expressions

2019-12-31 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3655: -- Summary: SortJoinTransposeRule must not push sort into Project that contains OVER expressions Key: CALCITE-3655 URL: https://issues.apache.org/jira/browse/CALCITE

[jira] [Created] (CALCITE-3654) Elasticsearch tests produce noise in the test output

2019-12-31 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3654: -- Summary: Elasticsearch tests produce noise in the test output Key: CALCITE-3654 URL: https://issues.apache.org/jira/browse/CALCITE-3654 Project: Calcite

Re: [DISCUSS] CALCITE-2450 reorder predicates to a canonical form

2019-12-30 Thread Vladimir Sitnikov
The change improves slow tests from 80 min to 60, and the changes are minimal Vladimir

[jira] [Created] (CALCITE-3652) Add org.apiguardian:apiguardian-api to specify API status

2019-12-30 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3652: -- Summary: Add org.apiguardian:apiguardian-api to specify API status Key: CALCITE-3652 URL: https://issues.apache.org/jira/browse/CALCITE-3652 Project

Re: [DISCUSS] CALCITE-2450 reorder predicates to a canonical form

2019-12-30 Thread Vladimir Sitnikov
Technically speaking, I would love to refrain from using toString for equals/hashCode, however, it looks like a much more invasive change. Yet another idea is to skip normalization when rendering a plan with SqlExplainLevel != DIGEST_ATTRIBUTES. In other words, the normalization is there, it is

Re: [DISCUSS] CALCITE-2450 reorder predicates to a canonical form

2019-12-30 Thread Vladimir Sitnikov
Danny>almost all of the plan change are meaningless What do you mean by meaningless? The purpose of the change is to improve planning time, and to improve plan stability. Danny>and the execution graph is very probably relevant with the state/storage, if we breaks them, the state also crashed

Re: [DISCUSS] CALCITE-2450 reorder predicates to a canonical form

2019-12-29 Thread Vladimir Sitnikov
Danny>How much cases are there in production ? This example itself seems very marginalized. I’m not against with it, I’m suspicious about the value of the feature. It improves JdbcTest#testJoinManyWay 2 times or so. master. JdbcTest#testJoinManyWay: 5.8sec

Re: Re: [DISCUSS] CALCITE-2450 reorder predicates to a canonical form

2019-12-29 Thread Vladimir Sitnikov
Just in case, my motivation of comparing by string length first is for the cases like below: =(CAST(PREV(UP.$0, 0)):INTEGER NOT NULL, 100) vs =(100, CAST(PREV(UP.$0, 0)):INTEGER NOT NULL) As for me, the second one is easier to understand, so the expression starts with simpler bits, and the
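
A minimal sketch of the "shorter operand first" ordering discussed here, assuming operands are compared by the length of their string form and ties are broken lexicographically. The tie-break and all names below are assumptions for illustration; this is not the code from the Calcite PR.

```java
import java.util.Arrays;

final class OperandOrderSketch {
  /** Orders shorter strings first; equal lengths fall back to lexicographic order. */
  static int compareOperands(String a, String b) {
    int byLength = Integer.compare(a.length(), b.length());
    return byLength != 0 ? byLength : a.compareTo(b);
  }

  /** Renders a symmetric equality with its two operands in canonical order. */
  static String canonicalEquals(String left, String right) {
    String[] operands = {left, right};
    Arrays.sort(operands, OperandOrderSketch::compareOperands);
    return "=(" + operands[0] + ", " + operands[1] + ")";
  }

  public static void main(String[] args) {
    String cast = "CAST(PREV(UP.$0, 0)):INTEGER NOT NULL";
    // Both argument orders render to the same string.
    System.out.println(canonicalEquals(cast, "100"));
    System.out.println(canonicalEquals("100", cast));
  }
}
```

With this ordering both spellings of the predicate collapse to the form that starts with the simpler operand, which is the second variant quoted in the message.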

Re: Re: [DISCUSS] CALCITE-2450 reorder predicates to a canonical form

2019-12-29 Thread Vladimir Sitnikov
Haisheng> variable always left, constant always right for applicable binary operators; Oh, I did not think of making different behavior for literals, variables. What do you think re "$n.field = 42" where $n.field is a dot operator I'm not fond of adding complicated checks there, however, I think

Re: [DISCUSS] CALCITE-2450 reorder predicates to a canonical form

2019-12-29 Thread Vladimir Sitnikov
It turned out "b" (sort operands in computeDigest) is easier to implement. I've filed a PR: https://github.com/apache/calcite/pull/1703 >($0, 2) vs <(2, $0) might be less trivial to implement, but I think it is worth doing at the same time. In any case, lots of expressions will need to be

[DISCUSS] CALCITE-2450 reorder predicates to a canonical form

2019-12-29 Thread Vladimir Sitnikov
Hi, We have a 1-year old issue with an idea to sort RexNode operands so they are consistent. For instance, "x=5" and "5=x" have the same semantics, so it would make sense to stick to a single implementation. A discussion can be found in https://issues.apache.org/jira/browse/CALCITE-2450 We do

[DISCUSS] Revert [CALCITE-1842] Sort.computeSelfCost() calls makeCost() with arguments in wrong order

2019-12-29 Thread Vladimir Sitnikov
Hi, I'm inclined to revert https://github.com/apache/calcite/commit/48a20668647b5a5e86073ef0e9ce206669ad6867 Motivation can be found in https://issues.apache.org/jira/browse/CALCITE-1842?focusedCommentId=17004696&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17004696

Re: [DISCUSS] Avatica 1.16.0 dockerfiles broken. Release 1.17.0?

2019-12-29 Thread Vladimir Sitnikov
Stamatis>I was thinking that if the check says that there is no problem then apply would be a noop. The current logic of 'apply' is it computes the appropriate style and overwrites the file. Do you suggest it to skip overwriting in case the only diff is line endings? What if there are other

Re: Concurrent execution of tests methods

2019-12-29 Thread Vladimir Sitnikov
It turned out to be more complicated than I thought. The fix of EnumerableMergeJoin uncovered a well-known infinite planning time issue https://issues.apache.org/jira/browse/CALCITE-2223 The thing is previously the rule did not even try to sort its inputs, thus it was producing value only for

[jira] [Created] (CALCITE-3643) Prevent matching JoinCommuteRule when both inputs are the same

2019-12-29 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CALCITE-3643: -- Summary: Prevent matching JoinCommuteRule when both inputs are the same Key: CALCITE-3643 URL: https://issues.apache.org/jira/browse/CALCITE-3643 Project

Concurrent execution of tests methods

2019-12-28 Thread Vladimir Sitnikov
Hi, I've filed a PR to activate concurrent test execution by default: https://github.com/apache/calcite/pull/1702 It results in concurrent execution of both methods and classes. Note: it was something that was present in Maven, and now it will be there in Gradle as well. It looks to work on my

Re: [DISCUSS] Avatica 1.16.0 dockerfiles broken. Release 1.17.0?

2019-12-26 Thread Vladimir Sitnikov
Stamatis>I guess there are people who use Windows and they still have their editors Stamatis>configured to use LF endings. LF / CRLF uses Git configuration to figure out the needed line endings. In other words, if someone configures Git to use LF rather than "platform" line endings, the build
