Feel free to use ./gradlew test prepareVote
I see no reason why preparing a release candidate should execute tests.
PS. Do you miss those days when maven-release-plugin executed the tests
three times? :)
Vladimir
Hi,
The problem is ProjectLimeRel should have **lime** rel as input,
however now it has LogicalFilter as an input.
The issue is caused by the following line:
https://github.com/tglanz/limestone/blob/master/core/src/main/java/org/tglanz/limestone/rules/LogicalProjectConverterRule.java#L29
You
>The RexNode normalization and project names remove from digest did change
a lot of plan from the Apache Flink side
Hey Danny, I see it is dissatisfying, however, it is really sad you have
never revealed which plans required changes and why.
>The java doc doesn’t distinguish main API and test
Stamatis>It is worth mentioning that the build passed with the 3rd attempt.
In the
Stamatis>first attempts the build was stuck while performing style checking.
Do you have threaddumps?
Have you filed JIRA for the locking?
Vladimir
Danny>I have created a build for Apache Calcite 1.22.0, release
Danny>candidate 3.
Great.
Checksums, pgp match.
tests pass (modulo known issues like OsAdapterTest), mat-calcite-tests pass
So +1 (binding)
There's an issue that -Prc=.. parameter is always required for building a
release version.
Julian>Can we keep it consistent please?
It's good to find bugs like this, but it's depressing to only be
finding them in RC3.
Frankly speaking, I find it too much repetition to have calcite- in a
version name.
We do not have Avatica in the repository, so why should we have long
versions like
>./gradlew removeStaleArtifacts -Pasf
>How can i use this cmd to remove only one rc files ?
It is configured in staleRemovalFilters block in
https://github.com/apache/calcite/blob/a549342294062eba9aa3196e7e6d4bda36fa8291/build.gradle.kts#L123
However, currently, there's no way to override filters
Danny>Actually I do not know how to config the authentication for the
eclipse plugin that the Gradle task uses, is there any doc/instructions
that I can reference ?
I have never tried (I always release via command line like ./gradlew
prepareVote -Prc=2 -Pasf <-- it is just enough), however, the
>1. I did checked every tar/zip checksums before/after the release, would
write id down in the mail next time ~
Ok, it looks like I did not express it properly.
Danny>The artifacts to be voted on are located here:
Danny>
https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.22.0-rc2/
Danny, thanks for putting things together, however, I guess the vote mail
requires clarifications before the votes can be cast :-/
Danny>The hashes of the artifacts are as follows:
dist.apache.org contains two archives, however, the vote mail lists just
one of them.
We had the very same case
>Does any ring bell ?
Is it related to [CALCITE-3672] Support implicit type coercion for insert
and update ?
https://issues.apache.org/jira/browse/CALCITE-3672
https://github.com/apache/calcite/pull/1720
>I am now trying to install IntelliJ, but it won't be so immediate
AFAIK it should be
I have already surfaced the case in 1.20.0 release:
https://lists.apache.org/thread.html/33694a2e754ff63e49e5fd05d52be1f72773c15f4a66adf766223b86%40%3Cdev.calcite.apache.org%3E
Technically speaking, Calcite release artifacts violate ASF licensing
policy.
Then it is up to the release manager to
>With the size and complexity of Calcite project,
... nobody can really understand anything, so there's no point in waiting
for +1 :)
Julian, Xiening you both are right.
I would add a small note that "easy to understand" titles help reviewers
significantly.
Here's a recent case:
> And state that it is going to be removed before say 1.24 (about 6 months
from now).
I'm ok with keeping the property working as long as it does not impact
features/bugfixes development provided the property does not make the code
hard to read.
Vladimir
>adding an option
I'm ok with adding an option provided:
1) It is "remove column names" by default
2) It is supported on a best effort basis. In other words, it is NOT to be
used in production systems on a day-to-day basis
3) We do not add an extensive test suite for that property (see #1, #2)
4)
>it is hard for me to figure out the reason of the change
It is written in the JIRA description and in the commit message: it
improves planning performance by reducing the planning space.
For Calcite it reduces slow tests from 64 min to 40 min.
Before (3862sec):
Hi,
GitHub has a dependency graph feature (see [1]) which can show dependency
information right at the GitHub page.
However, the only way they support Java dependencies is via pom.xml files
(see "Supported package ecosystems" in [1]).
Sample output:
Have you seen https://github.com/apache/calcite/pull/1806 ?
It might be relevant
Vladimir
The subject of the mail does not match the body.
Please double check and ensure you mention a single change.
PS if you use explain_digest attributes, it means you expect it might
change due to implementation details.
Vladimir
Vladimir Sitnikov created CALCITE-3786:
--
Summary: Add Digest (HashStrategy?) interface to enable efficient
hashCode/equals for RexNode, RelNode
Key: CALCITE-3786
URL: https://issues.apache.org/jira/browse
>get the SQLs without set the breakpoint
+1
Explain should include relevant downstream SQL.
Yang, can you please file a JIRA ticket?
Vladimir
Hi,
I've recently committed a change that improves test results and stacktraces
in the build results.
You can find more information here in [1]
In a nutshell, it adds coloring to the output, it makes stacktraces
shorter, and it adds GitHub Actions error markers, so you don't even need
to scroll
>I uploaded the attachments to this link:
>https://drive.google.com/open?id=1fH-qlXS59BYCj9JiYieTHVDk3GnioxX-
Oh.
It means you are missing the dependencies (jar files) required to compile
and execute the code.
I suggest you import Calcite source code (e.g. as explained here:
>Are current -SNAPSHOT packages on repository.apache.org up to date ?
The snapshots were not up to date because the Calcite-Snapshots Jenkins job
was somehow using a Beam Jenkins node.
I guess that was caused by misconfiguration of the Beam nodes.
I've triggered the job manually, and it works now:
>If you notice anything weird let me know.
Thank you.
---
I recently found that ./sqlline is non-trivial, especially when a non-expert
is using it.
What do you think if we make ./sqlline print sample commands right after
it starts?
We have pre-defined data sets.
What if ./sqlline printed a
Vladimir Sitnikov created CALCITE-3767:
--
Summary: AssertionError when SqlBinaryStringLiteral appears in
tablesample substitute
Key: CALCITE-3767
URL: https://issues.apache.org/jira/browse/CALCITE-3767
Alessandro> What do you usually do in such cases (transient failures)?
There are multiple ways (in no particular order):
A) Ignore the failure (as it is not related to your changes)
B) Ignore the failure and add a comment on the PR/JIRA like "CI fails with
timeout in
>My PR is this one: https://github.com/apache/calcite/pull/1774
Ok. I see. The error in your PR is exactly the same.
I guess it was caused by
https://github.com/apache/calcite/commit/bcac62e3dad6137511d4451135daa2d1762ec6ad#diff-c25007f5834d66d28e0b6644edf21325L55-R69
In other words, `master`
Hi,
Calcite never uses Jenkins for PR validation.
>I had a CI failure on my PR (on code unrelated to my PR which concerns only
CassandraAdapter),
Can you provide the link to your PR?
>Status: Failure
>Check console output at https://builds.apache.org/job/Calcite-Master/1588/ to
view the
>I made another commit so that .kt and .kts files are treated as text
Frankly speaking, we have * text=auto which means git would treat all
files as text if their first 8000 bytes do not contain a 00 byte.
In other words, kt files were already treated as text files.
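To illustrate the check described above, here is a minimal sketch in Java. It covers only the 00-byte test on the first 8000 bytes; Git's real detection may apply further heuristics, and the class and method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;

final class TextAutoSketch {
    // Only the beginning of the content is inspected (8000 bytes, as stated above).
    static final int FIRST_FEW_BYTES = 8000;

    // Hypothetical helper: content counts as text when no 00 byte
    // appears within the inspected prefix.
    static boolean looksLikeText(byte[] content) {
        int limit = Math.min(content.length, FIRST_FEW_BYTES);
        for (int i = 0; i < limit; i++) {
            if (content[i] == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] kotlin = "fun main() = println(\"hi\")\n".getBytes(StandardCharsets.UTF_8);
        byte[] binary = {(byte) 0xCA, (byte) 0xFE, 0x00, 0x01};
        System.out.println(looksLikeText(kotlin)); // true
        System.out.println(looksLikeText(binary)); // false
    }
}
```

Under this rule a .kt or .kts source file has no 00 byte, so it would be classified as text without any extra .gitattributes entry.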
>gradlew, sqlline, sqlsh
That
>yet autostyle is complaining
What is the exact error message?
>think it is a bug in our autostyle setup
The idea behind that is it should discover most settings from Git
configuration (e.g. git global config options), .gitattributes,
.editorconfig, and so on.
> I've tried things like "gradlew
I guess it expects EOL to match Git configuration.
>Autostyle seems to expect \n at the ends of lines and get \r\n, or vice
versa
I guess it should show what is in the file, and what it expects.
Vladimir
>The Apache Jenkins build system has built Calcite-Master (build #1577)
>Status: Failure
Filed it as https://issues.apache.org/jira/browse/CALCITE-3750
PigRelBuilderStyleTest
fails with ConcurrentModificationException
Vladimir
Vladimir Sitnikov created CALCITE-3750:
--
Summary: PigRelBuilderStyleTest fails with
ConcurrentModificationException
Key: CALCITE-3750
URL: https://issues.apache.org/jira/browse/CALCITE-3750
Vladimir Sitnikov created CALCITE-3742:
--
Summary: Update Gradle: 6.0.1 -> 6.1
Key: CALCITE-3742
URL: https://issues.apache.org/jira/browse/CALCITE-3742
Project: Calcite
Issue T
Michael>I'm not sure I see how the property
Michael>for passing tests with different plans would work
E.g. @Tag("skipWhenRandomizedRules") and CalciteAssert#explainContains
could become a no-op when a special property is passed.
Of course, the randomization should be for test purposes, not for
Vladimir Sitnikov created CALCITE-3725:
--
Summary: RelMetadataTest fails with NPE due to unsafe
RelMetadataQuery.instance call
Key: CALCITE-3725
URL: https://issues.apache.org/jira/browse/CALCITE-3725
Hi,
I've run into RuleMatchImportanceComparator issue (see
https://issues.apache.org/jira/browse/CALCITE-2356 )
As a fun experiment, I've replaced the comparator with
Random#nextBoolean(), and it identified a bug:
Vladimir Sitnikov created CALCITE-3722:
--
Summary: Add Hook#PLAN_BEFORE_IMPLEMENTATION to capture the plan
after optimization
Key: CALCITE-3722
URL: https://issues.apache.org/jira/browse/CALCITE-3722
>Is it related with Spark version upgrade ?
No idea.
>OOM: unable to create new native thread
It means there were too many processes launched in the operating system.
It could be either Calcite test suite spawning lots of threads or
it could be related to background activity (e.g. other jobs
In case you wondered, the exception there is OOM in JVM 1.8:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:717)
at
Michael>If we want to calibrate
A part of the question is "What should Aggregate#computeSelfCost return?"
A) There's an option to make that method abstract so every sub-class
defines its own cost implementation.
It might be sad, and it might look like a NLogN duplication all over the
place.
B)
>In theory, the cardinality and uniqueness of a RelSubset should never
changed per definition of equivalent set
I agree. It is like in theory there is no difference between theory and
practice :)
What if we have select empid from (select empid from emps where empid>0)
where empid<0 ?
The
Hi,
As far as I understand, the incremental best/bestCost maintenance at
RelSubset level does not really work.
That issue is triggered a lot in MaterializationTests due to
https://issues.apache.org/jira/browse/CALCITE-3682
(MaterializationService#defineMaterialization
loses information on unique
AFAIK it is up to the release manager.
Vladimir
>Can you cherry pick the appropriate commits for the migration
>into the site branch?
What do you mean by that?
Can we just revert the invalid commits and be done with it?
I already suggested that the site update should be simplified,
and now we have clear evidence that the current process of
Francis>There is
Francis>
https://github.com/apache/calcite-site/commit/81960613e7750a9191280719352ae941a7d6a22d
,
Francis>but there doesn't appear to be any changes to the build
instructions for
Francis>gradle.
Here you go:
So far I have PR https://github.com/apache/calcite/pull/1721 that adds
verification to EnumerableTableScan constructor.
Unfortunately, we have a lot of test cases that assume the relation would
be EnumerableTableScan even
in the case test verifies logical plan only :(
That is why I added extra
Hi,
I've found there are two ways to compute the cumulative cost, and they
happen to be quite different.
VolcanoPlanner#getCost seems to be the right one (see [1])
It properly accounts for RelSubset and uses its bestCost
On the other hand, mq.getCumulativeCost(rel) is quite poor (see [2]).
It
It looks like Francis recently pushed "Update website for Avatica 1.16"
which looks like
1,674 changed files with 167,295 additions and 628,539 deletions.
https://github.com/apache/calcite-site/commit/3d92029cc7718e2b66d888dc9021047879145465
Francis, can you please double-check?
Vladimir
Can you clarify which URL has stale information?
Vladimir
Vladimir Sitnikov created CALCITE-3713:
--
Summary: Remove column names from Project#digest
Key: CALCITE-3713
URL: https://issues.apache.org/jira/browse/CALCITE-3713
Project: Calcite
Vladimir Sitnikov created CALCITE-3712:
--
Summary: Optimize lossless casts in RexSimplify: CAST(CAST(intExpr
as BIGINT) as INT) => intExpr
Key: CALCITE-3712
URL: https://issues.apache.org/jira/browse/CALC
Oh, there's org.apache.calcite.rex.RexUtil#isLosslessCast which is used in
MaterializedView*Rule, so I'll try adding that to RexSimplify
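
For illustration, here is a minimal sketch of why such a nested cast is lossless. It uses Java primitives as a stand-in for the SQL types (int for INT, long for BIGINT), which is an assumption: a SQL CAST may raise an error on overflow, whereas Java narrowing silently truncates. The class and method names are made up and are not part of Calcite.

```java
final class LosslessCastSketch {
    // Mirrors CAST(CAST(intExpr AS BIGINT) AS INT): widen, then narrow back.
    static int widenThenNarrow(int value) {
        long widened = (long) value; // every int fits into a long
        return (int) widened;        // so narrowing restores the original value
    }

    // The opposite nesting, CAST(CAST(bigintExpr AS INT) AS BIGINT), can lose data.
    static long narrowThenWiden(long value) {
        int narrowed = (int) value;  // the upper 32 bits are dropped here
        return (long) narrowed;
    }

    public static void main(String[] args) {
        System.out.println(widenThenNarrow(Integer.MIN_VALUE) == Integer.MIN_VALUE); // true
        System.out.println(widenThenNarrow(Integer.MAX_VALUE) == Integer.MAX_VALUE); // true
        System.out.println(narrowThenWiden(1L << 32)); // 0
    }
}
```

The first nesting is an identity for every int value, so it can be dropped; the second is not, so it has to stay.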
Vladimir
Hi,
I'm analyzing testJoinMaterialization5 which times out after my recent
Calcite fixes.
I'm inclined to think the bug is in MaterializedViewJoinRule(Filter).
The case is as follows:
materialized view:
select cast("empid" as BIGINT) from "emps"
join "depts" using ("deptno")
input SQL:
select
Stamatis>HepPlanner does not perform cost-based decisions so for most
use-cases I
It seems to have some cost-based decisions, so providing ill-defined cost
seems wrong.
Well, let's see how new VolcanoCost would work, and then we could discuss
Hep.
Stamatis>Although it is nice to account for CPU
Vladimir Sitnikov created CALCITE-3710:
--
Summary: MaterializationService#defineMaterialization should
inherit connection properties
Key: CALCITE-3710
URL: https://issues.apache.org/jira/browse/CALCITE-3710
Vladimir Sitnikov created CALCITE-3709:
--
Summary: Use "rejected row count" for RelOptCost#getRows
Key: CALCITE-3709
URL: https://issues.apache.org/jira/browse/CALCITE-3709
Projec
In the meantime, I've created a PR that updates VolcanoCost:
https://github.com/apache/calcite/pull/1722
Vladimir
Vladimir Sitnikov created CALCITE-3708:
--
Summary: Make VolcanoCost account cpu and io properties when
comparing costs
Key: CALCITE-3708
URL: https://issues.apache.org/jira/browse/CALCITE-3708
>This of course requires the VolcanoCost to be adapted.
What do you think of HepPlanner?
It uses RelOptCostImpl.FACTORY by default which explicitly ignores CPU and
IO cost factors :((
Regarding cost#rows, there's a problem: cost#rows field does not add well
when computing cumulative cost.
What
>I think we should try to make our cost
estimations more realistic in terms of cpu and io and don't try to put
everything in rows as it is the case for various operators.
This of course requires the VolcanoCost to be adapted.
Well. The more I revise costs, the more I incline to that opinion as
Technically speaking, single-block read time for HDDs is pretty much
stable, so the use of seconds might be not that bad.
However, with seconds it might be complicated to measure CPU-like activity
(e.g. different machines might execute EnumerableJoin at different rates :( )
What if we benchmark a
Hi,
It looks like testAggregateMaterializationOnCountDistinctQuery1 is invalid.
The test creates materialization for
select deptno, empid, salary from emps group by deptno, empid, salary
Then it issues the SQL:
select deptno, count(distinct empid) as c from (
select deptno, empid
from emps
Vladimir Sitnikov created CALCITE-3682:
--
Summary: MaterializationService#defineMaterialization loses
information on unique keys
Key: CALCITE-3682
URL: https://issues.apache.org/jira/browse/CALCITE-3682
Michael>although I would be hesitant to refer to "seconds"
Do you have better ideas?
If my memory serves me well, PostgreSQL uses seconds as well for cost units.
OracleDB is using "singleblock read" for the cost unit.
Michael>how long execution will take on any particular system
The idea for
Jin>In ReflectiveSchema, Statistics of FieldTable is given as UNKNOWN[1][2].
Please check [CALCITE-3661] Derive rowCount statistics for tables in
ReflectiveSchema that are based on arrays/collections
and [CALCITE-3680] Add ability to express unique constraints in
ReflectiveSchema
commits in
Vladimir Sitnikov created CALCITE-3681:
--
Summary: Refine RelMdColumnUniqueness and RelMdRowCount for
Aggregate
Key: CALCITE-3681
URL: https://issues.apache.org/jira/browse/CALCITE-3681
Project
Vladimir Sitnikov created CALCITE-3680:
--
Summary: Add ability to express unique constraints in
ReflectiveSchema
Key: CALCITE-3680
URL: https://issues.apache.org/jira/browse/CALCITE-3680
Project
Hi,
I've spent some time on stabilizing the costs (see
https://github.com/apache/calcite/pull/1702/commits ), and it looks like we
might want to have some notion of "cost unit".
For instance, we want to express that sorting table with 2 int columns is
cheaper than sorting table with 22 int
Vladimir Sitnikov created CALCITE-3677:
--
Summary: Add assertion to EnumerableTableScan constructor to
validate if the table is suitable for enumerable scan
Key: CALCITE-3677
URL: https://issues.apache.org
Hi,
It looks like MaterializationTest heavily relies on inaccurate statistics
for hr.emps and hr.depts tables.
I was trying to improve statistic estimation for better join planning (see
https://github.com/apache/calcite/pull/1712 ),
and it looks like better estimates open the eyes of the
Hi,
Stream tables do not play very well for hash joins.
In other words, if hash join would try to build a lookup table out of a
stream, it could just run out of memory.
Is there metadata or something like that to identify stream-like inputs so
hash join would ensure it does not
try to build a
Vladimir Sitnikov created CALCITE-3674:
--
Summary: EnumerableMergeJoinRule fails with NPE on nullable join
keys
Key: CALCITE-3674
URL: https://issues.apache.org/jira/browse/CALCITE-3674
Project
Vladimir Sitnikov created CALCITE-3670:
--
Summary: Add ability to release resources when connection is closed
Key: CALCITE-3670
URL: https://issues.apache.org/jira/browse/CALCITE-3670
Project
Vladimir Sitnikov created CALCITE-3666:
--
Summary: Refine RelMdColumnUniqueness for Calc
Key: CALCITE-3666
URL: https://issues.apache.org/jira/browse/CALCITE-3666
Project: Calcite
Issue
Vladimir Sitnikov created CALCITE-3665:
--
Summary: Better estimate join row count when one of the sides is
known to be unique
Key: CALCITE-3665
URL: https://issues.apache.org/jira/browse/CALCITE-3665
Vladimir Sitnikov created CALCITE-3661:
--
Summary: Derive rowCount statistics for tables in ReflectiveSchema
that are based on arrays/collections
Key: CALCITE-3661
URL: https://issues.apache.org/jira/browse
Vladimir Sitnikov created CALCITE-3660:
--
Summary: PigRelBuilderStyleTest#testImplWithJoin fails with
FrontendException: ERROR 1066: Unable to open iterator for alias t
Key: CALCITE-3660
URL: https
Vladimir Sitnikov created CALCITE-3659:
--
Summary: Optimize EnumerableMergeJoin: avoid creating
Linq4j.product for each matching group
Key: CALCITE-3659
URL: https://issues.apache.org/jira/browse/CALCITE-3659
Stamatis>This is a change that most likely will have impact on many
projects
I don't see how it will impact projects. Really.
Are there projects that use up to date Calcite versions?
Are they ready for adding a CI job to test with Calcite master branch?
It is very disappointing to hear that it
Vladimir Sitnikov created CALCITE-3657:
--
Summary: EnumerableHashJoin should not use NLogN for costing
Key: CALCITE-3657
URL: https://issues.apache.org/jira/browse/CALCITE-3657
Project: Calcite
Vladimir Sitnikov created CALCITE-3656:
--
Summary: EnumerableNestedLoopJoin cost should account for cost of
inner restarts
Key: CALCITE-3656
URL: https://issues.apache.org/jira/browse/CALCITE-3656
Vladimir Sitnikov created CALCITE-3655:
--
Summary: SortJoinTransposeRule must not push sort into Project
that contains OVER expressions
Key: CALCITE-3655
URL: https://issues.apache.org/jira/browse/CALCITE
Vladimir Sitnikov created CALCITE-3654:
--
Summary: Elasticsearch tests produce noise in the test output
Key: CALCITE-3654
URL: https://issues.apache.org/jira/browse/CALCITE-3654
Project: Calcite
The change improves slow tests from 80 min to 60, and the changes are
minimal
Vladimir
Vladimir Sitnikov created CALCITE-3652:
--
Summary: Add org.apiguardian:apiguardian-api to specify API status
Key: CALCITE-3652
URL: https://issues.apache.org/jira/browse/CALCITE-3652
Project
Technically speaking, I would love to refrain from using toString for
equals/hashCode, however,
it looks like a much more invasive change.
Yet another idea is to skip normalization when rendering a plan
with SqlExplainLevel != DIGEST_ATTRIBUTES.
In other words, the normalization is there, it is
Danny>almost all of the plan change are meaningless
What do you mean by meaningless?
The purpose of the change is to improve planning time, and to improve plan
stability.
Danny>and the execution graph is very probably relevant with the
state/storage, if we breaks them, the state also crashed
Danny>How much cases are there in production ? This example itself seems
very marginalized. I’m not against with it, I’m suspicious about the value
of the feature.
It improves JdbcTest#testJoinManyWay 2 times or so.
master.
JdbcTest#testJoinManyWay: 5.8sec
Just in case, my motivation of comparing by string length first is for the
cases like below:
=(CAST(PREV(UP.$0, 0)):INTEGER NOT NULL, 100)
vs
=(100, CAST(PREV(UP.$0, 0)):INTEGER NOT NULL)
As for me, the second one is easier to understand, since the expression
starts with simpler bits, and
the
Haisheng> variable always left, constant always right for applicable binary
operators;
Oh, I did not think of making different behavior for literals, variables.
What do you think re "$n.field = 42" where $n.field is a dot operator
I'm not fond of adding complicated checks there, however, I think
It turned out "b" (sort operands in computeDigest) is easier to implement.
I've filed a PR: https://github.com/apache/calcite/pull/1703
>($0, 2) vs <(2, $0) might be less trivial to implement, but I think it is
worth doing at the same time.
In any case, lots of expressions will need to be
Hi,
We have a 1-year old issue with an idea to sort RexNode operands so they
are consistent.
For instance, "x=5" and "5=x" have the same semantics, so it would make
sense to stick to a single implementation.
A discussion can be found in
https://issues.apache.org/jira/browse/CALCITE-2450
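As a rough illustration of the idea, the sketch below renders a commutative "=" call with its operands in one canonical order, so "x=5" and "5=x" end up with the same text. Operands are plain strings here rather than RexNode, and the ordering (shorter text first, ties broken lexicographically) is just one possible choice; both are assumptions made for the example, and all names are hypothetical.

```java
import java.util.Comparator;

final class OperandOrderSketch {
    // Hypothetical ordering: shorter text first, ties broken lexicographically.
    static final Comparator<String> ORDER =
        Comparator.comparingInt(String::length)
            .thenComparing(Comparator.<String>naturalOrder());

    // Renders a commutative "=" call with its operands in canonical order.
    static String equalsDigest(String a, String b) {
        return ORDER.compare(a, b) <= 0
            ? "=(" + a + ", " + b + ")"
            : "=(" + b + ", " + a + ")";
    }

    public static void main(String[] args) {
        System.out.println(equalsDigest("x", "5")); // =(5, x)
        System.out.println(equalsDigest("5", "x")); // =(5, x)
    }
}
```

This only works for operators where swapping operands keeps the semantics; something like "<" would need to be flipped to ">" at the same time.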
We do
Hi,
I'm inclined to revert
https://github.com/apache/calcite/commit/48a20668647b5a5e86073ef0e9ce206669ad6867
Motivation can be found in
https://issues.apache.org/jira/browse/CALCITE-1842?focusedCommentId=17004696=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17004696
Stamatis>I was thinking that if the check says that there is no problem
then apply
would be a noop.
The current logic of 'apply' is it computes the appropriate style and
overwrites the file.
Do you suggest it to skip overwriting in case the only diff is line endings?
What if there are other
It turned out to be more complicated than I thought.
The fix of EnumerableMergeJoin uncovered a well-known infinite planning
time issue https://issues.apache.org/jira/browse/CALCITE-2223
The thing is previously the rule did not even try to sort its inputs, thus
it was producing value only for
Vladimir Sitnikov created CALCITE-3643:
--
Summary: Prevent matching JoinCommuteRule when both inputs are the
same
Key: CALCITE-3643
URL: https://issues.apache.org/jira/browse/CALCITE-3643
Project
Hi,
I've filed a PR to activate concurrent test execution by default:
https://github.com/apache/calcite/pull/1702
It results in concurrent execution of both methods and classes.
Note: it was something that was present in Maven, and now it will be there
in Gradle as well.
It looks to work on my
Stamatis>I guess there are people who use Windows and they still have their
editors
Stamatis>configured to use LF endings.
LF / CRLF uses Git configuration to figure out the needed line endings.
In other words, if someone configures Git to use LF rather than "platform"
line endings,
the build