[jira] [Updated] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations in the presence of aliasing

2024-03-21 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6338:
---
Summary: RelMdCollation#project can return an incomplete list of collations 
in the presence of aliasing  (was: RelMdCollation#project can return an 
incomplete list of collations)

> RelMdCollation#project can return an incomplete list of collations in the 
> presence of aliasing
> --
>
> Key: CALCITE-6338
> URL: https://issues.apache.org/jira/browse/CALCITE-6338
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.36.0
> Reporter: Ruben Q L
> Assignee: Ruben Q L
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.37.0
>
>
> {{RelMdCollation#project}} can return an incomplete list of collations.
> Let us say we have a Project that projects the following expressions (notice 
> that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
> The Project's input has collation [2, 3]
> In order to calculate the Project's own collation, {{RelMdCollation#project}} 
> will be called, and a MultiMap targets will be computed because, as in this 
> case, a certain "source field" (e.g. 2) can have multiple project targets 
> (e.g. 1 and 2). However, when the collation is being computed, *only the 
> first target will be considered* (and the rest will be discarded):
> {code}
>   public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
>       RelNode input, List<? extends RexNode> projects) {
>     ...
>       for (RelFieldCollation ifc : ic.getFieldCollations()) {
>         final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
>         if (integers.isEmpty()) {
>           continue loop; // cannot do this collation
>         }
>         fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
>       }
> {code}
> Because of this, the Project's collation will be [1 3], but there is also 
> another valid one ([2 3]), so the correct (complete) result should be: 
> [1 3] [2 3]
> This seems a minor problem, but it can be the root cause of more serious 
> issues. For instance, I currently have a scenario (not easy to reproduce 
> with a unit test) where a certain plan with a certain combination of rules 
> in a HepPlanner results in a StackOverflowError because 
> SortJoinTransposeRule fires indefinitely. The root cause is that, after the 
> first application, the rule does not detect that the Join's left input is 
> already sorted (by the previous application of the rule), because there is 
> a "problematic" Project on it (one that shows the problem described above) 
> which returns only one collation. The second collation (the one being 
> discarded) is the Sort's collation, which is exactly the one that would 
> prevent SortJoinTransposeRule from being re-applied over and over.
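
To make the $0, $2, $2, $3 example concrete, here is a minimal stand-alone sketch of the mapping step. It is not Calcite code: the class and method names are hypothetical, collations are modelled as plain lists of field indexes (direction and null ordering are ignored), and only the "take the first target" behaviour of the quoted loop is reproduced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-alone model of the mapping step; not Calcite code. */
final class FirstTargetOnlySketch {
  private FirstTargetOnlySketch() {}

  /** Maps each source field to every output position that projects it. */
  static Map<Integer, List<Integer>> targets(List<Integer> projectedInputRefs) {
    Map<Integer, List<Integer>> targets = new TreeMap<>();
    for (int target = 0; target < projectedInputRefs.size(); target++) {
      int source = projectedInputRefs.get(target);
      targets.computeIfAbsent(source, k -> new ArrayList<>()).add(target);
    }
    return targets;
  }

  /** Mirrors the quoted loop: only the first target of each field is used. */
  static List<Integer> firstTargetOnly(List<Integer> inputCollation,
      Map<Integer, List<Integer>> targets) {
    List<Integer> result = new ArrayList<>();
    for (int source : inputCollation) {
      List<Integer> candidates = targets.getOrDefault(source, List.of());
      if (candidates.isEmpty()) {
        return List.of(); // cannot do this collation
      }
      result.add(candidates.get(0)); // any further candidates are ignored
    }
    return result;
  }
}
```

With the projection $0, $2, $2, $3, source field 2 maps to targets 1 and 2, yet `firstTargetOnly` applied to the input collation [2, 3] yields only [1 3]; the equally valid [2 3] never appears.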



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations

2024-03-21 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829666#comment-17829666
 ] 

Julian Hyde commented on CALCITE-6338:
--

Makes sense. I would mention 'in the presence of aliasing' (as in 
[aliasing|https://en.wikipedia.org/wiki/Aliasing_(computing)]) or something in 
the summary, and in your test case.

I would like to see a test case where there are multiple aliased columns. If 1 
and 2 are aliases, and 3 and 4 are aliases, therefore [1 3] [1 4] [2 3] [2 4] 
are equivalent collations.

I don't like how in your implementation one line became 20. I used to 
understand that method, now I no longer do. Introduce abstractions so that the 
implementation is at most 2 or 3 lines.
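
One possible abstraction, sketched under assumptions: if each field of the input collation is first turned into its list of aliased targets, the set of equivalent collations is the Cartesian product of those lists, and the call site shrinks to a single call. The helper below is a hypothetical stand-alone version using only the JDK and plain integer indexes; it is not the code from the pull request. Guava's `Lists.cartesianProduct` offers the same operation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; field collations are modelled as plain indexes. */
final class AliasedCollationsSketch {
  private AliasedCollationsSketch() {}

  /**
   * Returns every way of picking one element from each inner list,
   * preserving the order of the outer list.
   */
  static <T> List<List<T>> cartesianProduct(List<List<T>> choices) {
    List<List<T>> product = new ArrayList<>();
    product.add(List.of()); // a single empty prefix to extend
    for (List<T> options : choices) {
      List<List<T>> next = new ArrayList<>();
      for (List<T> prefix : product) {
        for (T option : options) {
          List<T> extended = new ArrayList<>(prefix);
          extended.add(option);
          next.add(extended);
        }
      }
      product = next;
    }
    return product;
  }
}
```

For the case above, where 1 and 2 are aliases and 3 and 4 are aliases, passing [[1, 2], [3, 4]] produces [1 3] [1 4] [2 3] [2 4]. An empty inner list (a field with no target) makes the whole product empty, which matches the "cannot do this collation" branch.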






[jira] [Resolved] (CALCITE-6015) AssertionError during optimization of EXTRACT expression

2024-03-21 Thread Mihai Budiu (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihai Budiu resolved CALCITE-6015.
--
Resolution: Fixed

Fixed in 
https://github.com/apache/calcite/commit/4823cb7760913f236e7f0f2cb149325b55a3f124

> AssertionError during optimization of EXTRACT expression
> 
>
> Key: CALCITE-6015
> URL: https://issues.apache.org/jira/browse/CALCITE-6015
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.35.0
> Reporter: Mihai Budiu
> Assignee: Mihai Budiu
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.37.0
>
>
> The following test, added to RelOptRulesTest, causes an AssertionError:
> {code:java}
>   @Test void testExtractDayFromTime() {
>     final String sql = "select EXTRACT(DAY FROM TIME'10:00:00')";
>     sql(sql).withRule(CoreRules.PROJECT_REDUCE_EXPRESSIONS)
>         .check();
>   }
> {code}
> The bottom of the stack trace is:
> {code:java}
> java.lang.AssertionError: unexpected TIME
>   at org.apache.calcite.adapter.enumerable.RexImpTable$ExtractImplementor.implementSafe(RexImpTable.java:3056)
>   at org.apache.calcite.adapter.enumerable.RexImpTable$AbstractRexCallImplementor.genValueStatement(RexImpTable.java:3796)
>   at org.apache.calcite.adapter.enumerable.RexImpTable$AbstractRexCallImplementor.implement(RexImpTable.java:3758)
>   at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitCall(RexToLixTranslator.java:1184)
>   at org.apache.calcite.adapter.enumerable.RexToLixTranslator.visitCall(RexToLixTranslator.java:101)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:189)
> {code}
> This expression is indeed illegal. Perhaps validation should produce an error?
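
As an illustration of what "validation should produce an error" could look like, here is a hypothetical stand-alone sketch. It does not use Calcite's validator API and is not necessarily how the issue was fixed; the enums, the table of allowed units, and the error message are all assumptions made for this example.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical check; not Calcite's validator. */
final class ExtractValidationSketch {
  enum Unit { YEAR, MONTH, DAY, HOUR, MINUTE, SECOND }

  enum OperandType { DATE, TIME, TIMESTAMP }

  // Assumed table of which units each operand type can supply.
  private static final Map<OperandType, Set<Unit>> ALLOWED =
      new EnumMap<>(OperandType.class);

  static {
    ALLOWED.put(OperandType.DATE, EnumSet.of(Unit.YEAR, Unit.MONTH, Unit.DAY));
    ALLOWED.put(OperandType.TIME,
        EnumSet.of(Unit.HOUR, Unit.MINUTE, Unit.SECOND));
    ALLOWED.put(OperandType.TIMESTAMP, EnumSet.allOf(Unit.class));
  }

  private ExtractValidationSketch() {}

  /** Rejects a unit that the operand type cannot supply. */
  static void validate(Unit unit, OperandType type) {
    if (!ALLOWED.get(type).contains(unit)) {
      throw new IllegalArgumentException(
          "Cannot EXTRACT " + unit + " from " + type);
    }
  }
}
```

Under these assumptions, `validate(Unit.DAY, OperandType.TIME)` fails with a descriptive message at validation time, rather than surfacing later as an AssertionError during expression reduction.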





[jira] [Updated] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations

2024-03-21 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6338:
---
Description: 
{{RelMdCollation#project}} can return an incomplete list of collations.

Let us say we have a Project that projects the following expressions (notice 
that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
The Project's input has collation [2, 3]
In order to calculate the Project's own collation, {{RelMdCollation#project}} 
will be called, and a MultiMap targets will be computed because, as in this 
case, a certain "source field" (e.g. 2) can have multiple project targets (e.g. 
1 and 2). However, when the collation is being computed, *only the first target 
will be considered* (and the rest will be discarded):
{code}
  public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
      RelNode input, List<? extends RexNode> projects) {
    ...
      for (RelFieldCollation ifc : ic.getFieldCollations()) {
        final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
        if (integers.isEmpty()) {
          continue loop; // cannot do this collation
        }
        fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
      }
{code}
Because of this, the Project's collation will be [1 3], but there is also 
another valid one ([2 3]), so the correct (complete) result should be: 
[1 3] [2 3]

This seems a minor problem, but it can be the root cause of more serious 
issues. For instance, I currently have a scenario (not easy to reproduce with 
a unit test) where a certain plan with a certain combination of rules in a 
HepPlanner results in a StackOverflowError because SortJoinTransposeRule fires 
indefinitely. The root cause is that, after the first application, the rule 
does not detect that the Join's left input is already sorted (by the previous 
application of the rule), because there is a "problematic" Project on it (one 
that shows the problem described above) which returns only one collation. The 
second collation (the one being discarded) is the Sort's collation, which is 
exactly the one that would prevent SortJoinTransposeRule from being re-applied 
over and over.



  was:
{{RelMdCollation#project}} can return an incomplete list of collations.

(I'll try to produce a unit test, for now I'll just describe the situation)

Let us say we have a Project that projects the following expressions (notice 
that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
The Project's input has collation [2, 3]
In order to calculate the Project's own collation, {{RelMdCollation#project}} 
will be called, and a MultiMap targets will be computed because, as in this 
case, a certain "source field" (e.g. 2) can have multiple project targets (e.g. 
1 and 2). However, when the collation is being computed, *only the first target 
will be considered* (and the rest will be discarded):
{code}
  public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
      RelNode input, List<? extends RexNode> projects) {
    ...
      for (RelFieldCollation ifc : ic.getFieldCollations()) {
        final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
        if (integers.isEmpty()) {
          continue loop; // cannot do this collation
        }
        fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
      }
{code}
Because of this, the Project's collation will be [1 3], but there is also 
another valid one ([2 3]), so the correct (complete) result should be: 
[1 3] [2 3]

This seems a minor problem, but it can be the root cause of more serious 
issues. For instance, I currently have a scenario (not easy to reproduce with 
a unit test) where a certain plan with a certain combination of rules in a 
HepPlanner results in a StackOverflowError because SortJoinTransposeRule fires 
indefinitely. The root cause is that, after the first application, the rule 
does not detect that the Join's left input is already sorted (by the previous 
application of the rule), because there is a "problematic" Project on it (one 
that shows the problem described above) which returns only one collation. The 
second collation (the one being discarded) is the Sort's collation, which is 
exactly the one that would prevent SortJoinTransposeRule from being re-applied 
over and over.





[jira] [Updated] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations

2024-03-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-6338:

Labels: pull-request-available  (was: )






[jira] [Updated] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations

2024-03-21 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6338:
---
Fix Version/s: 1.37.0






[jira] [Created] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations

2024-03-21 Thread Ruben Q L (Jira)
Ruben Q L created CALCITE-6338:
--

Summary: RelMdCollation#project can return an incomplete list of collations
Key: CALCITE-6338
URL: https://issues.apache.org/jira/browse/CALCITE-6338
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.36.0
Reporter: Ruben Q L
Assignee: Ruben Q L


{{RelMdCollation#project}} can return an incomplete list of collations.

(I'll try to produce a unit test, for now I'll just describe the situation)

Let us say we have a Project that projects the following expressions (notice 
that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
The Project's input has collation [2, 3]
In order to calculate the Project's own collation, {{RelMdCollation#project}} 
will be called, and a MultiMap targets will be computed because, as in this 
case, a certain "source field" (e.g. 2) can have multiple project targets (e.g. 
1 and 2). However, when the collation is being computed, *only the first target 
will be considered* (and the rest will be discarded):
{code}
  public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
      RelNode input, List<? extends RexNode> projects) {
    ...
      for (RelFieldCollation ifc : ic.getFieldCollations()) {
        final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
        if (integers.isEmpty()) {
          continue loop; // cannot do this collation
        }
        fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
      }
{code}
Because of this, the Project's collation will be [1 3], but there is also 
another valid one ([2 3]), so the correct (complete) result should be: 
[1 3] [2 3]

This seems a minor problem, but it can be the root cause of more serious 
issues. For instance, I currently have a scenario (not easy to reproduce with 
a unit test) where a certain plan with a certain combination of rules in a 
HepPlanner results in a StackOverflowError because SortJoinTransposeRule fires 
indefinitely. The root cause is that, after the first application, the rule 
does not detect that the Join's left input is already sorted (by the previous 
application of the rule), because there is a "problematic" Project on it (one 
that shows the problem described above) which returns only one collation. The 
second collation (the one being discarded) is the Sort's collation, which is 
exactly the one that would prevent SortJoinTransposeRule from being re-applied 
over and over.
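
The effect on the rule can be modelled with a small stand-alone sketch. Everything here is hypothetical: it is not the rule's actual code, collations are plain lists of field indexes, and "already sorted" is approximated as "some known collation starts with the required one".

```java
import java.util.List;

/** Hypothetical model of an "is the input already sorted?" check. */
final class SortedCheckSketch {
  private SortedCheckSketch() {}

  /** A known collation satisfies the required one if it starts with it. */
  static boolean satisfies(List<Integer> known, List<Integer> required) {
    return known.size() >= required.size()
        && known.subList(0, required.size()).equals(required);
  }

  /** True if any collation reported for the input satisfies the requirement. */
  static boolean alreadySorted(List<List<Integer>> knownCollations,
      List<Integer> required) {
    return knownCollations.stream().anyMatch(c -> satisfies(c, required));
  }
}
```

If the Sort requires [2 3] and the Project reports only [1 3], `alreadySorted` is false and a rule guarded by such a check would fire again; once the Project reports both [1 3] and [2 3], the check succeeds and the rule has nothing left to do.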




