[jira] [Comment Edited] (CALCITE-3594) Support hot Groupby keys hint
[ https://issues.apache.org/jira/browse/CALCITE-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995369#comment-16995369 ] Rui Wang edited comment on CALCITE-3594 at 12/14/19 4:38 AM: - [~danny0405] first of all, thanks for your detailed written [Hint doc|https://docs.google.com/document/d/1mykz-w2t1Yw7CH6NjUWpWqCAf_6YNKxSc59gXafrNCs/edit#]. I learned a lot from it. {quote}define a hint keyword (We may some reference with other engines to give it a readable name){quote} I think you are saying if there is a better name or a common name used by other engines for "hot_key"? {quote}define the hint options(either a simple identifier list or k-v list), need to make the options as much common to use{quote} This is a tricky one. I see that there is a support for kv list where key is SqlIdentifier and value is just a string literal. In the hot key case, we will need a list of kv pair where key should be a literal. It is because groupby keys are columns that can have different types. So my current thought is adding support of list of kv pair: Literal:Literal, as a hint's body. What do you think? {quote}define the hint strategy for the hint item (see if you need some special conditions for the relational expressions){quote} Agreed. In [HintStrategies|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/hint/HintStrategies.java#L22], there is no a strategy for aggregation, I probably should add one. Also Aggregrate might should implement Hintable interface as well. Lastly, in parser I believe there is no support to parse a hint in group by. I will probably also need to add one. was (Author: amaliujia): [~danny0405] first of all, thanks for your detailed written [Hint doc|https://docs.google.com/document/d/1mykz-w2t1Yw7CH6NjUWpWqCAf_6YNKxSc59gXafrNCs/edit#]. I learned a lot from it. {quote}define a hint keyword (We may some reference with other engines to give it a readable name){quote} I think you are saying if there is a better name or a common name used by other engines for "hot_key"? {quote}define the hint options(either a simple identifier list or k-v list), need to make the options as much common to use{quote} This is a tricky one. I see that there is a support for kv list where key is SqlIdentifier and value is just a string literal. In the hot key case, we will need a list of kv pair where key should be a literal. It is because groupby keys can have different types so the value should have different type. So my current thought is adding support of list of kv pair: Literal:Literal, as a hint's body. What do you think? {quote}define the hint strategy for the hint item (see if you need some special conditions for the relational expressions){quote} Agreed. In [HintStrategies|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/hint/HintStrategies.java#L22], there is no a strategy for aggregation, I probably should add one. Lastly, in parser I believe there is no support to parse a hint in group by. I will probably also need to add one. > Support hot Groupby keys hint > - > > Key: CALCITE-3594 > URL: https://issues.apache.org/jira/browse/CALCITE-3594 > Project: Calcite > Issue Type: Sub-task >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > > It will be useful for Apache Beam if we support the following SqlHint: > SELECT * FROM t > GROUP BY t.key_column /* + hot_key(key1=fanout_factor, ...) */) > The hot key strategy works on aggregation and it provides a list of hot keys > with fanout factor for a column. The fanout factor says how many partition > should be created for that specific key, such that we can have a per > partition aggregate and then have a final aggregate. One example to explain > it: > SELECT * FROM t > GROUP BY t.key_column /* + hot_key("value1"=2) */) > // for the key_column, there is a "value1" which appear so many times (so > it's hot), please consider split it into two partition and process separately. > Such problem is common for big data processing, where hot key creates slowest > machine which either slow down the whole pipeline or make retries. In such > case, one common resolution is to split data to multiple partition and > aggregate per partition, and then have a final combine. > Usually execution engine won't know what is the hot key(s). SqlHint provides > a good way to tell the engine which key is useful to deal with it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3587) RexBuilder may lose decimal fraction for creating literal with DECIMAL type
[ https://issues.apache.org/jira/browse/CALCITE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996117#comment-16996117 ] Rui Wang commented on CALCITE-3587: --- [~danny0405] I think the [method|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexBuilder.java#L1363] that Yanlin is trying to fix is public (that's why it's used in unit test). Maybe you were looking into a different protected method? > RexBuilder may lose decimal fraction for creating literal with DECIMAL type > --- > > Key: CALCITE-3587 > URL: https://issues.apache.org/jira/browse/CALCITE-3587 > Project: Calcite > Issue Type: Bug >Reporter: Wang Yanlin >Assignee: Wang Yanlin >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > this test > {code:java} > // RexBuilderTest > @Test public void testDecimal() { > final RelDataTypeFactory typeFactory = > new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT); > final RelDataType type = typeFactory.createSqlType(SqlTypeName.DECIMAL, 4, 2); > final RexBuilder builder = new RexBuilder(typeFactory); > final RexLiteral literal = (RexLiteral) builder.makeLiteral(12.3, type, > false); > Comparable value = literal.getValue(); > assertThat(value.toString(), is("12.3")); > } > {code} > fails with message > {code:java} > java.lang.AssertionError: > Expected: is "12.3" > but: was "12" > Expected :12.3 > Actual :12 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3387) Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is ambiguous" error
[ https://issues.apache.org/jira/browse/CALCITE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996108#comment-16996108 ] Chunwei Lei edited comment on CALCITE-3387 at 12/14/19 3:16 AM: Thanks for your review, [~julianhyde]. On one hand, the behavior is kind of the same with {{expandStar}} in which common columns will be rewrited to {{COALESCE(...)}}[1]. On the other hand, I think it makes sense to rewrite it to COALESCE since the columns will be formatted to fully-qualify columns during validation. [1] [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6475] was (Author: chunwei lei): Thanks for your review, [~julianhyde]. On one hand, the behavior is kind of the same with {{expandStar}} in which common columns will be rewrited to {{COALESCE(...)}}[1]. On the other hand, I think it makes sense to rewrite it to COALESCE since the columns will be formatted to fully-qualify columns during validation. [1] [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6475] \{{}} > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error > -- > > Key: CALCITE-3387 > URL: https://issues.apache.org/jira/browse/CALCITE-3387 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Chunwei Lei >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error. Here is the query: > {code} > select deptno, count(*) > from emp > join dept using (deptno) > group by deptno; > {code} > Because of USING, the two deptno fields should be merged into one, and > therefore {{deptno}} is not ambiguous. That query works in Oracle. Also note > that > {code}select deptno > from emp > join dept using (deptno) > {code} > works correctly in Calcite, and has since CALCITE-2227 was fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3387) Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is ambiguous" error
[ https://issues.apache.org/jira/browse/CALCITE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996108#comment-16996108 ] Chunwei Lei commented on CALCITE-3387: -- Thanks for your review, [~julianhyde]. On one hand, the behavior is kind of the same with {{expandStar}} in which common columns will be rewrited to {{COALESCE(...)}}[1]. On the other hand, I think it makes sense to rewrite it to COALESCE since the columns will be formatted to fully-qualify columns during validation. [1] [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6475] {{}} > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error > -- > > Key: CALCITE-3387 > URL: https://issues.apache.org/jira/browse/CALCITE-3387 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Chunwei Lei >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error. Here is the query: > {code} > select deptno, count(*) > from emp > join dept using (deptno) > group by deptno; > {code} > Because of USING, the two deptno fields should be merged into one, and > therefore {{deptno}} is not ambiguous. That query works in Oracle. Also note > that > {code}select deptno > from emp > join dept using (deptno) > {code} > works correctly in Calcite, and has since CALCITE-2227 was fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3387) Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is ambiguous" error
[ https://issues.apache.org/jira/browse/CALCITE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996108#comment-16996108 ] Chunwei Lei edited comment on CALCITE-3387 at 12/14/19 3:15 AM: Thanks for your review, [~julianhyde]. On one hand, the behavior is kind of the same with {{expandStar}} in which common columns will be rewrited to {{COALESCE(...)}}[1]. On the other hand, I think it makes sense to rewrite it to COALESCE since the columns will be formatted to fully-qualify columns during validation. [1] [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6475] \{{}} was (Author: chunwei lei): Thanks for your review, [~julianhyde]. On one hand, the behavior is kind of the same with {{expandStar}} in which common columns will be rewrited to {{COALESCE(...)}}[1]. On the other hand, I think it makes sense to rewrite it to COALESCE since the columns will be formatted to fully-qualify columns during validation. [1] [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6475] {{}} > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error > -- > > Key: CALCITE-3387 > URL: https://issues.apache.org/jira/browse/CALCITE-3387 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Chunwei Lei >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error. Here is the query: > {code} > select deptno, count(*) > from emp > join dept using (deptno) > group by deptno; > {code} > Because of USING, the two deptno fields should be merged into one, and > therefore {{deptno}} is not ambiguous. That query works in Oracle. Also note > that > {code}select deptno > from emp > join dept using (deptno) > {code} > works correctly in Calcite, and has since CALCITE-2227 was fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3587) RexBuilder may lose decimal fraction for creating literal with DECIMAL type
[ https://issues.apache.org/jira/browse/CALCITE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996072#comment-16996072 ] Julian Hyde commented on CALCITE-3587: -- I see now that {{clean}} does not throw when given a Java {{float}} value for a SQL {{DECIMAL}} literal. Do we think that it should? I am inclined to say yes. > RexBuilder may lose decimal fraction for creating literal with DECIMAL type > --- > > Key: CALCITE-3587 > URL: https://issues.apache.org/jira/browse/CALCITE-3587 > Project: Calcite > Issue Type: Bug >Reporter: Wang Yanlin >Assignee: Wang Yanlin >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > this test > {code:java} > // RexBuilderTest > @Test public void testDecimal() { > final RelDataTypeFactory typeFactory = > new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT); > final RelDataType type = typeFactory.createSqlType(SqlTypeName.DECIMAL, 4, 2); > final RexBuilder builder = new RexBuilder(typeFactory); > final RexLiteral literal = (RexLiteral) builder.makeLiteral(12.3, type, > false); > Comparable value = literal.getValue(); > assertThat(value.toString(), is("12.3")); > } > {code} > fails with message > {code:java} > java.lang.AssertionError: > Expected: is "12.3" > but: was "12" > Expected :12.3 > Actual :12 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3387) Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is ambiguous" error
[ https://issues.apache.org/jira/browse/CALCITE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996069#comment-16996069 ] Julian Hyde commented on CALCITE-3387: -- The fix looks OK. But it's better to avoid modifying the SqlNode tree as we validate it. Would it be possible to keep the SqlNode tree the same and perform the necessary rewrite during Sql-to-rel conversion? > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error > -- > > Key: CALCITE-3387 > URL: https://issues.apache.org/jira/browse/CALCITE-3387 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Chunwei Lei >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error. Here is the query: > {code} > select deptno, count(*) > from emp > join dept using (deptno) > group by deptno; > {code} > Because of USING, the two deptno fields should be merged into one, and > therefore {{deptno}} is not ambiguous. That query works in Oracle. Also note > that > {code}select deptno > from emp > join dept using (deptno) > {code} > works correctly in Calcite, and has since CALCITE-2227 was fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3587) RexBuilder may lose decimal fraction for creating literal with DECIMAL type
[ https://issues.apache.org/jira/browse/CALCITE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996068#comment-16996068 ] Julian Hyde commented on CALCITE-3587: -- [~danny0405] is correct. You should call {{makeExactLiteral}}. I guess when you called {{makeLiteral(Object, RelDataType, boolean))}} it gave an error. If so, the error was correct. You should not call it with a Java {{float}} value for a SQL {{DECIMAL}} type. Java {{float}} is inexact - it probably has value 12.28 or something - and therefore the correct type for a {{DECIMAL}} literal is a {{java.math.BigDecimal}}. If you change your test from {{builder.makeLiteral(12.3, type, false)}} to {{builder.makeExactLiteral(new BigDecimal("12.3"), type)}} I think it will succeed. > RexBuilder may lose decimal fraction for creating literal with DECIMAL type > --- > > Key: CALCITE-3587 > URL: https://issues.apache.org/jira/browse/CALCITE-3587 > Project: Calcite > Issue Type: Bug >Reporter: Wang Yanlin >Assignee: Wang Yanlin >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > this test > {code:java} > // RexBuilderTest > @Test public void testDecimal() { > final RelDataTypeFactory typeFactory = > new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT); > final RelDataType type = typeFactory.createSqlType(SqlTypeName.DECIMAL, 4, 2); > final RexBuilder builder = new RexBuilder(typeFactory); > final RexLiteral literal = (RexLiteral) builder.makeLiteral(12.3, type, > false); > Comparable value = literal.getValue(); > assertThat(value.toString(), is("12.3")); > } > {code} > fails with message > {code:java} > java.lang.AssertionError: > Expected: is "12.3" > but: was "12" > Expected :12.3 > Actual :12 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3387) Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is ambiguous" error
[ https://issues.apache.org/jira/browse/CALCITE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995646#comment-16995646 ] Chunwei Lei commented on CALCITE-3387: -- I opened a pull request: [https://github.com/apache/calcite/pull/1655]. > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error > -- > > Key: CALCITE-3387 > URL: https://issues.apache.org/jira/browse/CALCITE-3387 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Chunwei Lei >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error. Here is the query: > {code} > select deptno, count(*) > from emp > join dept using (deptno) > group by deptno; > {code} > Because of USING, the two deptno fields should be merged into one, and > therefore {{deptno}} is not ambiguous. That query works in Oracle. Also note > that > {code}select deptno > from emp > join dept using (deptno) > {code} > works correctly in Calcite, and has since CALCITE-2227 was fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CALCITE-3599) Initial the digest of RexRangeRef to avoid null string
[ https://issues.apache.org/jira/browse/CALCITE-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunwei Lei resolved CALCITE-3599. -- Fix Version/s: 1.22.0 Resolution: Fixed Fixed in [https://github.com/apache/calcite/commit/8e9261dc14f1e50563b259dcdb42cd4415065078]. Thanks for your reviews, [~julianhyde], [~amaliujia], and [~danny0405]. > Initial the digest of RexRangeRef to avoid null string > -- > > Key: CALCITE-3599 > URL: https://issues.apache.org/jira/browse/CALCITE-3599 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Chunwei Lei >Assignee: Chunwei Lei >Priority: Minor > Labels: pull-request-available > Fix For: 1.22.0 > > Attachments: image-2019-12-12-23-49-18-977.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, the digest of {{RexRangeRef}} is always {{null}} which is > confusing when we try to debug the code. I suggest changing it to a more > meaningful string such as {{offset(0)}}. > !image-2019-12-12-23-49-18-977.png|width=529,height=234! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3387) Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is ambiguous" error
[ https://issues.apache.org/jira/browse/CALCITE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated CALCITE-3387: Labels: pull-request-available (was: ) > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error > -- > > Key: CALCITE-3387 > URL: https://issues.apache.org/jira/browse/CALCITE-3387 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Chunwei Lei >Priority: Major > Labels: pull-request-available > > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error. Here is the query: > {code} > select deptno, count(*) > from emp > join dept using (deptno) > group by deptno; > {code} > Because of USING, the two deptno fields should be merged into one, and > therefore {{deptno}} is not ambiguous. That query works in Oracle. Also note > that > {code}select deptno > from emp > join dept using (deptno) > {code} > works correctly in Calcite, and has since CALCITE-2227 was fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (CALCITE-3387) Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is ambiguous" error
[ https://issues.apache.org/jira/browse/CALCITE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunwei Lei reassigned CALCITE-3387: Assignee: Chunwei Lei > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error > -- > > Key: CALCITE-3387 > URL: https://issues.apache.org/jira/browse/CALCITE-3387 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Chunwei Lei >Priority: Major > > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error. Here is the query: > {code} > select deptno, count(*) > from emp > join dept using (deptno) > group by deptno; > {code} > Because of USING, the two deptno fields should be merged into one, and > therefore {{deptno}} is not ambiguous. That query works in Oracle. Also note > that > {code}select deptno > from emp > join dept using (deptno) > {code} > works correctly in Calcite, and has since CALCITE-2227 was fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3599) Initial the digest of RexRangeRef to avoid null string
[ https://issues.apache.org/jira/browse/CALCITE-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995570#comment-16995570 ] Chunwei Lei commented on CALCITE-3599: -- Good idea. PR was updated. > Initial the digest of RexRangeRef to avoid null string > -- > > Key: CALCITE-3599 > URL: https://issues.apache.org/jira/browse/CALCITE-3599 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Chunwei Lei >Assignee: Chunwei Lei >Priority: Minor > Labels: pull-request-available > Attachments: image-2019-12-12-23-49-18-977.png > > Time Spent: 20m > Remaining Estimate: 0h > > Currently, the digest of {{RexRangeRef}} is always {{null}} which is > confusing when we try to debug the code. I suggest changing it to a more > meaningful string such as {{offset(0)}}. > !image-2019-12-12-23-49-18-977.png|width=529,height=234! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3600) Rule to solve the filter partially by end application and remaining by calcite
[ https://issues.apache.org/jira/browse/CALCITE-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995481#comment-16995481 ] Stamatis Zampetakis commented on CALCITE-3600: -- Another API that seems relevant for this use-case is [FilterableTable|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/schema/FilterableTable.java]. > Rule to solve the filter partially by end application and remaining by calcite > -- > > Key: CALCITE-3600 > URL: https://issues.apache.org/jira/browse/CALCITE-3600 > Project: Calcite > Issue Type: Wish >Reporter: anjali shrishrimal >Priority: Minor > > Add Rule to check if filter condition is solvable by end application. If part > of the filter condition can be solved by end application, then it should get > pushed to end application, and remaining part which can not be solved by end > application, should get solved by calcite secondarily (i.e. upon fetch remove > unwanted data as per filter condition) > > Consider an application which can solve only limited operators while > filtering, say "=,<,>" and can not solve operator 'LIKE'. > > Example, filter condition is "id > 1000 AND name LIKE '%an%'" > > we would like to restrict the condition passed to application to "id > 1000" > and remaining part "name LIKE '%an%'" should get solved by calcite. (The way > it does for csv-adapter) > > To replicate the situation, consider test-case testFilter in MongoAdapterTest > (org.apache.calcite.adapter.mongodb.MongoAdapterTest) of mongo-adapter. > And modify it like below: > > @Test public void testFilter() > { assertModel(MODEL) .query("select state, city from zips where state = 'CA' > AND city LIKE '%E%'") .returnsUnordered("STATE=CA; CITY=LOS ANGELES", > "STATE=CA; CITY=BELL GARDENS"); } > > > Expected output of above query : > STATE=CA; CITY=LOS ANGELES, > STATE=CA; CITY=BELL GARDENS > > Expected plan : > EnumerableFilter(condition=[LIKE(CAST(ITEM($0, 'city')):VARCHAR(20), '%E%')]) > {{MongoToEnumerableConverter}} > {{MongoProject(STATE=[CAST(ITEM($0, 'state')):VARCHAR(2)], > CITY=[CAST(ITEM($0, 'city')):VARCHAR(20)])}} > {{MongoFilter(condition=[=(CAST(ITEM($0, 'state')):VARCHAR(2), 'CA')])}} > {{MongoTableScan(table=[[mongo_raw, zips]])}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3600) Rule to solve the filter partially by end application and remaining by calcite
[ https://issues.apache.org/jira/browse/CALCITE-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995478#comment-16995478 ] Stamatis Zampetakis commented on CALCITE-3600: -- Calcite already offers a high level API to achieve this by sub-classing the [CalcRelSplitter|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/CalcRelSplitter.java]. Have you checked this class? Another issue that I think is relevant is CALCITE-3541. > Rule to solve the filter partially by end application and remaining by calcite > -- > > Key: CALCITE-3600 > URL: https://issues.apache.org/jira/browse/CALCITE-3600 > Project: Calcite > Issue Type: Wish >Reporter: anjali shrishrimal >Priority: Minor > > Add Rule to check if filter condition is solvable by end application. If part > of the filter condition can be solved by end application, then it should get > pushed to end application, and remaining part which can not be solved by end > application, should get solved by calcite secondarily (i.e. upon fetch remove > unwanted data as per filter condition) > > Consider an application which can solve only limited operators while > filtering, say "=,<,>" and can not solve operator 'LIKE'. > > Example, filter condition is "id > 1000 AND name LIKE '%an%'" > > we would like to restrict the condition passed to application to "id > 1000" > and remaining part "name LIKE '%an%'" should get solved by calcite. (The way > it does for csv-adapter) > > To replicate the situation, consider test-case testFilter in MongoAdapterTest > (org.apache.calcite.adapter.mongodb.MongoAdapterTest) of mongo-adapter. > And modify it like below: > > @Test public void testFilter() > { assertModel(MODEL) .query("select state, city from zips where state = 'CA' > AND city LIKE '%E%'") .returnsUnordered("STATE=CA; CITY=LOS ANGELES", > "STATE=CA; CITY=BELL GARDENS"); } > > > Expected output of above query : > STATE=CA; CITY=LOS ANGELES, > STATE=CA; CITY=BELL GARDENS > > Expected plan : > EnumerableFilter(condition=[LIKE(CAST(ITEM($0, 'city')):VARCHAR(20), '%E%')]) > {{MongoToEnumerableConverter}} > {{MongoProject(STATE=[CAST(ITEM($0, 'state')):VARCHAR(2)], > CITY=[CAST(ITEM($0, 'city')):VARCHAR(20)])}} > {{MongoFilter(condition=[=(CAST(ITEM($0, 'state')):VARCHAR(2), 'CA')])}} > {{MongoTableScan(table=[[mongo_raw, zips]])}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3587) RexBuilder may lose decimal fraction for creating literal with DECIMAL type
[ https://issues.apache.org/jira/browse/CALCITE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995452#comment-16995452 ] Danny Chen commented on CALCITE-3587: - Did you notice that there is a method "RexBuilder#makeExactLiteral(BigDecimal" [1], and the method you use is protect scope which means it is not a public API ? [1] https://github.com/apache/calcite/blob/28d5f6f4ad770cd505ba0e5506b7559879600b4e/core/src/main/java/org/apache/calcite/rex/RexBuilder.java#L979 > RexBuilder may lose decimal fraction for creating literal with DECIMAL type > --- > > Key: CALCITE-3587 > URL: https://issues.apache.org/jira/browse/CALCITE-3587 > Project: Calcite > Issue Type: Bug >Reporter: Wang Yanlin >Assignee: Wang Yanlin >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > this test > {code:java} > // RexBuilderTest > @Test public void testDecimal() { > final RelDataTypeFactory typeFactory = > new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT); > final RelDataType type = typeFactory.createSqlType(SqlTypeName.DECIMAL, 4, 2); > final RexBuilder builder = new RexBuilder(typeFactory); > final RexLiteral literal = (RexLiteral) builder.makeLiteral(12.3, type, > false); > Comparable value = literal.getValue(); > assertThat(value.toString(), is("12.3")); > } > {code} > fails with message > {code:java} > java.lang.AssertionError: > Expected: is "12.3" > but: was "12" > Expected :12.3 > Actual :12 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3600) Rule to solve the filter partially by end application and remaining by calcite
[ https://issues.apache.org/jira/browse/CALCITE-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995451#comment-16995451 ] Rui Wang commented on CALCITE-3600: --- I think in BeamSQL we had same problem, due to the reason that underlying source read API may not support all filter conditions. As a beginning to solve it, we thought at least need to implement a RexCall splitter for specific sources to split filter conditions to two conditions, which was a hard problem. Think about A OR B that A is a valid condition but B is not but we cannot split it (cannot lose the OR to move only A to the source). But it can be split for A AND B. It will also become complicated for nested condition. > Rule to solve the filter partially by end application and remaining by calcite > -- > > Key: CALCITE-3600 > URL: https://issues.apache.org/jira/browse/CALCITE-3600 > Project: Calcite > Issue Type: Wish >Reporter: anjali shrishrimal >Priority: Minor > > Add Rule to check if filter condition is solvable by end application. If part > of the filter condition can be solved by end application, then it should get > pushed to end application, and remaining part which can not be solved by end > application, should get solved by calcite secondarily (i.e. upon fetch remove > unwanted data as per filter condition) > > Consider an application which can solve only limited operators while > filtering, say "=,<,>" and can not solve operator 'LIKE'. > > Example, filter condition is "id > 1000 AND name LIKE '%an%'" > > we would like to restrict the condition passed to application to "id > 1000" > and remaining part "name LIKE '%an%'" should get solved by calcite. (The way > it does for csv-adapter) > > To replicate the situation, consider test-case testFilter in MongoAdapterTest > (org.apache.calcite.adapter.mongodb.MongoAdapterTest) of mongo-adapter. > And modify it like below: > > @Test public void testFilter() > { assertModel(MODEL) .query("select state, city from zips where state = 'CA' > AND city LIKE '%E%'") .returnsUnordered("STATE=CA; CITY=LOS ANGELES", > "STATE=CA; CITY=BELL GARDENS"); } > > > Expected output of above query : > STATE=CA; CITY=LOS ANGELES, > STATE=CA; CITY=BELL GARDENS > > Expected plan : > EnumerableFilter(condition=[LIKE(CAST(ITEM($0, 'city')):VARCHAR(20), '%E%')]) > {{MongoToEnumerableConverter}} > {{MongoProject(STATE=[CAST(ITEM($0, 'state')):VARCHAR(2)], > CITY=[CAST(ITEM($0, 'city')):VARCHAR(20)])}} > {{MongoFilter(condition=[=(CAST(ITEM($0, 'state')):VARCHAR(2), 'CA')])}} > {{MongoTableScan(table=[[mongo_raw, zips]])}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3600) Rule to solve the filter partially by end application and remaining by calcite
[ https://issues.apache.org/jira/browse/CALCITE-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995451#comment-16995451 ] Rui Wang edited comment on CALCITE-3600 at 12/13/19 8:15 AM: - I think in BeamSQL we had same problem, due to the reason that underlying source read API may not support all filter conditions. As a beginning to solve it, we thought at least need to implement a RexCall splitter for specific sources to split filter conditions to two conditions(one can be executed by source and another cannot), which was a hard problem. Think about A OR B that A is a valid condition but B is not but we cannot split it (cannot lose the OR to move only A to the source). But it can be split for A AND B. It will also become complicated for nested condition. was (Author: amaliujia): I think in BeamSQL we had same problem, due to the reason that underlying source read API may not support all filter conditions. As a beginning to solve it, we thought at least need to implement a RexCall splitter for specific sources to split filter conditions to two conditions, which was a hard problem. Think about A OR B that A is a valid condition but B is not but we cannot split it (cannot lose the OR to move only A to the source). But it can be split for A AND B. It will also become complicated for nested condition. > Rule to solve the filter partially by end application and remaining by calcite > -- > > Key: CALCITE-3600 > URL: https://issues.apache.org/jira/browse/CALCITE-3600 > Project: Calcite > Issue Type: Wish >Reporter: anjali shrishrimal >Priority: Minor > > Add Rule to check if filter condition is solvable by end application. If part > of the filter condition can be solved by end application, then it should get > pushed to end application, and remaining part which can not be solved by end > application, should get solved by calcite secondarily (i.e. upon fetch remove > unwanted data as per filter condition) > > Consider an application which can solve only limited operators while > filtering, say "=,<,>" and can not solve operator 'LIKE'. > > Example, filter condition is "id > 1000 AND name LIKE '%an%'" > > we would like to restrict the condition passed to application to "id > 1000" > and remaining part "name LIKE '%an%'" should get solved by calcite. (The way > it does for csv-adapter) > > To replicate the situation, consider test-case testFilter in MongoAdapterTest > (org.apache.calcite.adapter.mongodb.MongoAdapterTest) of mongo-adapter. > And modify it like below: > > @Test public void testFilter() > { assertModel(MODEL) .query("select state, city from zips where state = 'CA' > AND city LIKE '%E%'") .returnsUnordered("STATE=CA; CITY=LOS ANGELES", > "STATE=CA; CITY=BELL GARDENS"); } > > > Expected output of above query : > STATE=CA; CITY=LOS ANGELES, > STATE=CA; CITY=BELL GARDENS > > Expected plan : > EnumerableFilter(condition=[LIKE(CAST(ITEM($0, 'city')):VARCHAR(20), '%E%')]) > {{MongoToEnumerableConverter}} > {{MongoProject(STATE=[CAST(ITEM($0, 'state')):VARCHAR(2)], > CITY=[CAST(ITEM($0, 'city')):VARCHAR(20)])}} > {{MongoFilter(condition=[=(CAST(ITEM($0, 'state')):VARCHAR(2), 'CA')])}} > {{MongoTableScan(table=[[mongo_raw, zips]])}} -- This message was sent by Atlassian Jira (v8.3.4#803005)