[jira] [Comment Edited] (CALCITE-2670) Combine similar JSON aggregate functions in operator table
[ https://issues.apache.org/jira/browse/CALCITE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706188#comment-16706188 ] Hongze Zhang edited comment on CALCITE-2670 at 12/2/18 11:01 AM: - [~julianhyde] Sorry for pinging you again - I know you are busy on the new release. But do you think whether we could bring this to release 1.18 too (along with CALCITE-2266) {color:#33}?{color} So that users need not to change their implementation in future versions if they implemented JSON_ARRAYAGG or JSON_OBJECTAGG on top of 1.18. was (Author: zhztheplayer): [~julianhyde] Sorry for pinging you again - I know you are busy on the new release. But do you think whether we could bring this to release 1.18 too (along with CALCITE-2266) {color:#33}?{color} So that users need not to change their implementation in future versions if they implemented JSON_ARRAYAGG on top of 1.18. > Combine similar JSON aggregate functions in operator table > -- > > Key: CALCITE-2670 > URL: https://issues.apache.org/jira/browse/CALCITE-2670 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Hongze Zhang >Assignee: Julian Hyde >Priority: Major > > In current code master of Calcite, operator *JSON_ARRAYAGG* and > *JSON_OBJECTAGG* are separated to *JSON_ARRAYAGG_NULL_ON_NULL*, > *JSON_ARRAYAGG_ABSENT_ON_NULL*, *JSON_OBJECTAGG_NULL_ON_NULL* and > *JSON_OBJECTAGG_ABSENT_ON_NULL* in SqlStdOperatorTable.java. This is OK for > now, but may deliver more difficulty on extending syntax of *JSON_ARRAYAGG* > and *JSON_OBJECTAGG* and on implementing logical plan by adapters. > This is basically to combine the 4 operators to 2, this is a improvement list: > # Combine *JSON_ARRAYAGG_NULL_ON_NULL* and *JSON_ARRAYAGG_ABSENT_ON_NULL;* > # Combine *JSON_OBJECTAGG_NULL_ON_NULL* and *JSON_OBJECTAGG_ABSENT_ON_NULL;* > # *SYMBOL* type Support for *JavaTypeFactoryImpl#getJavaClass* This is to > generate *Enum* java type for *SYMBOL;* > # Add SQL-to-Rel test cases for JSON functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CALCITE-2670) Combine similar JSON aggregate functions in operator table
[ https://issues.apache.org/jira/browse/CALCITE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691560#comment-16691560 ] Hongze Zhang edited comment on CALCITE-2670 at 11/19/18 11:08 AM: -- {quote}If it's data, it has to follow the SQL type system, and of course you can't figure out a more specific type than Enum. But should it be data? I think the argument should be hard-coded into the call. {quote} Thanks [~julianhyde] and I agree that hard-coded argument is better. But I am not sure if there is any proper way to pass a non-data argument to an aggregate call without modifing massive code when implementing the plan. Now AggregateCall supports integer input ref list as the argument list, rather than a rex node list. Which means, if we make each flag an argument, the argment will be convert to data by the implementor. If there is any good way to avoid this please let me know. Now Calcite's JSON_OBJECTAGG has only one flag - the error behavior, but the function has 2 flags in SQL standard. If using different operators to represent the same function with different flags, when flag count is up to 2, 3 ,4 or more, registered operator count will be a lot. (e.g. JSON_OBJECTAGG_NULL_ON_NULL_WITH_UNIQUE_KEY, JSON_OBJECTAGG_NULL_ON_NULL_WITHOUT_UNIQUE_KEY, JSON_OBJECTAGG_ERROR_ON_NULL_WITH_UNIQUE_KEY... ), so I believe it is neccesary to combine such operators. If leaving them separated is OK, please also let me know. was (Author: zhztheplayer): bq. If it's data, it has to follow the SQL type system, and of course you can't figure out a more specific type than Enum. But should it be data? I think the argument should be hard-coded into the call. Thanks [~julianhyde] and I agree that hard-coded argument is better. But I am not sure if there is any proper way to pass a non-data argument to an aggregate call without modifing massive code when implementing the plan. Now AggregateCall supports integer input ref list as the argument list, rather than a rex node list. Which means, if we make each flag an argument, the argment will be convert to data by the implementor. If there is any good way to avoid this please let me know. Now Calcite's JSON_OBJECTAGG has only one flag - the error behavior, but the function has 2 flags in SQL standard. If using different operators to represent the same function with different flags, when flag count is up to 2, 3 ,4 or more, registered operator count will be a lot. (e.g. JSON_OBJECTAGG_NULL_ON_NULL_WITH_UNIQUE_KEY, JSON_OBJECTAGG_NULL_ON_NULL_WITHOUT_UNIQUE_KEY, JSON_OBJECTAGG_ERROR_ON_NULL_WITH_UNIQUE_KEY... ), so I believe it is neccesary to combine such operators. > Combine similar JSON aggregate functions in operator table > -- > > Key: CALCITE-2670 > URL: https://issues.apache.org/jira/browse/CALCITE-2670 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Hongze Zhang >Assignee: Julian Hyde >Priority: Major > > In current code master of Calcite, operator *JSON_ARRAYAGG* and > *JSON_OBJECTAGG* are separated to *JSON_ARRAYAGG_NULL_ON_NULL*, > *JSON_ARRAYAGG_ABSENT_ON_NULL*, *JSON_OBJECTAGG_NULL_ON_NULL* and > *JSON_OBJECTAGG_ABSENT_ON_NULL* in SqlStdOperatorTable.java. This is OK for > now, but may deliver more difficulty on extending syntax of *JSON_ARRAYAGG* > and *JSON_OBJECTAGG* and on implementing logical plan by adapters. > This is basically to combine the 4 operators to 2, this is a improvement list: > # Combine *JSON_ARRAYAGG_NULL_ON_NULL* and *JSON_ARRAYAGG_ABSENT_ON_NULL;* > # Combine *JSON_OBJECTAGG_NULL_ON_NULL* and *JSON_OBJECTAGG_ABSENT_ON_NULL;* > # *SYMBOL* type Support for *JavaTypeFactoryImpl#getJavaClass* > This is to generate *Enum* java type for *SYMBOL*. Current version of > Calcite generates *Object[]*, which delivers type casting error, or some > method compatible problems {color:#33}when Calcite tries to pass enum > parameters to an aggregate function;{color} > # Add SQL-to-Rel test cases for JSON functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CALCITE-2670) Combine similar JSON aggregate functions in operator table
[ https://issues.apache.org/jira/browse/CALCITE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688817#comment-16688817 ] Hongze Zhang edited comment on CALCITE-2670 at 11/16/18 6:58 AM: - bq. Why did you convert some parameters from particular types (e.g. SqlJsonQueryEmptyOrErrorBehavior) to untyped Enum? Thanks [~julianhyde] and I missed explaining this point. Everything works before because there were methods like SqlFunctions#jsonObjectAggAddNullOnNull that is for the specific operator like JSON_OBJECTAGG_NULL_ON_NULL, I have deleted these methods and operators in the patch. In the patch, I have added SYMBOL-to-Enum type transform to JavaTypeFactoryImpl#getJavaClass. That is, the built-in rel implementer does not know the original type of a SYMBOL. And rel implementer applies explicit type casting when translating SYMBOL parameter that is from an aggregate call (The enum flag is actually an *INPUT_REF* in aggregate call layer, so casting is needed), so if we transform SYMBOL to Enum in JavaTypeFactoryImpl#getJavaClass, we can only use Enum but not particular type for the related methods in SqlFunctions.java. And for non-aggregate operators like *JSON_QUERY*, I believe the argument Type needs to be changed to *Enum* too. In RexToLixTranslator, an enum flag inside an JSON_QUERY rex call is actually showed as a *LOCAL_REF*, which is to be converted to ConstantExpression, which does not need a type casting when implementing the outer method call, so we see non errors even we define a paticular type in SqlFunctions.java. However this is not robust - any operation (e.g. a rule) to extract the enum flag to be a *INPUT_REF* will cause type compatible error. Finally, SYMBOL-to-Enum type casting is a trade-off - if we use a JavaRecordType to carry a particular enum type instead of using BasicSqlType, we could use particular types in SqlFunctions.java. But this way needs more code changes, and JavaRecordType is still experimental and not recommended for use. was (Author: zhztheplayer): bq. Why did you convert some parameters from particular types (e.g. SqlJsonQueryEmptyOrErrorBehavior) to untyped Enum? Thanks [~julianhyde] and I missed explaining this point. In the patch, I have added SYMBOL-to-Enum type transform to JavaTypeFactoryImpl#getJavaClass. That is, the built-in rel implementer does not know the original type of a SYMBOL. And rel implementer applies explicitly type casting when translating SYMBOL parameter that is from an aggregate call, so if we transform SYMBOL to Enum in JavaTypeFactoryImpl#getJavaClass, we can only use Enum for the related methods in SqlFunctions.java. It works before because there were methods like SqlFunctions#jsonObjectAggAddNullOnNull that is for the specific operator like JSON_OBJECTAGG_NULL_ON_NULL, I have deleted these methods in the patch. And SYMBOL-to-Enum type casting is a trade-off - If we use a JavaRecordType to carry a particular enum type instead of using BasicSqlType, we could use particular types in SqlFunctions.java. But this way needs more code changes, and JavaRecordType is still experimental and not recommended for use. > Combine similar JSON aggregate functions in operator table > -- > > Key: CALCITE-2670 > URL: https://issues.apache.org/jira/browse/CALCITE-2670 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Hongze Zhang >Assignee: Julian Hyde >Priority: Major > > In current code master of Calcite, operator *JSON_ARRAYAGG* and > *JSON_OBJECTAGG* are separated to *JSON_ARRAYAGG_NULL_ON_NULL*, > *JSON_ARRAYAGG_ABSENT_ON_NULL*, *JSON_OBJECTAGG_NULL_ON_NULL* and > *JSON_OBJECTAGG_ABSENT_ON_NULL* in SqlStdOperatorTable.java. This is OK for > now, but may deliver more difficulty on extending syntax of *JSON_ARRAYAGG* > and *JSON_OBJECTAGG* and on implementing logical plan by adapters. > This is basically to combine the 4 operators to 2, this is a improvement list: > # Combine *JSON_ARRAYAGG_NULL_ON_NULL* and *JSON_ARRAYAGG_ABSENT_ON_NULL;* > # Combine *JSON_OBJECTAGG_NULL_ON_NULL* and *JSON_OBJECTAGG_ABSENT_ON_NULL;* > # *SYMBOL* type Support for *JavaTypeFactoryImpl#getJavaClass* > This is to generate *Enum* java type for *SYMBOL*. Current version of > Calcite generates *Object[]*, which delivers type casting error, or some > method compatible problems {color:#33}when Calcite tries to pass enum > parameters to an aggregate function;{color} > # Add SQL-to-Rel test cases for JSON functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CALCITE-2670) Combine similar JSON aggregate functions in operator table
[ https://issues.apache.org/jira/browse/CALCITE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688817#comment-16688817 ] Hongze Zhang edited comment on CALCITE-2670 at 11/16/18 12:06 AM: -- bq. Why did you convert some parameters from particular types (e.g. SqlJsonQueryEmptyOrErrorBehavior) to untyped Enum? Thanks [~julianhyde] and I missed explaining this point. In the patch, I have added SYMBOL-to-Enum type transform to JavaTypeFactoryImpl#getJavaClass. That is, the built-in rel implementer does not know the original type of a SYMBOL. And rel implementer applies explicitly type casting when translating SYMBOL parameter that is from an aggregate call, so if we transform SYMBOL to Enum in JavaTypeFactoryImpl#getJavaClass, we can only use Enum for the related methods in SqlFunctions.java. It works before because there were methods like SqlFunctions#jsonObjectAggAddNullOnNull that is for the specific operator like JSON_OBJECTAGG_NULL_ON_NULL, I have deleted these methods in the patch. And SYMBOL-to-Enum type casting is a trade-off - If we use a JavaRecordType to carry a particular enum type instead of using BasicSqlType, we could use particular types in SqlFunctions.java. But this way needs more code changes, and JavaRecordType is still experimental and not recommended for use. was (Author: zhztheplayer): bq. Why did you convert some parameters from particular types (e.g. SqlJsonQueryEmptyOrErrorBehavior) to untyped Enum? Thanks [~julianhyde] and I missed explaining this point. In the patch, I have added SYMBOL-to-Enum type transform to JavaTypeFactoryImpl#getJavaClass. That is, the built-in rel implementer does not know the original type of a SYMBOL. And rel implementer applies explicitly type casting when translating SYMBOL parameter that is from an aggregate call, so if we transform SYMBOL to Enum in JavaTypeFactoryImpl#getJavaClass, we can only use Enum for the related methods in SqlFunctions.java. It works before because there were methods like SqlFunctions#jsonObjectAggAddNullOnNull that is for the specific operator like JSON_OBJECTAGG_NULL_ON_NULL, I have deleted these methods in the patch. And SYMBOL-to-Enum type casting is a trade-off - If we use a JavaRecordType to carry a particular enum type instead of using RelDataType, we could use particular types in SqlFunctions.java. But this way needs more code changes, and JavaRecordType is still experimental and not recommended for use. > Combine similar JSON aggregate functions in operator table > -- > > Key: CALCITE-2670 > URL: https://issues.apache.org/jira/browse/CALCITE-2670 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Hongze Zhang >Assignee: Julian Hyde >Priority: Major > > In current code master of Calcite, operator *JSON_ARRAYAGG* and > *JSON_OBJECTAGG* are separated to *JSON_ARRAYAGG_NULL_ON_NULL*, > *JSON_ARRAYAGG_ABSENT_ON_NULL*, *JSON_OBJECTAGG_NULL_ON_NULL* and > *JSON_OBJECTAGG_ABSENT_ON_NULL* in SqlStdOperatorTable.java. This is OK for > now, but may deliver more difficulty on extending syntax of *JSON_ARRAYAGG* > and *JSON_OBJECTAGG* and on implementing logical plan by adapters. > This is basically to combine the 4 operators to 2, this is a improvement list: > # Combine *JSON_ARRAYAGG_NULL_ON_NULL* and *JSON_ARRAYAGG_ABSENT_ON_NULL;* > # Combine *JSON_OBJECTAGG_NULL_ON_NULL* and *JSON_OBJECTAGG_ABSENT_ON_NULL;* > # *SYMBOL* type Support for *JavaTypeFactoryImpl#getJavaClass* > This is to generate *Enum* java type for *SYMBOL*. Current version of > Calcite generates *Object[]*, which delivers type casting error, or some > method compatible problems {color:#33}when Calcite tries to pass enum > parameters to an aggregate function;{color} > # Add SQL-to-Rel test cases for JSON functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)