[jira] [Comment Edited] (CALCITE-2670) Combine similar JSON aggregate functions in operator table

2018-12-02 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16706188#comment-16706188
 ] 

Hongze Zhang edited comment on CALCITE-2670 at 12/2/18 11:01 AM:
-

[~julianhyde] Sorry for pinging you again - I know you are busy on the new 
release. But do you think whether we could bring this to release 1.18 too 
(along with CALCITE-2266) {color:#33}?{color} So that users need not to 
change their implementation in future versions if they implemented 
JSON_ARRAYAGG or JSON_OBJECTAGG on top of 1.18.


was (Author: zhztheplayer):
[~julianhyde] Sorry for pinging you again - I know you are busy on the new 
release. But do you think whether we could bring this to release 1.18 too 
(along with CALCITE-2266) {color:#33}?{color} So that users need not to 
change their implementation in future versions if they implemented 
JSON_ARRAYAGG on top of 1.18.

> Combine similar JSON aggregate functions in operator table
> --
>
> Key: CALCITE-2670
> URL: https://issues.apache.org/jira/browse/CALCITE-2670
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Hongze Zhang
>Assignee: Julian Hyde
>Priority: Major
>
> In current code master of Calcite, operator *JSON_ARRAYAGG* and 
> *JSON_OBJECTAGG* are separated to *JSON_ARRAYAGG_NULL_ON_NULL*, 
> *JSON_ARRAYAGG_ABSENT_ON_NULL*, *JSON_OBJECTAGG_NULL_ON_NULL* and 
> *JSON_OBJECTAGG_ABSENT_ON_NULL* in SqlStdOperatorTable.java. This is OK for 
> now, but may deliver more difficulty on extending syntax of *JSON_ARRAYAGG* 
> and *JSON_OBJECTAGG* and on implementing logical plan by adapters.
> This is basically to combine the 4 operators to 2, this is a improvement list:
>  # Combine *JSON_ARRAYAGG_NULL_ON_NULL* and *JSON_ARRAYAGG_ABSENT_ON_NULL;*
>  # Combine *JSON_OBJECTAGG_NULL_ON_NULL* and *JSON_OBJECTAGG_ABSENT_ON_NULL;*
>  # *SYMBOL* type Support for *JavaTypeFactoryImpl#getJavaClass* This is to 
> generate *Enum* java type for *SYMBOL;*
>  # Add SQL-to-Rel test cases for JSON functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2670) Combine similar JSON aggregate functions in operator table

2018-11-19 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691560#comment-16691560
 ] 

Hongze Zhang edited comment on CALCITE-2670 at 11/19/18 11:08 AM:
--

{quote}If it's data, it has to follow the SQL type system, and of course you 
can't figure out a more specific type than Enum. But should it be data? I think 
the argument should be hard-coded into the call.
{quote}
Thanks [~julianhyde] and I agree that hard-coded argument is better. But I am 
not sure if there is any proper way to pass a non-data argument to an aggregate 
call without modifing massive code when implementing the plan. Now 
AggregateCall supports integer input ref list as the argument list, rather than 
a rex node list. Which means, if we make each flag an argument, the argment 
will be convert to data by the implementor. If there is any good way to avoid 
this please let me know.

Now Calcite's JSON_OBJECTAGG has only one flag - the error behavior, but the 
function has 2 flags in SQL standard. If using different operators to represent 
the same function with different flags, when flag count is up to 2, 3 ,4 or 
more, registered operator count will be a lot. (e.g. 
JSON_OBJECTAGG_NULL_ON_NULL_WITH_UNIQUE_KEY, 
JSON_OBJECTAGG_NULL_ON_NULL_WITHOUT_UNIQUE_KEY, 
JSON_OBJECTAGG_ERROR_ON_NULL_WITH_UNIQUE_KEY... ), so I believe it is neccesary 
to combine such operators. If leaving them separated is OK, please also let me 
know.


was (Author: zhztheplayer):
bq. If it's data, it has to follow the SQL type system, and of course you can't 
figure out a more specific type than Enum. But should it be data? I think the 
argument should be hard-coded into the call.

Thanks [~julianhyde] and I agree that hard-coded argument is better. But I am 
not sure if there is any proper way to pass a non-data argument to an aggregate 
call without modifing massive code when implementing the plan. Now 
AggregateCall supports integer input ref list as the argument list, rather than 
a rex node list. Which means, if we make each flag an argument, the argment 
will be convert to data by the implementor. If there is any good way to avoid 
this please let me know.

Now Calcite's JSON_OBJECTAGG has only one flag - the error behavior, but the 
function has 2 flags in SQL standard. If using different operators to represent 
the same function with different flags, when flag count is up to 2, 3 ,4 or 
more, registered operator count will be a lot. (e.g. 
JSON_OBJECTAGG_NULL_ON_NULL_WITH_UNIQUE_KEY, 
JSON_OBJECTAGG_NULL_ON_NULL_WITHOUT_UNIQUE_KEY, 
JSON_OBJECTAGG_ERROR_ON_NULL_WITH_UNIQUE_KEY... ), so I believe it is neccesary 
to combine such operators.

> Combine similar JSON aggregate functions in operator table
> --
>
> Key: CALCITE-2670
> URL: https://issues.apache.org/jira/browse/CALCITE-2670
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Hongze Zhang
>Assignee: Julian Hyde
>Priority: Major
>
> In current code master of Calcite, operator *JSON_ARRAYAGG* and 
> *JSON_OBJECTAGG* are separated to *JSON_ARRAYAGG_NULL_ON_NULL*, 
> *JSON_ARRAYAGG_ABSENT_ON_NULL*, *JSON_OBJECTAGG_NULL_ON_NULL* and 
> *JSON_OBJECTAGG_ABSENT_ON_NULL* in SqlStdOperatorTable.java. This is OK for 
> now, but may deliver more difficulty on extending syntax of *JSON_ARRAYAGG* 
> and *JSON_OBJECTAGG* and on implementing logical plan by adapters.
> This is basically to combine the 4 operators to 2, this is a improvement list:
>  # Combine *JSON_ARRAYAGG_NULL_ON_NULL* and *JSON_ARRAYAGG_ABSENT_ON_NULL;*
>  # Combine *JSON_OBJECTAGG_NULL_ON_NULL* and *JSON_OBJECTAGG_ABSENT_ON_NULL;*
>  # *SYMBOL* type Support for *JavaTypeFactoryImpl#getJavaClass*
>  This is to generate *Enum* java type for *SYMBOL*. Current version of 
> Calcite generates *Object[]*, which delivers type casting error, or some 
> method compatible problems {color:#33}when Calcite tries to pass enum 
> parameters to an aggregate function;{color}
>  # Add SQL-to-Rel test cases for JSON functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2670) Combine similar JSON aggregate functions in operator table

2018-11-15 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688817#comment-16688817
 ] 

Hongze Zhang edited comment on CALCITE-2670 at 11/16/18 6:58 AM:
-

bq. Why did you convert some parameters from particular types (e.g. 
SqlJsonQueryEmptyOrErrorBehavior) to untyped Enum?

Thanks [~julianhyde] and I missed explaining this point.

Everything works before because there were methods like 
SqlFunctions#jsonObjectAggAddNullOnNull that is for the specific operator like 
JSON_OBJECTAGG_NULL_ON_NULL, I have deleted these methods and operators in the 
patch.

In the patch, I have added SYMBOL-to-Enum type transform to 
JavaTypeFactoryImpl#getJavaClass. That is, the built-in rel implementer does 
not know the original type of a SYMBOL.
And rel implementer applies explicit type casting when translating SYMBOL 
parameter that is from an aggregate call (The enum flag is actually an 
*INPUT_REF* in aggregate call layer, so casting is needed), so if we transform 
SYMBOL to Enum in JavaTypeFactoryImpl#getJavaClass, we can only use Enum but 
not particular type for the related methods in SqlFunctions.java.

And for non-aggregate operators like *JSON_QUERY*, I believe the argument Type 
needs to be changed to *Enum* too. In RexToLixTranslator, an enum flag inside 
an JSON_QUERY rex call is actually showed as a *LOCAL_REF*, which is to be 
converted to ConstantExpression, which does not need a type casting when 
implementing the outer method call, so we see non errors even we define a 
paticular type in SqlFunctions.java. However this is not robust - any operation 
(e.g. a rule) to extract the enum flag to be a *INPUT_REF* will cause type 
compatible error.

Finally, SYMBOL-to-Enum type casting is a trade-off - if we use a 
JavaRecordType to carry a particular enum type instead of using BasicSqlType, 
we could use particular types in SqlFunctions.java. But this way needs more 
code changes, and JavaRecordType is still experimental and not recommended for 
use.



was (Author: zhztheplayer):
bq. Why did you convert some parameters from particular types (e.g. 
SqlJsonQueryEmptyOrErrorBehavior) to untyped Enum?

Thanks [~julianhyde] and I missed explaining this point.

In the patch, I have added SYMBOL-to-Enum type transform to 
JavaTypeFactoryImpl#getJavaClass. That is, the built-in rel implementer does 
not know the original type of a SYMBOL.
And rel implementer applies explicitly type casting when translating SYMBOL 
parameter that is from an aggregate call, so if we transform SYMBOL to Enum in 
JavaTypeFactoryImpl#getJavaClass, we can only use Enum for the related methods 
in SqlFunctions.java.

It works before because there were methods like 
SqlFunctions#jsonObjectAggAddNullOnNull that is for the specific operator like 
JSON_OBJECTAGG_NULL_ON_NULL, I have deleted these methods in the patch.

And SYMBOL-to-Enum type casting is a trade-off - If we use a JavaRecordType to 
carry a particular enum type instead of using BasicSqlType, we could use 
particular types in SqlFunctions.java. But this way needs more code changes, 
and JavaRecordType is still experimental and not recommended for use.


> Combine similar JSON aggregate functions in operator table
> --
>
> Key: CALCITE-2670
> URL: https://issues.apache.org/jira/browse/CALCITE-2670
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Hongze Zhang
>Assignee: Julian Hyde
>Priority: Major
>
> In current code master of Calcite, operator *JSON_ARRAYAGG* and 
> *JSON_OBJECTAGG* are separated to *JSON_ARRAYAGG_NULL_ON_NULL*, 
> *JSON_ARRAYAGG_ABSENT_ON_NULL*, *JSON_OBJECTAGG_NULL_ON_NULL* and 
> *JSON_OBJECTAGG_ABSENT_ON_NULL* in SqlStdOperatorTable.java. This is OK for 
> now, but may deliver more difficulty on extending syntax of *JSON_ARRAYAGG* 
> and *JSON_OBJECTAGG* and on implementing logical plan by adapters.
> This is basically to combine the 4 operators to 2, this is a improvement list:
>  # Combine *JSON_ARRAYAGG_NULL_ON_NULL* and *JSON_ARRAYAGG_ABSENT_ON_NULL;*
>  # Combine *JSON_OBJECTAGG_NULL_ON_NULL* and *JSON_OBJECTAGG_ABSENT_ON_NULL;*
>  # *SYMBOL* type Support for *JavaTypeFactoryImpl#getJavaClass*
>  This is to generate *Enum* java type for *SYMBOL*. Current version of 
> Calcite generates *Object[]*, which delivers type casting error, or some 
> method compatible problems {color:#33}when Calcite tries to pass enum 
> parameters to an aggregate function;{color}
>  # Add SQL-to-Rel test cases for JSON functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2670) Combine similar JSON aggregate functions in operator table

2018-11-15 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688817#comment-16688817
 ] 

Hongze Zhang edited comment on CALCITE-2670 at 11/16/18 12:06 AM:
--

bq. Why did you convert some parameters from particular types (e.g. 
SqlJsonQueryEmptyOrErrorBehavior) to untyped Enum?

Thanks [~julianhyde] and I missed explaining this point.

In the patch, I have added SYMBOL-to-Enum type transform to 
JavaTypeFactoryImpl#getJavaClass. That is, the built-in rel implementer does 
not know the original type of a SYMBOL.
And rel implementer applies explicitly type casting when translating SYMBOL 
parameter that is from an aggregate call, so if we transform SYMBOL to Enum in 
JavaTypeFactoryImpl#getJavaClass, we can only use Enum for the related methods 
in SqlFunctions.java.

It works before because there were methods like 
SqlFunctions#jsonObjectAggAddNullOnNull that is for the specific operator like 
JSON_OBJECTAGG_NULL_ON_NULL, I have deleted these methods in the patch.

And SYMBOL-to-Enum type casting is a trade-off - If we use a JavaRecordType to 
carry a particular enum type instead of using BasicSqlType, we could use 
particular types in SqlFunctions.java. But this way needs more code changes, 
and JavaRecordType is still experimental and not recommended for use.



was (Author: zhztheplayer):
bq. Why did you convert some parameters from particular types (e.g. 
SqlJsonQueryEmptyOrErrorBehavior) to untyped Enum?

Thanks [~julianhyde] and I missed explaining this point.

In the patch, I have added SYMBOL-to-Enum type transform to 
JavaTypeFactoryImpl#getJavaClass. That is, the built-in rel implementer does 
not know the original type of a SYMBOL.
And rel implementer applies explicitly type casting when translating SYMBOL 
parameter that is from an aggregate call, so if we transform SYMBOL to Enum in 
JavaTypeFactoryImpl#getJavaClass, we can only use Enum for the related methods 
in SqlFunctions.java.

It works before because there were methods like 
SqlFunctions#jsonObjectAggAddNullOnNull that is for the specific operator like 
JSON_OBJECTAGG_NULL_ON_NULL, I have deleted these methods in the patch.

And SYMBOL-to-Enum type casting is a trade-off - If we use a JavaRecordType to 
carry a particular enum type instead of using RelDataType, we could use 
particular types in SqlFunctions.java. But this way needs more code changes, 
and JavaRecordType is still experimental and not recommended for use.


> Combine similar JSON aggregate functions in operator table
> --
>
> Key: CALCITE-2670
> URL: https://issues.apache.org/jira/browse/CALCITE-2670
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Hongze Zhang
>Assignee: Julian Hyde
>Priority: Major
>
> In current code master of Calcite, operator *JSON_ARRAYAGG* and 
> *JSON_OBJECTAGG* are separated to *JSON_ARRAYAGG_NULL_ON_NULL*, 
> *JSON_ARRAYAGG_ABSENT_ON_NULL*, *JSON_OBJECTAGG_NULL_ON_NULL* and 
> *JSON_OBJECTAGG_ABSENT_ON_NULL* in SqlStdOperatorTable.java. This is OK for 
> now, but may deliver more difficulty on extending syntax of *JSON_ARRAYAGG* 
> and *JSON_OBJECTAGG* and on implementing logical plan by adapters.
> This is basically to combine the 4 operators to 2, this is a improvement list:
>  # Combine *JSON_ARRAYAGG_NULL_ON_NULL* and *JSON_ARRAYAGG_ABSENT_ON_NULL;*
>  # Combine *JSON_OBJECTAGG_NULL_ON_NULL* and *JSON_OBJECTAGG_ABSENT_ON_NULL;*
>  # *SYMBOL* type Support for *JavaTypeFactoryImpl#getJavaClass*
>  This is to generate *Enum* java type for *SYMBOL*. Current version of 
> Calcite generates *Object[]*, which delivers type casting error, or some 
> method compatible problems {color:#33}when Calcite tries to pass enum 
> parameters to an aggregate function;{color}
>  # Add SQL-to-Rel test cases for JSON functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)