[
https://issues.apache.org/jira/browse/CALCITE-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17742225#comment-17742225
]
pengfei.zhan edited comment on CALCITE-5838 at 7/12/23 1:27 AM:
----------------------------------------------------------------
Yes, this is a solution to the problem.
However, in most scenarios this algorithm from Calcite does not look very elegant:
{code:java}
int dout = Math.min(p1 - s1 + s2, maxNumericPrecision);
int scale = Math.max(6, s1 + p2 + 1);
scale = Math.min(scale, maxNumericPrecision - dout);
scale = Math.min(scale, getMaxNumericScale());
{code}
In most cases, if p1 = 38 and s1 < s2, then dout equals maxNumericPrecision, which
for Hive is 38. The step `scale = Math.min(scale, maxNumericPrecision - dout);`
then always produces a scale of 0, and `scale = Math.min(scale,
getMaxNumericScale());` leaves the final scale at 0. For a decimal data type,
losing the scale entirely seems unacceptable.
Of course, if the integer parts of the operands involved in the computation are
indeed very large, then the final computation will inevitably overflow; the
algorithms provided by Hive and MS SQL Server run into the same awkward problem
in that situation.
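To make the collapse concrete, here is a minimal standalone sketch (not Calcite's actual class; the constants and method name are illustrative) that applies the quoted four steps to the types from this issue's query, where CRR is decimal(38,2) and the literal 1000000000.02 is decimal(12,2):

```java
// Illustrative sketch of the quoted decimal-division type derivation.
// MAX_NUMERIC_PRECISION / MAX_NUMERIC_SCALE stand in for the type-system
// limits (38 for Hive); getMaxNumericScale() is modeled as a constant.
public class DecimalDivideType {
    static final int MAX_NUMERIC_PRECISION = 38;
    static final int MAX_NUMERIC_SCALE = 38; // assumed limit for illustration

    /** Derives the result scale for decimal(p1,s1) / decimal(p2,s2). */
    static int deriveScale(int p1, int s1, int p2, int s2) {
        int dout = Math.min(p1 - s1 + s2, MAX_NUMERIC_PRECISION);
        int scale = Math.max(6, s1 + p2 + 1);
        scale = Math.min(scale, MAX_NUMERIC_PRECISION - dout);
        scale = Math.min(scale, MAX_NUMERIC_SCALE);
        return scale;
    }

    public static void main(String[] args) {
        // CRR is decimal(38,2); 1000000000.02 is decimal(12,2).
        // dout = min(38 - 2 + 2, 38) = 38, so maxNumericPrecision - dout = 0
        // and the scale collapses to 0 regardless of the max(6, ...) step.
        System.out.println(deriveScale(38, 2, 12, 2)); // prints 0
    }
}
```

Because p1 is already at the precision cap, dout saturates at 38 and the min() against (38 - dout) wipes out the scale, which is exactly the decimal(38, 0) result reported below.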
> derive wrong decimal divide type
> --------------------------------
>
> Key: CALCITE-5838
> URL: https://issues.apache.org/jira/browse/CALCITE-5838
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.34.0
> Reporter: pengfei.zhan
> Priority: Major
>
> Given a table metadata structure as follows:
> {code:sql}
> create table `tbl_test`(`CRR` decimal(38,2), `DT` varchar(4096)) ROW FORMAT
> DELIMITED FIELDS TERMINATED BY '\t';
> {code}
> If the following query is executed,
> {code:sql}
> select sum(CRR_0/1000000000.02) from (select sum(CRR) as CRR_0 from tbl_test
> where DT='20230705') T1
> {code}
> The RexCall CRR_0/1000000000.02 derives a decimal type of decimal(38, 0). I
> compared this with the type derived by Spark, which gives decimal(38, 4). That
> inferred type looks more reasonable.
> [hive doc |
> https://cwiki.apache.org/confluence/download/attachments/27362075/Hive_Decimal_Precision_Scale_Support.pdf]
> [sql server doc | https://msdn.microsoft.com/en-us/library/ms190476.aspx]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)