[jira] [Commented] (HIVE-14568) Hive Decimal Returns NULL

2016-09-19 Thread Akhil Chalamalasetty (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15503847#comment-15503847
 ] 

Akhil Chalamalasetty commented on HIVE-14568:
-

Thanks Zhang. We will work around this issue by casting the column to a lower 
precision and scale. 
Since we have a few developers migrating from Oracle and PostgreSQL, we 
thought this could be a feature request to ease the use of Hive. Please let 
us know if there is a way to introduce such a mode in Hive and whether it 
would have any performance impact once implemented.

Regards,
Akhil

> Hive Decimal Returns NULL
> -
>
> Key: HIVE-14568
> URL: https://issues.apache.org/jira/browse/HIVE-14568
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.0.0, 1.2.0
> Environment: Centos 6.7, Hadoop 2.7.2,hive 1.0.0,2.0
> Reporter: gurmukh singh
> Assignee: Xuefu Zhang
>
> Hi
> I was under the impression that the bug 
> https://issues.apache.org/jira/browse/HIVE-5022 had been fixed, but I see 
> the same issue in Hive 1.0 and Hive 1.2 as well.
> hive> desc mul_table;
> OK
> prc   decimal(38,28)
> vol   decimal(38,10)
> Time taken: 0.068 seconds, Fetched: 2 row(s)
> hive> select prc, vol, prc*vol as cost from mul_table;
> OK
> 1.2   200 NULL
> 1.44  200 NULL
> 2.14  100 NULL
> 3.004 50  NULL
> 1.2   200 NULL
> Time taken: 0.048 seconds, Fetched: 5 row(s)
> Rather than returning NULL, it should raise an error or round off.
> I understand that I can use double instead of decimal, or cast the result, 
> but returning NULL will still let many problems go unnoticed.
> hive> desc mul_table2;
> OK
> prc   double
> vol   decimal(14,10)
> Time taken: 0.049 seconds, Fetched: 2 row(s)
> hive> select * from mul_table2;
> OK
> 1.4             200
> 1.34            200
> 7.34            100
> 7454533.354544  100
> Time taken: 0.028 seconds, Fetched: 4 row(s)
> hive> select prc, vol, prc*vol  as cost from mul_table3;
> OK
> 7.34            100   734.0
> 7.34            1000  7340.0
> 1.0004          1000  1000.4
> 7454533.354544  100   7.454533354544E8   <- Wrong result
> 7454533.354544  1000  7.454533354544E9   <- Wrong result
> Time taken: 0.025 seconds, Fetched: 5 row(s)
> Casting:
> hive> select prc, vol, cast(prc*vol as decimal(38,38)) as cost from 
> mul_table3;
> OK
> 7.34            100   NULL
> 7.34            1000  NULL
> 1.0004          1000  NULL
> 7454533.354544  100   NULL
> 7454533.354544  1000  NULL
> Time taken: 0.033 seconds, Fetched: 5 row(s)
> hive> select prc, vol, cast(prc*vol as decimal(38,10)) as cost from 
> mul_table3;
> OK
> 7.34            100   734
> 7.34            1000  7340
> 1.0004          1000  1000.4
> 7454533.354544  100   745453335.4544
> 7454533.354544  1000  7454533354.544
> Time taken: 0.026 seconds, Fetched: 5 row(s) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14568) Hive Decimal Returns NULL

2016-08-31 Thread gurmukh singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15454201#comment-15454201
 ] 

gurmukh singh commented on HIVE-14568:
--

Thanks Xuefu Zhang 



[jira] [Commented] (HIVE-14568) Hive Decimal Returns NULL

2016-08-31 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15453811#comment-15453811
 ] 

Xuefu Zhang commented on HIVE-14568:


I think this is mostly by design. You have two columns: decimal(p1, s1) and 
decimal(p2, s2). We need to statically derive the type for the product of the 
two columns based on s = s1 + s2 and p = p1 + p2 + 1. Since s1 = 28 and s2 = 10 
in your case, s = 38. Similarly, p = 38 (which is the max). Thus, the result 
column has the type decimal(38, 38). This basically means that the result 
cannot have any integer part. On the other hand, if the result type were set 
to (38, 18), I could certainly construct example data showing that the product 
of the two columns loses the scale that I was expecting.
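
To make the arithmetic concrete, here is a minimal Python sketch of the rule 
described above. It is an illustration only, not Hive's implementation: the 
names product_type and fits are made up for this example, and capping both 
precision and scale at 38 is an assumption.

```python
from decimal import Decimal

MAX_PRECISION = 38  # maximum decimal precision mentioned in this thread


def product_type(p1, s1, p2, s2):
    """Derive (precision, scale) for decimal(p1,s1) * decimal(p2,s2).

    Follows s = s1 + s2 and p = p1 + p2 + 1, with both values capped
    at MAX_PRECISION (the cap on scale is an assumption of this sketch).
    """
    scale = min(s1 + s2, MAX_PRECISION)
    precision = min(p1 + p2 + 1, MAX_PRECISION)
    return precision, scale


def fits(value, precision, scale):
    """Check whether the integer part of value fits in decimal(precision, scale)."""
    integer_part = int(abs(value))
    integer_digits = len(str(integer_part)) if integer_part != 0 else 0
    return integer_digits <= precision - scale


# The columns from the report: prc decimal(38,28), vol decimal(38,10).
p, s = product_type(38, 28, 38, 10)
print(p, s)  # 38 38: no digits left for the integer part

cost = Decimal("1.2") * Decimal("200")  # 240.0
print(fits(cost, p, s))    # False: 240 needs three integer digits
print(fits(cost, 38, 10))  # True: decimal(38,10) leaves 28 integer digits
```

With both inputs declared at precision 38, the derived type has no room for 
an integer part, so a product such as 240 cannot be represented in it.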

I understand that NULL may have been surprising to people. However, I wonder 
why a column defined as decimal(38,28) is being used to store data like 1.2, 
1.44, etc. Would it be reasonable to use a smaller precision/scale?

This sounds like a data modeling issue. The metadata needs to closely describe 
the data.

It's a good point that an ERROR here might be better so that NULL doesn't slip 
in unnoticed. I believe that MySQL has a strict mode which, when on, generates 
an error in this case. We don't have such a mode defined in Hive, but it may 
make sense to introduce one.
