[jira] [Updated] (HIVE-26082) Upgrade DataNucleus dependency to 5.2.8

2022-06-06 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-26082:
-
Issue Type: Bug  (was: Task)

> Upgrade DataNucleus dependency to 5.2.8
> ---
>
> Key: HIVE-26082
> URL: https://issues.apache.org/jira/browse/HIVE-26082
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Upgrade 
datanucleus-api-jdo 5.2.4 to 5.2.8
> datanucleus-core 5.2.4 to 5.2.10
> datanucleus-rdbms 5.2.4 to 5.2.10



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26082) Upgrade DataNucleus dependency to 5.2.8

2022-06-06 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-26082.
--
Resolution: Duplicate

> Upgrade DataNucleus dependency to 5.2.8
> ---
>
> Key: HIVE-26082
> URL: https://issues.apache.org/jira/browse/HIVE-26082
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Upgrade 
> datanucleus-api-jdo 5.2.4 to 5.2.8
> datanucleus-core 5.2.4 to 5.2.10
> datanucleus-rdbms 5.2.4 to 5.2.10



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work started] (HIVE-26082) Upgrade DataNucleus dependency to 5.2.8

2022-03-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-26082 started by Ashish Sharma.

> Upgrade DataNucleus dependency to 5.2.8
> ---
>
> Key: HIVE-26082
> URL: https://issues.apache.org/jira/browse/HIVE-26082
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade 
> datanucleus-api-jdo 5.2.4 to 5.2.8
> datanucleus-core 5.2.4 to 5.2.10
> datanucleus-rdbms 5.2.4 to 5.2.10



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26082) Upgrade DataNucleus dependency to 5.2.8

2022-03-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-26082:
-
Priority: Minor  (was: Major)

> Upgrade DataNucleus dependency to 5.2.8
> ---
>
> Key: HIVE-26082
> URL: https://issues.apache.org/jira/browse/HIVE-26082
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>
> Upgrade 
> datanucleus-api-jdo 5.2.4 to 5.2.8
> datanucleus-core 5.2.4 to 5.2.10
> datanucleus-rdbms 5.2.4 to 5.2.10



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26082) Upgrade DataNucleus dependency to 5.2.8

2022-03-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-26082:
-
Description: 
Upgrade 

datanucleus-api-jdo 5.2.4 to 5.2.8
datanucleus-core 5.2.4 to 5.2.10
datanucleus-rdbms 5.2.4 to 5.2.10

  was:Upgrade 5.2.4 to 5.2.6


> Upgrade DataNucleus dependency to 5.2.8
> ---
>
> Key: HIVE-26082
> URL: https://issues.apache.org/jira/browse/HIVE-26082
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade 
> datanucleus-api-jdo 5.2.4 to 5.2.8
> datanucleus-core 5.2.4 to 5.2.10
> datanucleus-rdbms 5.2.4 to 5.2.10



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26082) Upgrade DataNucleus dependency to 5.2.8

2022-03-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-26082:
-
Summary: Upgrade DataNucleus dependency to 5.2.8  (was: Upgrade DataNucleus 
dependency to 5.2.6)

> Upgrade DataNucleus dependency to 5.2.8
> ---
>
> Key: HIVE-26082
> URL: https://issues.apache.org/jira/browse/HIVE-26082
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade 5.2.4 to 5.2.6



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-26082) Upgrade DataNucleus dependency to 5.2.6

2022-03-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-26082:



> Upgrade DataNucleus dependency to 5.2.6
> ---
>
> Key: HIVE-26082
> URL: https://issues.apache.org/jira/browse/HIVE-26082
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade 5.2.4 to 5.2.6



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-25516) ITestDbTxnManager is broken after HIVE-24120

2022-03-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-25516.
--
Resolution: Resolved

> ITestDbTxnManager is broken after HIVE-24120
> 
>
> Key: HIVE-25516
> URL: https://issues.apache.org/jira/browse/HIVE-25516
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work started] (HIVE-26081) Upgrade ant to 1.10.9

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-26081 started by Ashish Sharma.

> Upgrade ant to 1.10.9
> -
>
> Key: HIVE-26081
> URL: https://issues.apache.org/jira/browse/HIVE-26081
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Upgrade org.apache.ant:ant from 1.9.1 to 1.10.9 to fix the vulnerability 
> CVE-2020-11979



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-26081) Upgrade ant to 1.10.9

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-26081:



> Upgrade ant to 1.10.9
> -
>
> Key: HIVE-26081
> URL: https://issues.apache.org/jira/browse/HIVE-26081
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade org.apache.ant:ant from 1.9.1 to 1.10.9 to fix the vulnerability 
> CVE-2020-11979



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work started] (HIVE-26080) Upgrade accumulo-core to 1.10.1

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-26080 started by Ashish Sharma.

> Upgrade accumulo-core to 1.10.1
> ---
>
> Key: HIVE-26080
> URL: https://issues.apache.org/jira/browse/HIVE-26080
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade org.apache.accumulo:accumulo-core from 1.7.0 to 1.10.1 to fix the 
> vulnerability CVE-2020-17533



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-26080) Upgrade accumulo-core to 1.10.1

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-26080:



> Upgrade accumulo-core to 1.10.1
> ---
>
> Key: HIVE-26080
> URL: https://issues.apache.org/jira/browse/HIVE-26080
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade org.apache.accumulo:accumulo-core from 1.7.0 to 1.10.1 to fix the 
> vulnerability CVE-2020-17533



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26079) Upgrade protobuf to 3.16.1

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-26079:
-
Description: Upgrade com.google.protobuf:protobuf-java from 2.5.0 to 3.16.1 
to fix CVE-2021-22569  (was: Upgrade com.google.protobuf:protobuf-java to 
3.16.1 to fix CVE-2021-22569)

> Upgrade protobuf to 3.16.1
> --
>
> Key: HIVE-26079
> URL: https://issues.apache.org/jira/browse/HIVE-26079
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade com.google.protobuf:protobuf-java from 2.5.0 to 3.16.1 to fix 
> CVE-2021-22569



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work started] (HIVE-26079) Upgrade protobuf to 3.16.1

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-26079 started by Ashish Sharma.

> Upgrade protobuf to 3.16.1
> --
>
> Key: HIVE-26079
> URL: https://issues.apache.org/jira/browse/HIVE-26079
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade com.google.protobuf:protobuf-java from 2.5.0 to 3.16.1 to fix 
> CVE-2021-22569



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26079) Upgrade protobuf to 3.16.1

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-26079:
-
Description: Upgrade com.google.protobuf:protobuf-java to 3.16.1 to fix 
CVE-2021-22569  (was: Upgrade com.google.protobuf:protobuf-java from 2.5.0 to 
3.16.1 to fix CVE-2021-22569)

> Upgrade protobuf to 3.16.1
> --
>
> Key: HIVE-26079
> URL: https://issues.apache.org/jira/browse/HIVE-26079
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade com.google.protobuf:protobuf-java to 3.16.1 to fix CVE-2021-22569



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-26079) Upgrade protobuf to 3.16.1

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-26079:



> Upgrade protobuf to 3.16.1
> --
>
> Key: HIVE-26079
> URL: https://issues.apache.org/jira/browse/HIVE-26079
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade com.google.protobuf:protobuf-java from 2.5.0 to 3.16.1 to fix 
> CVE-2021-22569



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work started] (HIVE-26078) Upgrade gson to 2.8.9

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-26078 started by Ashish Sharma.

> Upgrade gson to 2.8.9
> -
>
> Key: HIVE-26078
> URL: https://issues.apache.org/jira/browse/HIVE-26078
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade to version com.google.code.gson:gson:2.8.9 to avoid WS-2021-0419



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-26078) Upgrade gson to 2.8.9

2022-03-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-26078:



> Upgrade gson to 2.8.9
> -
>
> Key: HIVE-26078
> URL: https://issues.apache.org/jira/browse/HIVE-26078
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Upgrade to version com.google.code.gson:gson:2.8.9 to avoid WS-2021-0419



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25446) Wrong execption thrown if capacity<=0

2022-03-10 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504745#comment-17504745
 ] 

Ashish Sharma commented on HIVE-25446:
--

The exception is generated when the code tries to find nextPowerOfTwo() for values 
greater than 1073741824 (2^30), where the doubling overflows to a negative number. That 
problem was solved as part of https://issues.apache.org/jira/browse/HIVE-25583 . I am 
correcting the exception so that it first checks for <= 0 and then checks for power of 
two, as sketched below.
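
For illustration, a minimal sketch of the corrected check order (hypothetical code, not 
the actual Hive patch; the class and helper names here are my own):

public class CapacityCheck {

    static void validateCapacity(long capacity) {
        // Check non-positive values first, so an int overflow that produced a
        // negative capacity is reported as a size problem, not as a
        // "power of two" problem.
        if (capacity <= 0) {
            throw new IllegalArgumentException("Capacity must be greater than 0: " + capacity);
        }
        // A positive power of two has exactly one bit set.
        if ((capacity & (capacity - 1)) != 0) {
            throw new IllegalArgumentException("Capacity must be a power of two: " + capacity);
        }
    }

    // Doubling past 2^30 overflows int to a negative value, which is how the
    // old check could mis-report an oversized capacity.
    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n && p > 0) {
            p <<= 1; // wraps to Integer.MIN_VALUE beyond 2^30
        }
        return p;
    }

    public static void main(String[] args) {
        int cap = nextPowerOfTwo(1073741824 + 1);
        System.out.println(cap); // -2147483648: the doubling overflowed
        try {
            validateCapacity(cap);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // reports "greater than 0", not "power of two"
        }
    }
}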

> Wrong execption thrown if capacity<=0
> -
>
> Key: HIVE-25446
> URL: https://issues.apache.org/jira/browse/HIVE-25446
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Matt McCline
>Assignee: Ashish Sharma
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Encountered this in a very large query:
> Caused by: java.lang.AssertionError: Capacity must be a power of two
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25446) Wrong execption thrown if capacity<=0

2022-03-10 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25446:
-
Summary: Wrong execption thrown if capacity<=0  (was: Wrong Execption 
thrown if capacity<=0)

> Wrong execption thrown if capacity<=0
> -
>
> Key: HIVE-25446
> URL: https://issues.apache.org/jira/browse/HIVE-25446
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Matt McCline
>Assignee: Ashish Sharma
>Priority: Trivial
> Fix For: 4.0.0
>
>
> Encountered this in a very large query:
> Caused by: java.lang.AssertionError: Capacity must be a power of two
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25446) Wrong Execption thrown if capacity<=0

2022-03-10 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25446:
-
Summary: Wrong Execption thrown if capacity<=0  (was: Wrong Execption 
thrown if capacity <= 0)

> Wrong Execption thrown if capacity<=0
> -
>
> Key: HIVE-25446
> URL: https://issues.apache.org/jira/browse/HIVE-25446
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Matt McCline
>Assignee: Ashish Sharma
>Priority: Trivial
> Fix For: 4.0.0
>
>
> Encountered this in a very large query:
> Caused by: java.lang.AssertionError: Capacity must be a power of two
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25446) Wrong Execption thrown if capacity <= 0

2022-03-10 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25446:
-
Summary: Wrong Execption thrown if capacity <= 0  (was: 
VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be a 
power of two)

> Wrong Execption thrown if capacity <= 0
> ---
>
> Key: HIVE-25446
> URL: https://issues.apache.org/jira/browse/HIVE-25446
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Matt McCline
>Assignee: Ashish Sharma
>Priority: Trivial
> Fix For: 4.0.0
>
>
> Encountered this in a very large query:
> Caused by: java.lang.AssertionError: Capacity must be a power of two
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25446) VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be a power of two

2022-03-10 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25446:
-
Parent: HIVE-24037
Issue Type: Sub-task  (was: Task)

> VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be 
> a power of two
> ---
>
> Key: HIVE-25446
> URL: https://issues.apache.org/jira/browse/HIVE-25446
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Matt McCline
>Assignee: Ashish Sharma
>Priority: Trivial
> Fix For: 4.0.0
>
>
> Encountered this in a very large query:
> Caused by: java.lang.AssertionError: Capacity must be a power of two
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work started] (HIVE-25446) VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be a power of two

2022-03-10 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25446 started by Ashish Sharma.

> VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be 
> a power of two
> ---
>
> Key: HIVE-25446
> URL: https://issues.apache.org/jira/browse/HIVE-25446
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.1
>Reporter: Matt McCline
>Assignee: Ashish Sharma
>Priority: Trivial
> Fix For: 4.0.0
>
>
> Encountered this in a very large query:
> Caused by: java.lang.AssertionError: Capacity must be a power of two
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25446) VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be a power of two

2022-03-10 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25446:
-
Issue Type: Task  (was: Bug)

> VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be 
> a power of two
> ---
>
> Key: HIVE-25446
> URL: https://issues.apache.org/jira/browse/HIVE-25446
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.1
>Reporter: Matt McCline
>Assignee: Ashish Sharma
>Priority: Major
> Fix For: 4.0.0
>
>
> Encountered this in a very large query:
> Caused by: java.lang.AssertionError: Capacity must be a power of two
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25446) VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be a power of two

2022-03-10 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25446:
-
Priority: Trivial  (was: Major)

> VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be 
> a power of two
> ---
>
> Key: HIVE-25446
> URL: https://issues.apache.org/jira/browse/HIVE-25446
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.1
>Reporter: Matt McCline
>Assignee: Ashish Sharma
>Priority: Trivial
> Fix For: 4.0.0
>
>
> Encountered this in a very large query:
> Caused by: java.lang.AssertionError: Capacity must be a power of two
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25446) VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be a power of two

2022-03-09 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25446:


Assignee: Ashish Sharma  (was: Matt McCline)

> VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be 
> a power of two
> ---
>
> Key: HIVE-25446
> URL: https://issues.apache.org/jira/browse/HIVE-25446
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Matt McCline
>Assignee: Ashish Sharma
>Priority: Major
> Fix For: 4.0.0
>
>
> Encountered this in a very large query:
> Caused by: java.lang.AssertionError: Capacity must be a power of two
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
>    at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
>    at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
>    at 
> org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25985) Estimate stats gives out incorrect number of columns during query planning when using predicates like c=22

2022-02-26 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25985:


Assignee: Ashish Sharma

> Estimate stats gives out incorrect number of columns during query planning 
> when using predicates like c=22
> --
>
> Key: HIVE-25985
> URL: https://issues.apache.org/jira/browse/HIVE-25985
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
> Environment: Hive 3
>Reporter: Sindhu Subhas
>Assignee: Ashish Sharma
>Priority: Major
>
> Table type: External 
> Stats: No stats collected.
> When a Filter operator appeared in the plan, the row estimates went bad. I changed 
> the original query on the table by modifying the form of the filter predicate:
>  
> |*predicate form*|*optimised as*|*filter Op rows out*|*estimate quality*|
> |prd_i_tmp.type = '22'|predicate:(type = '22')|Filter Operator [FIL_12] (rows=5 width=3707)|bad|
> |prd_i_tmp.type in ('22')|predicate:(type = '22')|Filter Operator [FIL_12] (rows=5 width=3707)|bad|
> |prd_i_tmp.type < '23' and prd_i_tmp.type > '21'|predicate:((type < '23') and (type > '21'))|Filter Operator [FIL_12] (rows=8706269 width=3707)|good|
> |prd_i_tmp.type like '22'|predicate:(type like '22')|Filter Operator [FIL_12] (rows=39178213 width=3707)|best|
> |prd_i_tmp.type in ('22','AA','BB')|predicate:(type) IN ('22', 'AA', 'BB')|Filter Operator [FIL_12] (rows=15 width=3707)|bad|
> |prd_i_tmp.type rlike '22'|predicate:type regexp '22'|Filter Operator [FIL_12] (rows=39178213 width=3707)|good|



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (HIVE-25653) Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types.

2021-12-20 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462512#comment-17462512
 ] 

Ashish Sharma edited comment on HIVE-25653 at 12/20/21, 11:28 AM:
--

[~zabetak] 

I have corrected the column data type from int to decimal in the example. 

*Code Snippet* 

public class MyClass {
    public static void main(String[] args) {
        // Each literal is only the closest double to 10230.72, so the sum drifts:
        System.out.println(10230.72 + 10230.72 + 10230.72);
    }
}

*Output* - 

30692.1596

*Expected* - 

30692.16


Because of the double/floating-point arithmetic accuracy issue in the Java 
language, the output of UDFs and UDAFs like STDDEV is affected; hence we get 
(*0.5940794514955821*) instead of (*0.0*).

But since other engines like MYSQL and POSTGRES return a similar value, we can 
ignore the arithmetic accuracy issue in UDFs for now.

I have created a revert request - https://github.com/apache/hive/pull/2897
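
For illustration, the drift above can be contrasted with exact decimal arithmetic (a 
minimal sketch in plain Java; BigDecimal here only demonstrates the cause and is not a 
proposed fix for the UDAFs):

import java.math.BigDecimal;

public class AccumulationDrift {
    public static void main(String[] args) {
        double doubleSum = 0.0;
        BigDecimal exactSum = BigDecimal.ZERO;
        BigDecimal term = new BigDecimal("10230.72");
        for (int i = 0; i < 3; i++) {
            // 10230.72 has no exact binary representation, so each double
            // addition accumulates a tiny rounding error.
            doubleSum += 10230.72;
            exactSum = exactSum.add(term);
        }
        System.out.println(doubleSum); // drifts away from 30692.16
        System.out.println(exactSum);  // exactly 30692.16
    }
}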


was (Author: ashish-kumar-sharma):
[~zabetak] 

*Code Snippet* 

public class MyClass {
    public static void main(String[] args) {
        System.out.println(10230.72 + 10230.72 + 10230.72);
    }
}

*Output* - 

30692.1596

*Expected* - 

30692.16


Because of the double/floating-point arithmetic accuracy issue in the Java 
language, the output of UDFs and UDAFs like STDDEV is affected; hence we get 
(*0.5940794514955821*) instead of (*0.0*).

But since other engines like MYSQL return the same value, we can ignore the 
arithmetic accuracy issue in UDFs for now.

I have created a revert request - https://github.com/apache/hive/pull/2897

> Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating 
> point data types.
> 
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 3.1.0, 3.1.2
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Description
> *Script*- 
> create table test ( col1 decimal );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> *Result*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 5.940794514955821E-13  5.42317860890711E-13  5.42317860890711E-13
> *Expected*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 0  0  0



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25653) Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types.

2021-12-20 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25653:
-
Description: 
Description

*Script*- 

create table test ( col1 decimal );

insert into test values 
('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');

select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
,STDDEV_POP(col1) as STDDEV_POP from test;

*Result*- 

STDDEV_SAMP  STDDEV  STDDEV_POP
5.940794514955821E-13  5.42317860890711E-13  5.42317860890711E-13

*Expected*- 

STDDEV_SAMP  STDDEV  STDDEV_POP
0  0  0

  was:
Description

*Script*- 

create table test ( col1 int );

insert into test values 
('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');

select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
,STDDEV_POP(col1) as STDDEV_POP from test;

*Result*- 

STDDEV_SAMP  STDDEV  STDDEV_POP
5.940794514955821E-13  5.42317860890711E-13  5.42317860890711E-13

*Expected*- 

STDDEV_SAMP  STDDEV  STDDEV_POP
0  0  0


> Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating 
> point data types.
> 
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 3.1.0, 3.1.2
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Description
> *Script*- 
> create table test ( col1 decimal );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> *Result*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 5.940794514955821E-13  5.42317860890711E-13  5.42317860890711E-13
> *Expected*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 0  0  0



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (HIVE-25653) Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types.

2021-12-20 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462512#comment-17462512
 ] 

Ashish Sharma edited comment on HIVE-25653 at 12/20/21, 11:19 AM:
--

[~zabetak] 

*Code Snippet* 

public class MyClass {
    public static void main(String[] args) {
        System.out.println(10230.72 + 10230.72 + 10230.72);
    }
}

*Output* - 

30692.1596

*Expected* - 

30692.16


Because of the double/floating-point arithmetic accuracy issue in the Java 
language, the output of UDFs and UDAFs like STDDEV is affected; hence we get 
(*0.5940794514955821*) instead of (*0.0*).

But since other engines like MYSQL return the same value, we can ignore the 
arithmetic accuracy issue in UDFs for now.

I have created a revert request - https://github.com/apache/hive/pull/2897


was (Author: ashish-kumar-sharma):
[~zabetak] You are right. I will raise a request to revert the commit.

I have created a revert request - https://github.com/apache/hive/pull/2897

> Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating 
> point data types.
> 
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 3.1.0, 3.1.2
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Description
> *Script*- 
> create table test ( col1 int );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> *Result*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 5.940794514955821E-13  5.42317860890711E-13  5.42317860890711E-13
> *Expected*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 0  0  0



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (HIVE-25653) Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types.

2021-12-20 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462512#comment-17462512
 ] 

Ashish Sharma edited comment on HIVE-25653 at 12/20/21, 10:40 AM:
--

[~zabetak] You are right. I will raise a request to revert the commit.

I have created a revert request - https://github.com/apache/hive/pull/2897


was (Author: ashish-kumar-sharma):
[~zabetak] You are right. I will raise a request to revert the commit.

> Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating 
> point data types.
> 
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 3.1.0, 3.1.2
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Description
> *Script*- 
> create table test ( col1 int );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> *Result*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 5.940794514955821E-13  5.42317860890711E-13  5.42317860890711E-13
> *Expected*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 0  0  0



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25653) Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types.

2021-12-20 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462512#comment-17462512
 ] 

Ashish Sharma commented on HIVE-25653:
--

[~zabetak] You are right. I will raise a request to revert the commit.

> Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating 
> point data types.
> 
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 3.1.0, 3.1.2
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Description
> *Script*- 
> create table test ( col1 int );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> *Result*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 5.940794514955821E-13  5.42317860890711E-13  5.42317860890711E-13
> *Expected*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 0  0  0



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25693) Vector implementation return Incorrect results for STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types.

2021-11-12 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25693:


Assignee: Ashish Sharma

> Vector implementation return Incorrect results for STDDEV, STDDEV_SAMP, 
> STDDEV_POP for floating point data types.
> -
>
> Key: HIVE-25693
> URL: https://issues.apache.org/jira/browse/HIVE-25693
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Description
> Script-
> create table test ( col1 int );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> Result-
> STDDEV_SAMP STDDEV STDDEV_POP
> 5.940794514955821E-13 5.42317860890711E-13 5.42317860890711E-13
> Expected-
> STDDEV_SAMP STDDEV STDDEV_POP
> 0 0 0



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work started] (HIVE-25693) Vector implementation return Incorrect results for STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types.

2021-11-12 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25693 started by Ashish Sharma.

> Vector implementation return Incorrect results for STDDEV, STDDEV_SAMP, 
> STDDEV_POP for floating point data types.
> -
>
> Key: HIVE-25693
> URL: https://issues.apache.org/jira/browse/HIVE-25693
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Description
> Script-
> create table test ( col1 int );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> Result-
> STDDEV_SAMP STDDEV STDDEV_POP
> 5.940794514955821E-13 5.42317860890711E-13 5.42317860890711E-13
> Expected-
> STDDEV_SAMP STDDEV STDDEV_POP
> 0 0 0



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25693) Vector implementation return Incorrect results for STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types.

2021-11-12 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25693:
-
Description: 
Description

Script-

create table test ( col1 int );

insert into test values 
('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');

select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
,STDDEV_POP(col1) as STDDEV_POP from test;

Result-

STDDEV_SAMP STDDEV STDDEV_POP

5.940794514955821E-13 5.42317860890711E-13 5.42317860890711E-13

Expected-

STDDEV_SAMP STDDEV STDDEV_POP
0 0 0
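
As one hedged illustration of where such tiny non-zero results can come from (this uses 
the textbook sum-of-squares formula and is not necessarily the formula the vectorized 
code uses), subtracting two nearly equal doubles leaves only rounding noise:

public class NaiveVariance {
    public static void main(String[] args) {
        // Seven identical values: the true variance is exactly 0.
        double x = 10230.72;
        int n = 7;
        double sum = 0.0, sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            sum += x;
            sumSq += x * x;
        }
        double mean = sum / n;
        // Naive formula: Var = E[x^2] - (E[x])^2. The two terms are each
        // about 1.05e8, so their difference is dominated by rounding error.
        double variance = sumSq / n - mean * mean;
        System.out.println(variance); // typically tiny noise, not exactly 0.0
        System.out.println(Math.sqrt(Math.abs(variance)));
    }
}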

> Vector implementation return Incorrect results for STDDEV, STDDEV_SAMP, 
> STDDEV_POP for floating point data types.
> -
>
> Key: HIVE-25693
> URL: https://issues.apache.org/jira/browse/HIVE-25693
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Priority: Major
>
> Description
> Script-
> create table test ( col1 int );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> Result-
> STDDEV_SAMP STDDEV STDDEV_POP
> 5.940794514955821E-13 5.42317860890711E-13 5.42317860890711E-13
> Expected-
> STDDEV_SAMP STDDEV STDDEV_POP
> 0 0 0



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25678) java.lang.ClassCastException in COALESCE()

2021-11-05 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25678:



> java.lang.ClassCastException in COALESCE()
> --
>
> Key: HIVE-25678
> URL: https://issues.apache.org/jira/browse/HIVE-25678
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
> cannot be cast to org.apache.hadoop.io.BooleanWritable
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableBooleanObjectInspector.get(WritableBooleanObjectInspector.java:36)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getBoolean(PrimitiveObjectInspectorUtils.java:514)
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$BooleanConverter.convert(PrimitiveObjectInspectorConverter.java:67)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ReturnObjectInspectorResolver.convertIfNecessary(GenericUDFUtils.java:247)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ReturnObjectInspectorResolver.convertIfNecessary(GenericUDFUtils.java:213)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFWhen.evaluate(GenericUDFWhen.java:105)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual.evaluate(GenericUDFOPEqual.java:114)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFWhen.evaluate(GenericUDFWhen.java:96)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFWhen.evaluate(GenericUDFWhen.java:96)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:65)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFWhen.evaluate(GenericUDFWhen.java:93)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorHead._evaluate(ExprNodeEvaluatorHead.java:44)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
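
For reference, the failing cast in the top frame can be reproduced in isolation (a 
minimal sketch against hadoop-common's Writable types; this is not the Hive code path 
itself, which goes through ObjectInspectors):

import org.apache.hadoop.io.BooleanWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Writable;

public class CastMismatch {
    public static void main(String[] args) {
        // A CASE/COALESCE whose branches resolve to different primitive
        // categories can hand an IntWritable to code expecting a boolean:
        Writable value = new IntWritable(1);
        BooleanWritable b = (BooleanWritable) value; // java.lang.ClassCastException
        System.out.println(b.get());
    }
}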



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-25653) Precision problem in STD, STDDDEV_SAMP,STDDEV_POP

2021-10-28 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25653 started by Ashish Sharma.

> Precision problem in STD, STDDDEV_SAMP,STDDEV_POP
> -
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Description
> *Script*- 
> create table test ( col1 int );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) AS STDDEV_6M , STDDEV(col1) as STDDEV 
> ,STDDEV_POP(col1) as STDDEV_POP from test;
> *Result*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 5.940794514955821E-13  5.42317860890711E-13  5.42317860890711E-13
> *Expected*- 
> STDDEV_SAMP  STDDEV  STDDEV_POP
> 0  0  0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25653) Precision problem in STD, STDDDEV_SAMP,STDDEV_POP

2021-10-27 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25653:
-
Description: 
Description

*Script *- 

create table test ( col1 int );

insert into test values 
('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');

select STDDEV_SAMP(col1) as STDDEV_SAMP, STDDEV(col1) as STDDEV, 
STDDEV_POP(col1) as STDDEV_POP from test;

*Result *- 

STDDEV_SAMP            STDDEV                 STDDEV_POP
5.940794514955821E-13  5.42317860890711E-13   5.42317860890711E-13

*Expected *- 

STDDEV_SAMP            STDDEV                 STDDEV_POP
0                      0                      0

  was:
Description




> Precision problem in STD, STDDEV_SAMP, STDDEV_POP
> -
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Description
> *Script *- 
> create table test ( col1 int );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) as STDDEV_SAMP, STDDEV(col1) as STDDEV, 
> STDDEV_POP(col1) as STDDEV_POP from test;
> *Result *- 
> STDDEV_SAMP            STDDEV                 STDDEV_POP
> 5.940794514955821E-13  5.42317860890711E-13   5.42317860890711E-13
> *Expected *- 
> STDDEV_SAMP            STDDEV                 STDDEV_POP
> 0                      0                      0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25653) Precision problem in STD, STDDEV_SAMP, STDDEV_POP

2021-10-27 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25653:
-
Description: 
Description

*Script*- 

create table test ( col1 int );

insert into test values 
('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');

select STDDEV_SAMP(col1) as STDDEV_SAMP, STDDEV(col1) as STDDEV, 
STDDEV_POP(col1) as STDDEV_POP from test;

*Result*- 

STDDEV_SAMP            STDDEV                 STDDEV_POP
5.940794514955821E-13  5.42317860890711E-13   5.42317860890711E-13

*Expected*- 

STDDEV_SAMP            STDDEV                 STDDEV_POP
0                      0                      0

  was:
Description

*Script *- 

create table test ( col1 int );

insert into test values 
('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');

select STDDEV_SAMP(col1) as STDDEV_SAMP, STDDEV(col1) as STDDEV, 
STDDEV_POP(col1) as STDDEV_POP from test;

*Result *- 

STDDEV_SAMP            STDDEV                 STDDEV_POP
5.940794514955821E-13  5.42317860890711E-13   5.42317860890711E-13

*Expected *- 

STDDEV_SAMP            STDDEV                 STDDEV_POP
0                      0                      0


> Precision problem in STD, STDDEV_SAMP, STDDEV_POP
> -
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Description
> *Script*- 
> create table test ( col1 int );
> insert into test values 
> ('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72'),('10230.72');
> select STDDEV_SAMP(col1) as STDDEV_SAMP, STDDEV(col1) as STDDEV, 
> STDDEV_POP(col1) as STDDEV_POP from test;
> *Result*- 
> STDDEV_SAMP            STDDEV                 STDDEV_POP
> 5.940794514955821E-13  5.42317860890711E-13   5.42317860890711E-13
> *Expected*- 
> STDDEV_SAMP            STDDEV                 STDDEV_POP
> 0                      0                      0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25653) Precision problem in STD, STDDEV_SAMP, STDDEV_POP

2021-10-27 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25653:



> Precision problem in STD, STDDEV_SAMP, STDDEV_POP
> -
>
> Key: HIVE-25653
> URL: https://issues.apache.org/jira/browse/HIVE-25653
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Description



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25576) Add config to parse date with older date format

2021-10-12 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Affects Version/s: 3.1.0
   3.0.0
   3.1.1
   3.1.2

> Add config to parse date with older date format
> ---
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.0.0, 3.1.1, 3.1.2, 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation notes that "Unfortunately, the API for these functions was not 
> amenable to internationalization" and that the corresponding methods in Date 
> are deprecated. Because of this, the legacy code produces wrong results.
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* - 
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* - 
> Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
> gives the correct result but is not backward compatible. This causes 
> problems when migrating to the new version, because older data written 
> with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> 1. *True* - use *SimpleDateFormat* 
> 2. *False* - use *DateTimeFormatter*
> Note: Apache Spark also faced the same issue - 
> https://issues.apache.org/jira/browse/SPARK-30668
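For reference, the two quoted implementation fragments can be assembled into a single runnable comparison. The full pattern 'yyyy-MM-dd HH:mm:ss z' and the harness around the fragments are assumptions (the archived mail truncates the pattern), and the outputs quoted in the comments are the ones this issue reports for the reporter's environment; the legacy rendering can vary with the JDK's historical time-zone data.

    import java.text.SimpleDateFormat;
    import java.time.ZoneId;
    import java.time.ZonedDateTime;
    import java.time.format.DateTimeFormatter;
    import java.time.format.DateTimeFormatterBuilder;
    import java.util.Date;
    import java.util.TimeZone;

    public class DateParserComparison {
        public static void main(String[] args) throws Exception {
            String textval = "1800-01-01 00:00:00 UTC";
            String pattern = "yyyy-MM-dd HH:mm:ss z";
            String timezone = "Asia/Bangkok";
            // "VM time zone set to Asia/Bangkok", per the description:
            TimeZone.setDefault(TimeZone.getTimeZone(timezone));

            // Hive 1.x/2.x path (java.util), as quoted in the description:
            SimpleDateFormat formatter = new SimpleDateFormat(pattern);
            Long unixtime = formatter.parse(textval).getTime() / 1000;
            Date date = new Date(unixtime * 1000L);
            // Reported as 1800-01-01 07:00:00 on Hive 1.2:
            System.out.println("legacy:    "
                    + new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(date));

            // Master-branch path (java.time), as quoted in the description:
            DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
                    .parseCaseInsensitive()
                    .appendPattern(pattern)
                    .toFormatter();
            ZonedDateTime zonedDateTime = ZonedDateTime.parse(textval, dtformatter)
                    .withZoneSameInstant(ZoneId.of(timezone));
            // Reported as 1800-01-01 06:42:04, Bangkok's local mean time
            // (UTC+06:42:04) in the tz database for the year 1800:
            System.out.println("java.time: " + zonedDateTime
                    .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")));
        }
    }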



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25576) Add config to parse date with older date format

2021-10-04 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Summary: Add config to parse date with older date format  (was: Add 
"hive.legacy.timeParserPolicy" config to parse date with older date fromat)

> Add config to parse date with older date format
> ---
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation notes that "Unfortunately, the API for these functions was not 
> amenable to internationalization" and that the corresponding methods in Date 
> are deprecated. Because of this, the legacy code produces wrong results.
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* - 
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* - 
> Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
> gives the correct result but is not backward compatible. This causes 
> problems when migrating to the new version, because older data written 
> with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> 1. *True* - use *SimpleDateFormat* 
> 2. *False* - use *DateTimeFormatter*
> Note: Apache Spark also faced the same issue - 
> https://issues.apache.org/jira/browse/SPARK-30668



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25576) Add "hive.legacy.timeParserPolicy" config to parse date with older date format

2021-10-04 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Summary: Add "hive.legacy.timeParserPolicy" config to parse date with older 
date format  (was: Raise exception instead of silent change for new 
DateTimeFormatter)

> Add "hive.legacy.timeParserPolicy" config to parse date with older date fromat
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation notes that "Unfortunately, the API for these functions was not 
> amenable to internationalization" and that the corresponding methods in Date 
> are deprecated. Because of this, the legacy code produces wrong results.
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* - 
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* - 
> Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
> gives the correct result but is not backward compatible. This causes 
> problems when migrating to the new version, because older data written 
> with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> 1. *True* - use *SimpleDateFormat* 
> 2. *False* - use *DateTimeFormatter*
> Note: Apache Spark also faced the same issue - 
> https://issues.apache.org/jira/browse/SPARK-30668



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-25576) Raise exception instead of silent change for new DateTimeFormatter

2021-10-04 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25576 started by Ashish Sharma.

> Raise exception instead of silent change for new DateTimeFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation notes that "Unfortunately, the API for these functions was not 
> amenable to internationalization" and that the corresponding methods in Date 
> are deprecated. Because of this, the legacy code produces wrong results.
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* - 
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* - 
> Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
> gives the correct result but is not backward compatible. This causes 
> problems when migrating to the new version, because older data written 
> with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> 1. *True* - use *SimpleDateFormat* 
> 2. *False* - use *DateTimeFormatter*
> Note: Apache Spark also faced the same issue - 
> https://issues.apache.org/jira/browse/SPARK-30668



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateTimeFormatter

2021-10-03 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Description: 
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 07:00:00

*Implementation details* - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

*Implementation details* - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
1. *True* - use *SimpleDateFormat* 
2. *False* - use *DateTimeFormatter*


Note: Apache Spark also faced the same issue - 
https://issues.apache.org/jira/browse/SPARK-30668



  was:
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 07:00:00

*Implementation details* - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

*Implementation details* - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
True - use *SimpleDateFormat* 
False - use *DateTimeFormatter*


Note: Apache Spark also faced the same issue - 
https://issues.apache.org/jira/browse/SPARK-30668




> Raise exception instead of silent change for new DateTimeFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . In official 
> documentation they have mention that "Unfortunately, the API for these 
> functions was not amenable to internationalization and The 

[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateTimeFormatter

2021-10-03 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Description: 
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 07:00:00

*Implementation details* - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

*Implementation details* - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
True - use *SimpleDateFormat* 
False - use *DateTimeFormatter*


Note: Apache Spark also faced the same issue - 
https://issues.apache.org/jira/browse/SPARK-30668



  was:
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 07:00:00

*Implementation details* - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

*Implementation details* - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*

Note: Apache Spark also faced the same issue - 
https://issues.apache.org/jira/browse/SPARK-30668




> Raise exception instead of silent change for new DateTimeFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = 

[jira] [Commented] (HIVE-25576) Raise exception instead of silent change for new DateTimeFormatter

2021-10-03 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423766#comment-17423766
 ] 

Ashish Sharma commented on HIVE-25576:
--

[~zabetak] Sure, I will change this config to a Boolean. 

> Raise exception instead of silent change for new DateTimeFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation notes that "Unfortunately, the API for these functions was not 
> amenable to internationalization" and that the corresponding methods in Date 
> are deprecated. Because of this, the legacy code produces wrong results.
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* - 
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* - 
> Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
> gives the correct result but is not backward compatible. This causes 
> problems when migrating to the new version, because older data written 
> with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> EXCEPTION - compare the values from both *SimpleDateFormat* and 
> *DateTimeFormatter* and raise an exception if they do not match 
> LEGACY - use *SimpleDateFormat* 
> CORRECTED - use *DateTimeFormatter*
> This will help Hive users in the following ways - 
> 1. Migrate to the new version using *LEGACY*
> 2. Find values which are not compatible with the new version - *EXCEPTION*
> 3. Use the latest date APIs - *CORRECTED*
> Note: Apache Spark also faced the same issue - 
> https://issues.apache.org/jira/browse/SPARK-30668



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25577) unix_timestamp() is ignoring the time zone value

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25577:
-
Issue Type: Bug  (was: Improvement)

> unix_timestamp() is ignoring the time zone value
> 
>
> Key: HIVE-25577
> URL: https://issues.apache.org/jira/browse/HIVE-25577
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>
> set hive.local.time.zone=Asia/Bangkok;
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('2000-01-07 00:00:00 
> GMT','yyyy-MM-dd HH:mm:ss z'));
> Result - 2000-01-07 00:00:00 ICT
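The result demonstrates the bug: the input was parsed as GMT, so rendering that instant in the session zone (ICT, UTC+7) should shift the wall clock to 07:00:00, yet the zone token from the input is being ignored. A minimal java.time sketch of the expected behaviour follows; the full pattern 'yyyy-MM-dd HH:mm:ss z' is an assumption, since the archived mail truncates it.

    import java.time.ZoneId;
    import java.time.ZonedDateTime;
    import java.time.format.DateTimeFormatter;

    public class ZoneTokenDemo {
        public static void main(String[] args) {
            DateTimeFormatter f = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss z");
            // Honour the GMT zone parsed from the text...
            ZonedDateTime parsed = ZonedDateTime.parse("2000-01-07 00:00:00 GMT", f);
            // ...then convert to the session time zone:
            ZonedDateTime inBangkok = parsed.withZoneSameInstant(ZoneId.of("Asia/Bangkok"));
            // Prints "2000-01-07 07:00:00 ICT" -- the shifted result one would
            // expect, instead of the unshifted 00:00:00 ICT that Hive returned.
            System.out.println(inBangkok.format(f));
        }
    }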



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-25577) unix_timestamp() is ignoring the time zone value

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25577 started by Ashish Sharma.

> unix_timestamp() is ignoring the time zone value
> 
>
> Key: HIVE-25577
> URL: https://issues.apache.org/jira/browse/HIVE-25577
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>
> set hive.local.time.zone=Asia/Bangkok;
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('2000-01-07 00:00:00 
> GMT','yyyy-MM-dd HH:mm:ss z'));
> Result - 2000-01-07 00:00:00 ICT



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateTimeFormatter

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Summary: Raise exception instead of silent change for new DateTimeFormatter 
 (was: Raise exception instead of silent change for new DateFormatter)

> Raise exception instead of silent change for new DateTimeFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation notes that "Unfortunately, the API for these functions was not 
> amenable to internationalization" and that the corresponding methods in Date 
> are deprecated. Because of this, the legacy code produces wrong results.
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* - 
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* - 
> Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
> gives the correct result but is not backward compatible. This causes 
> problems when migrating to the new version, because older data written 
> with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> EXCEPTION - compare the values from both *SimpleDateFormat* and 
> *DateTimeFormatter* and raise an exception if they do not match 
> LEGACY - use *SimpleDateFormat* 
> CORRECTED - use *DateTimeFormatter*
> This will help Hive users in the following ways - 
> 1. Migrate to the new version using *LEGACY*
> 2. Find values which are not compatible with the new version - *EXCEPTION*
> 3. Use the latest date APIs - *CORRECTED*
> Note: Apache Spark also faced the same issue - 
> https://issues.apache.org/jira/browse/SPARK-30668
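A sketch of how the proposed three-way policy could dispatch between the two parsers is shown below. Every name in it (the enum, the method, its arguments) is hypothetical; the issue only specifies the config values, not an implementation.

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.time.ZoneId;
    import java.time.ZonedDateTime;
    import java.time.format.DateTimeFormatter;

    public class TimeParserPolicyDemo {
        // Hypothetical names; the issue defines only the three config values.
        enum TimeParserPolicy { EXCEPTION, LEGACY, CORRECTED }

        static long parseToEpochSecond(String textval, String pattern, String zone,
                                       TimeParserPolicy policy) throws ParseException {
            long legacy = new SimpleDateFormat(pattern).parse(textval).getTime() / 1000;
            long corrected = ZonedDateTime
                    .parse(textval, DateTimeFormatter.ofPattern(pattern))
                    .withZoneSameInstant(ZoneId.of(zone))
                    .toInstant().getEpochSecond();
            switch (policy) {
                case LEGACY:    return legacy;
                case CORRECTED: return corrected;
                default:        // EXCEPTION: surface rows whose value would change
                    if (legacy != corrected) {
                        throw new IllegalStateException(
                                "legacy=" + legacy + " vs corrected=" + corrected);
                    }
                    return corrected;
            }
        }

        public static void main(String[] args) throws ParseException {
            System.out.println(parseToEpochSecond("1800-01-01 00:00:00 UTC",
                    "yyyy-MM-dd HH:mm:ss z", "Asia/Bangkok", TimeParserPolicy.LEGACY));
        }
    }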



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25577) unix_timestamp() is ignoring the time zone value

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25577:



> unix_timestamp() is ignoring the time zone value
> 
>
> Key: HIVE-25577
> URL: https://issues.apache.org/jira/browse/HIVE-25577
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>
> set hive.local.time.zone=Asia/Bangkok;
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('2000-01-07 00:00:00 
> GMT','yyyy-MM-dd HH:mm:ss z'));
> Result - 2000-01-07 00:00:00 ICT



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateFormatter

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Description: 
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 07:00:00

*Implementation details* - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*

Note: Apache Spark also faced the same issue - 
https://issues.apache.org/jira/browse/SPARK-30668



  was:
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query *- SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result *- 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*

Note: Apache Spark also faced the same issue - 
https://issues.apache.org/jira/browse/SPARK-30668




> Raise exception instead of silent change for new DateFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT 

[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateFormatter

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Description: 
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 07:00:00

*Implementation details* - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

*Implementation details* - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*

Note: Apache Spark also faced the same issue - 
https://issues.apache.org/jira/browse/SPARK-30668



  was:
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 07:00:00

*Implementation details* - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*

Note: Apache Spark also faced the same issue - 
https://issues.apache.org/jira/browse/SPARK-30668




> Raise exception instead of silent change for new DateFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT 

[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateFormatter

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Description: 
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query *- SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result *- 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*

Note: Apache Spark also faced the same issue - 
https://issues.apache.org/jira/browse/SPARK-30668



  was:
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query *- SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result *- 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*




> Raise exception instead of silent change for new DateFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> Result - 1800-01-01 07:00:00
> implementation 

[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateFormatter

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Description: 
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query *- SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result *- 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*



  was:
*History *

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query *- SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result *- 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*




> Raise exception instead of silent change for new DateFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> Result - 1800-01-01 07:00:00
> implementation details - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = 

[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateFormatter

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Description: 
*History *

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query *- SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result *- 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*



  was:
*History *

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
documentation notes that "Unfortunately, the API for these functions was not 
amenable to internationalization" and that the corresponding methods in Date 
are deprecated. Because of this, the legacy code produces wrong results.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query *- SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result *- 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which 
gives the correct result but is not backward compatible. This causes 
problems when migrating to the new version, because older data written 
with Hive 1.x or 2.x is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* and 
*DateTimeFormatter* and raise an exception if they do not match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following ways - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*




> Raise exception instead of silent change for new DateFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> *History *
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> Result - 1800-01-01 07:00:00
> implementation details - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = 

[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateFormatter

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Description: 
*History *

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

Per the official documentation 
(https://docs.oracle.com/javase/8/docs/api/java/util/Date.html), "the API for 
these functions was not amenable to internationalization" and the corresponding 
methods in Date are deprecated. Because of that, this path produces a wrong result.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which gives 
the correct result but is not backward compatible. This causes issues when 
migrating to the new version, because older data written using Hive 1.x or 2.x 
is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* & *DateTimeFormatter* 
and raise an exception if they don't match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following manner - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*



  was:
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

Per the official documentation 
(https://docs.oracle.com/javase/8/docs/api/java/util/Date.html), "the API for 
these functions was not amenable to internationalization" and the corresponding 
methods in Date are deprecated. Because of that, this path produces a wrong result.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which gives 
the correct result but is not backward compatible. This causes issues when 
migrating to the new version, because older data written using Hive 1.x or 2.x 
is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* & *DateTimeFormatter* 
and raise an exception if they don't match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following manner - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*




> Raise exception instead of silent change for new DateFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> Result - 1800-01-01 07:00:00
> implementation details - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = 

[jira] [Updated] (HIVE-25576) Raise exception instead of silent change for new DateFormatter

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25576:
-
Description: 
*History*

*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

implementation details - 

SimpleDateFormat formatter = new SimpleDateFormat(pattern);
Long unixtime = formatter.parse(textval).getTime() / 1000;
Date date = new Date(unixtime * 1000L);

Per the official documentation 
(https://docs.oracle.com/javase/8/docs/api/java/util/Date.html), "the API for 
these functions was not amenable to internationalization" and the corresponding 
methods in Date are deprecated. Because of that, this path produces a wrong result.

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

*Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

*Result* - 1800-01-01 06:42:04

implementation details - 

DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
.parseCaseInsensitive()
.appendPattern(pattern)
.toFormatter();

ZonedDateTime zonedDateTime = 
ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
Long dttime = zonedDateTime.toInstant().getEpochSecond();


*Problem* - 

Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which gives 
the correct result but is not backward compatible. This causes issues when 
migrating to the new version, because older data written using Hive 1.x or 2.x 
is not compatible with *DateTimeFormatter*.

*Solution*

Introduce a config "hive.legacy.timeParserPolicy" with the following values -
EXCEPTION - compare the values from both *SimpleDateFormat* & *DateTimeFormatter* 
and raise an exception if they don't match 
LEGACY - use *SimpleDateFormat* 
CORRECTED - use *DateTimeFormatter*

This will help Hive users in the following manner - 
1. Migrate to the new version using *LEGACY*
2. Find values which are not compatible with the new version - *EXCEPTION*
3. Use the latest date APIs - *CORRECTED*



  was:
*Hive 1.2* - 

VM time zone set to Asia/Bangkok

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 07:00:00

*Master branch* - 

set hive.local.time.zone=Asia/Bangkok;

Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
UTC','yyyy-MM-dd HH:mm:ss z'));

Result - 1800-01-01 06:42:04





> Raise exception instead of silent change for new DateFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> Result - 1800-01-01 07:00:00
> implementation details - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> Per the official documentation 
> (https://docs.oracle.com/javase/8/docs/api/java/util/Date.html), "the API for 
> these functions was not amenable to internationalization" and the corresponding 
> methods in Date are deprecated. Because of that, this path produces a wrong result.
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> implementation details - 
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* - 
> Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which gives 
> the correct result but is not backward compatible. This causes issues when 
> migrating to the new version, because older data written using Hive 1.x or 2.x 
> is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> EXCEPTION - compare the values from both *SimpleDateFormat* & *DateTimeFormatter* 
> and raise an exception if they don't match 
> LEGACY - use *SimpleDateFormat* 
> CORRECTED - use *DateTimeFormatter*
> This will help Hive users in the following manner - 
> 1. Migrate to new version 

[jira] [Assigned] (HIVE-25576) Raise exception instead of silent change for new DateFormatter

2021-09-29 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25576:



> Raise exception instead of silent change for new DateFormatter
> --
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> Result - 1800-01-01 07:00:00
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> Query - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> Result - 1800-01-01 06:42:04



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17417017#comment-17417017
 ] 

Ashish Sharma commented on HIVE-25535:
--

[~dkuzmenko] I agree with you. Now that the compactor runs in a transaction, 
problems like FileNotFound will not occur. This config is intended more for 
users on HDP-3.1 and lower versions, where the lock-based Cleaner is still 
running. Backporting the transactional compactor is not straightforward, as it 
requires a metastore schema change. 

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> tries to access the Hive metastore directly instead of going through LLAP or 
> HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
> of this, if a Spark ACID job starts while compaction is happening in Hive at 
> the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory: during the Spark ACID compilation phase the delta files are 
> present, but by the time execution starts they have been deleted by the 
> compactor. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" to table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
> which allows us to delay the deletion of obsolete directories/files, but it 
> applies to all tables in the metastore, whereas this config provides table- 
> and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition-level cleanup and prevent the cleaner process from automatically 
> cleaning obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416713#comment-17416713
 ] 

Ashish Sharma commented on HIVE-25535:
--

[~dkuzmenko] update the use case in description.

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> tries to access the Hive metastore directly instead of going through LLAP or 
> HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
> of this, if a Spark ACID job starts while compaction is happening in Hive at 
> the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory: during the Spark ACID compilation phase the delta files are 
> present, but by the time execution starts they have been deleted by the 
> compactor. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" to table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
> which allows us to delay the deletion of obsolete directories/files, but it 
> applies to all tables in the metastore, whereas this config provides table- 
> and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition-level cleanup and prevent the cleaner process from automatically 
> cleaning obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416713#comment-17416713
 ] 

Ashish Sharma edited comment on HIVE-25535 at 9/17/21, 2:01 PM:


[~dkuzmenko] updated use case in description.


was (Author: ashish-kumar-sharma):
[~dkuzmenko] update the use case in description.

> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> tries to access the Hive metastore directly instead of going through LLAP or 
> HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
> of this, if a Spark ACID job starts while compaction is happening in Hive at 
> the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory: during the Spark ACID compilation phase the delta files are 
> present, but by the time execution starts they have been deleted by the 
> compactor. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" to table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
> which allows us to delay the deletion of obsolete directories/files, but it 
> applies to all tables in the metastore, whereas this config provides table- 
> and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition-level cleanup and prevent the cleaner process from automatically 
> cleaning obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
*Use Case* - 

When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
tries to access the Hive metastore directly instead of going through LLAP or 
HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
of this, if a Spark ACID job starts while compaction is happening in Hive at 
the same time, it leads to exceptions like *FileNotFound* for a delta 
directory: during the Spark ACID compilation phase the delta files are 
present, but by the time execution starts they have been deleted by the 
compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
which allows us to delay the deletion of obsolete directories/files, but it 
applies to all tables in the metastore, whereas this config provides table- 
and partition-level control.

*Solution* - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);
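
To make the intent concrete, here is a rough sketch of the cleaner-side check 
this property implies (illustrative only - the TableStub type and the helper 
name are assumptions for this sketch, not the actual Hive Cleaner code):

import java.util.Map;

// Stand-in for the metastore Table object; illustrative only.
class TableStub {
    private final Map<String, String> parameters;
    TableStub(Map<String, String> parameters) { this.parameters = parameters; }
    Map<String, String> getParameters() { return parameters; }
}

public class NoCleanupCheckSketch {

    // True when the table opted out of cleanup via 'NO_CLEANUP'=TRUE.
    static boolean isCleanupDisabled(TableStub table) {
        String flag = table.getParameters().get("NO_CLEANUP");
        return flag != null && Boolean.parseBoolean(flag);
    }

    public static void main(String[] args) {
        TableStub t = new TableStub(Map.of("NO_CLEANUP", "TRUE"));
        // A cleaner honouring the property would skip this table's obsolete files.
        System.out.println("skip cleanup: " + isCleanupDisabled(t));
    }
}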

  was:
Use Case - 

When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
tries to access the Hive metastore directly instead of going through LLAP or 
HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
of this, if a Spark ACID job starts while compaction is happening in Hive at 
the same time, it leads to exceptions like *FileNotFound* for a delta 
directory: during the Spark ACID compilation phase the delta files are 
present, but by the time execution starts they have been deleted by the 
compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
which allows us to delay the deletion of obsolete directories/files, but it 
applies to all tables in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Use Case* - 
> When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> tries to access the Hive metastore directly instead of going through LLAP or 
> HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
> of this, if a Spark ACID job starts while compaction is happening in Hive at 
> the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory: during the Spark ACID compilation phase the delta files are 
> present, but by the time execution starts they have been deleted by the 
> compactor. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" to table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
> which allows us to delay the deletion of obsolete directories/files, but it 
> applies to all tables in the metastore, whereas this config provides table- 
> and partition-level control.
> *Solution* - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition-level cleanup and prevent the cleaner process from automatically 
> cleaning obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian 

[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
tries to access the Hive metastore directly instead of going through LLAP or 
HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
of this, if a Spark ACID job starts while compaction is happening in Hive at 
the same time, it leads to exceptions like *FileNotFound* for a delta 
directory: during the Spark ACID compilation phase the delta files are 
present, but by the time execution starts they have been deleted by the 
compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
which allows us to delay the deletion of obsolete directories/files, but it 
applies to all tables in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);

  was:
Use Case - 

When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
tries to access the Hive metastore directly instead of going through LLAP or 
HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
of this, if a Spark ACID job starts while compaction is happening in Hive at 
the same time, it leads to exceptions like *FileNotFound* for a delta 
directory: during the Spark ACID compilation phase the delta files are 
present, but by the time execution starts they have been deleted by the 
compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
which allows us to delay the deletion of obsolete directories/files, but it 
applies to all tables in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> tries to access the Hive metastore directly instead of going through LLAP or 
> HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
> of this, if a Spark ACID job starts while compaction is happening in Hive at 
> the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory: during the Spark ACID compilation phase the delta files are 
> present, but by the time execution starts they have been deleted by the 
> compactor. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" to table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
> which allows us to delay the deletion of obsolete directories/files, but it 
> applies to all tables in the metastore, whereas this config provides table- 
> and partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition-level cleanup and prevent the cleaner process from automatically 
> cleaning obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira

[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
tries to access the Hive metastore directly instead of going through LLAP or 
HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
of this, if a Spark ACID job starts while compaction is happening in Hive at 
the same time, it leads to exceptions like *FileNotFound* for a delta 
directory: during the Spark ACID compilation phase the delta files are 
present, but by the time execution starts they have been deleted by the 
compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
which allows us to delay the deletion of obsolete directories/files, but it 
applies to all tables in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);

  was:
Use Case - 

When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
tries to access the Hive metastore directly instead of going through LLAP or 
HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
of this, if a Spark ACID job starts while compaction is happening in Hive at 
the same time, it leads to exceptions like FileNotFound for a delta 
directory: during the Spark ACID compilation phase the delta files are 
present, but by the time execution starts they have been deleted by the 
compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
which allows us to delay the deletion of obsolete directories/files, but it 
applies to all tables in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> tries to access the Hive metastore directly instead of going through LLAP or 
> HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
> of this, if a Spark ACID job starts while compaction is happening in Hive at 
> the same time, it leads to exceptions like *FileNotFound* for a delta 
> directory: during the Spark ACID compilation phase the delta files are 
> present, but by the time execution starts they have been deleted by the 
> compactor. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" to table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
> which allows us to delay the deletion of obsolete directories/files, but it 
> applies to all tables in the metastore, whereas this config provides table- 
> and partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition-level cleanup and prevent the cleaner process from automatically 
> cleaning obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira

[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
tries to access the Hive metastore directly instead of going through LLAP or 
HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
of this, if a Spark ACID job starts while compaction is happening in Hive at 
the same time, it leads to exceptions like FileNotFound for a delta 
directory: during the Spark ACID compilation phase the delta files are 
present, but by the time execution starts they have been deleted by the 
compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
which allows us to delay the deletion of obsolete directories/files, but it 
applies to all tables in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);

  was:
Use Case - 

When an external tool like SPARK_ACID tries to access the Hive metastore 
directly instead of going through LLAP or HS2, it lacks the ability to acquire 
locks on the metastore artifacts. Because of this, if a Spark ACID job starts 
while compaction is happening in Hive at the same time, it leads to exceptions 
like FileNotFound for a delta directory: during the Spark ACID compilation 
phase the delta files are present, but by the time execution starts they have 
been deleted by the compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
which allows us to delay the deletion of obsolete directories/files, but it 
applies to all tables in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When an external tool like [SPARK_ACID|https://github.com/qubole/spark-acid] 
> tries to access the Hive metastore directly instead of going through LLAP or 
> HS2, it lacks the ability to acquire locks on the metastore artifacts. Because 
> of this, if a Spark ACID job starts while compaction is happening in Hive at 
> the same time, it leads to exceptions like FileNotFound for a delta 
> directory: during the Spark ACID compilation phase the delta files are 
> present, but by the time execution starts they have been deleted by the 
> compactor. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" to table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
> which allows us to delay the deletion of obsolete directories/files, but it 
> applies to all tables in the metastore, whereas this config provides table- 
> and partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition-level cleanup and prevent the cleaner process from automatically 
> cleaning obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When an external tool like SPARK_ACID tries to access the Hive metastore 
directly instead of going through LLAP or HS2, it lacks the ability to acquire 
locks on the metastore artifacts. Because of this, if a Spark ACID job starts 
while compaction is happening in Hive at the same time, it leads to exceptions 
like FileNotFound for a delta directory: during the Spark ACID compilation 
phase the delta files are present, but by the time execution starts they have 
been deleted by the compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have 
"[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
which allows us to delay the deletion of obsolete directories/files, but it 
applies to all tables in the metastore, whereas this config provides table- 
and partition-level control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);

  was:
Use Case - 

When an external tool like SPARK_ACID tries to access the Hive metastore 
directly instead of going through LLAP or HS2, it lacks the ability to acquire 
locks on the metastore artifacts. Because of this, if a Spark ACID job starts 
while compaction is happening in Hive at the same time, it leads to exceptions 
like FileNotFound for a delta directory: during the Spark ACID compilation 
phase the delta files are present, but by the time execution starts they have 
been deleted by the compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have "HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED", which allows us to 
delay the deletion of obsolete directories/files, but it applies to all tables 
in the metastore, whereas this config provides table- and partition-level 
control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When an external tool like SPARK_ACID tries to access the Hive metastore 
> directly instead of going through LLAP or HS2, it lacks the ability to acquire 
> locks on the metastore artifacts. Because of this, if a Spark ACID job starts 
> while compaction is happening in Hive at the same time, it leads to exceptions 
> like FileNotFound for a delta directory: during the Spark ACID compilation 
> phase the delta files are present, but by the time execution starts they have 
> been deleted by the compactor. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" to table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have 
> "[HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED|https://github.com/apache/hive/blob/71583e322fe14a0cfcde639629b509b252b0ed2c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L3243]", 
> which allows us to delay the deletion of obsolete directories/files, but it 
> applies to all tables in the metastore, whereas this config provides table- 
> and partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition-level cleanup and prevent the cleaner process from automatically 
> cleaning obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Use Case - 

When an external tool like SPARK_ACID tries to access the Hive metastore 
directly instead of going through LLAP or HS2, it lacks the ability to acquire 
locks on the metastore artifacts. Because of this, if a Spark ACID job starts 
while compaction is happening in Hive at the same time, it leads to exceptions 
like FileNotFound for a delta directory: during the Spark ACID compilation 
phase the delta files are present, but by the time execution starts they have 
been deleted by the compactor. 

In order to tackle problems like this, I am proposing to add a config 
"NO_CLEANUP" to table properties and partition properties, which provides 
finer control over the table and partition compaction process. 

We already have "HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED", which allows us to 
delay the deletion of obsolete directories/files, but it applies to all tables 
in the metastore, whereas this config provides table- and partition-level 
control.

Solution - 

Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
partition-level cleanup and prevent the cleaner process from automatically 
cleaning obsolete directories/files.

Example - 

ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);

  was:
Add "NO_CLEANUP" in the table properties enable/disable the table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use Case - 
> When an external tool like SPARK_ACID tries to access the Hive metastore 
> directly instead of going through LLAP or HS2, it lacks the ability to acquire 
> locks on the metastore artifacts. Because of this, if a Spark ACID job starts 
> while compaction is happening in Hive at the same time, it leads to exceptions 
> like FileNotFound for a delta directory: during the Spark ACID compilation 
> phase the delta files are present, but by the time execution starts they have 
> been deleted by the compactor. 
> In order to tackle problems like this, I am proposing to add a config 
> "NO_CLEANUP" to table properties and partition properties, which provides 
> finer control over the table and partition compaction process. 
> We already have "HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED", which allows us to 
> delay the deletion of obsolete directories/files, but it applies to all 
> tables in the metastore, whereas this config provides table- and 
> partition-level control.
> Solution - 
> Add "NO_CLEANUP" to the table properties to enable/disable table-level and 
> partition-level cleanup and prevent the cleaner process from automatically 
> cleaning obsolete directories/files.
> Example - 
> ALTER TABLE <table_name> SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25535) Control cleaning obsolete directories/files of a table via property

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25535:
-
Description: 
Add "NO_CLEANUP" in the table properties enable/disable the table-level and 
partition cleanup and prevent the cleaner process from automatically cleaning 
obsolete directories/files.

Example - 

ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);

  was:
Add "NO_CLEANUP" in the table properties enable/disable the table-level cleanup 
and prevent the cleaner process from automatically cleaning obsolete 
directories/files.

Example - 

ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);


> Control cleaning obsolete directories/files of a table via property
> ---
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level and 
> partition cleanup and prevent the cleaner process from automatically cleaning 
> obsolete directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-25535) Adding table property "NO_CLEANUP"

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25535 started by Ashish Sharma.

> Adding table property "NO_CLEANUP"
> --
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level 
> cleanup and prevent the cleaner process from automatically cleaning obsolete 
> directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25535) Adding table property "NO_CLEANUP"

2021-09-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25535:



> Adding table property "NO_CLEANUP"
> --
>
> Key: HIVE-25535
> URL: https://issues.apache.org/jira/browse/HIVE-25535
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Add "NO_CLEANUP" in the table properties enable/disable the table-level 
> cleanup and prevent the cleaner process from automatically cleaning obsolete 
> directories/files.
> Example - 
> ALTER TABLE  SET TBLPROPERTIES('NO_CLEANUP'=FALSE/TRUE);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-22738) CVE-2019-0205

2021-09-15 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-22738.
--
Resolution: Duplicate

> CVE-2019-0205
> -
>
> Key: HIVE-22738
> URL: https://issues.apache.org/jira/browse/HIVE-22738
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Laurent Goujon
>Priority: Major
>
> There has been a CVE issued for a Thrift vulnerability which might impact 
> Hive. The CVE is 
> [CVE-2019-0205|https://nvd.nist.gov/vuln/detail/CVE-2019-0205], impacts both 
> clients and servers, and might cause a denial of service through an infinite 
> loop.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25233) Removing deprecated unix_timestamp UDF

2021-09-04 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-25233.
--
Resolution: Invalid

> Removing deprecated unix_timestamp UDF
> --
>
> Key: HIVE-25233
> URL: https://issues.apache.org/jira/browse/HIVE-25233
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Affects Versions: All Versions
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Description
> The unix_timestamp() UDF was deprecated as part of 
> https://issues.apache.org/jira/browse/HIVE-10728. Internally, 
> GenericUDFUnixTimeStamp extends GenericUDFToUnixTimeStamp and calls 
> to_utc_timestamp() for unix_timestamp(string date) & unix_timestamp(string 
> date, string pattern).
> unix_timestamp()   => CURRENT_TIMESTAMP
> unix_timestamp(string date) => to_unix_timestamp()
> unix_timestamp(string date, string pattern) => to_unix_timestamp()
> We should clean up unix_timestamp() and point it to to_unix_timestamp()
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work stopped] (HIVE-25233) Removing deprecated unix_timestamp UDF

2021-09-04 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25233 stopped by Ashish Sharma.

> Removing deprecated unix_timestamp UDF
> --
>
> Key: HIVE-25233
> URL: https://issues.apache.org/jira/browse/HIVE-25233
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Affects Versions: All Versions
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Description
> The unix_timestamp() UDF was deprecated as part of 
> https://issues.apache.org/jira/browse/HIVE-10728. Internally, 
> GenericUDFUnixTimeStamp extends GenericUDFToUnixTimeStamp and calls 
> to_utc_timestamp() for unix_timestamp(string date) & unix_timestamp(string 
> date, string pattern).
> unix_timestamp()   => CURRENT_TIMESTAMP
> unix_timestamp(string date) => to_unix_timestamp()
> unix_timestamp(string date, string pattern) => to_unix_timestamp()
> We should clean up unix_timestamp() and point it to to_unix_timestamp()
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-25499) select unix_timestamp(dt) from table and select unix_timestamp(constant date) are different

2021-09-04 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409933#comment-17409933
 ] 

Ashish Sharma edited comment on HIVE-25499 at 9/4/21, 10:41 AM:


query - 
create table testdate(dt date);
insert into testdate values('0001-12-30');
select * from testdate;
select unix_timestamp(dt) from testdate;
select unix_timestamp('0001-12-30', 'yyyy-MM-dd');

output - 
PREHOOK: query: create table testdate(dt date)
PREHOOK: type: CREATETABLE
PREHOOK: Output: database:default
PREHOOK: Output: default@testdate
POSTHOOK: query: create table testdate(dt date)
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: database:default
POSTHOOK: Output: default@testdate
PREHOOK: query: insert into testdate values('0001-12-30')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
PREHOOK: Output: default@testdate
POSTHOOK: query: insert into testdate values('0001-12-30')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
POSTHOOK: Output: default@testdate
POSTHOOK: Lineage: testdate.dt SCRIPT []
PREHOOK: query: select * from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
POSTHOOK: query: select * from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
0001-12-30
PREHOOK: query: select unix_timestamp(dt) from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
POSTHOOK: query: select unix_timestamp(dt) from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
*-62104205222*
PREHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
POSTHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
*-62104205222*



was (Author: ashish-kumar-sharma):
query - 
create table testdate(dt date);
insert into testdate values('0001-12-30');
select * from testdate;
select unix_timestamp(dt) from testdate;
select unix_timestamp('0001-12-30', 'yyyy-MM-dd');

output - 
PREHOOK: query: create table testdate(dt date)
PREHOOK: type: CREATETABLE
PREHOOK: Output: database:default
PREHOOK: Output: default@testdate
POSTHOOK: query: create table testdate(dt date)
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: database:default
POSTHOOK: Output: default@testdate
PREHOOK: query: insert into testdate values('0001-12-30')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
PREHOOK: Output: default@testdate
POSTHOOK: query: insert into testdate values('0001-12-30')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
POSTHOOK: Output: default@testdate
POSTHOOK: Lineage: testdate.dt SCRIPT []
PREHOOK: query: select * from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
POSTHOOK: query: select * from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
0001-12-30
PREHOOK: query: select unix_timestamp(dt) from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
POSTHOOK: query: select unix_timestamp(dt) from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
-62104205222
PREHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
POSTHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
-62104205222


> select unix_timestamp(dt) from table and select unix_timestamp(constant date) 
>  are different
> 
>
> Key: HIVE-25499
> URL: https://issues.apache.org/jira/browse/HIVE-25499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: zhaolong
>Assignee: Ashish Sharma
>Priority: Major
>
> I found select unix_timestamp(date column) from table and select 
> unix_timestamp(constant date) are different in 3.1.2, for example:
> create table testdate(dt date);
> insert into testdate values('0001-12-30');
> select * from testdate; --> 0001-12-30
> select unix_timestamp(dt) from testdate; --> -62104233600
> select unix_timestamp('0001-12-30', 'yyyy-MM-dd'); --> -62104406400
> the -62104233600 is different from -62104406400.
>  
> and the converted timestamp values are:
> select from_unixtime(-62104233600); --> 0002-01-01 00:00:00 (-62104233600 is 
> the select unix_timestamp(date column) value for the row whose date is 0001-12-30).
> select from_unixtime(-62104406400); --> 0001-12-30 00:00:00
>  
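
The two-day gap between those epoch values matches the Julian vs. proleptic 
Gregorian calendar skew around year 1. A small standalone sketch (plain JDK 
code, not Hive code) that reproduces both numbers:

import java.time.LocalDate;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class CalendarSkewSketch {
    public static void main(String[] args) {
        // java.time uses the proleptic Gregorian calendar.
        long proleptic = LocalDate.parse("0001-12-30").toEpochDay() * 86400L;
        // java.util.GregorianCalendar is hybrid: Julian before the 1582 cutover.
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(1, Calendar.DECEMBER, 30);
        long hybrid = cal.getTimeInMillis() / 1000L;
        System.out.println(proleptic); // -62104233600
        System.out.println(hybrid);    // -62104406400
    }
}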



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-25499) select unix_timestamp(dt) from table and select unix_timestamp(constant date) are different

2021-09-04 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409933#comment-17409933
 ] 

Ashish Sharma edited comment on HIVE-25499 at 9/4/21, 10:41 AM:


query - 
create table testdate(dt date);
insert into testdate values('0001-12-30');
select * from testdate;
select unix_timestamp(dt) from testdate;
select unix_timestamp('0001-12-30', 'yyyy-MM-dd');

output - 
PREHOOK: query: create table testdate(dt date)
PREHOOK: type: CREATETABLE
PREHOOK: Output: database:default
PREHOOK: Output: default@testdate
POSTHOOK: query: create table testdate(dt date)
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: database:default
POSTHOOK: Output: default@testdate
PREHOOK: query: insert into testdate values('0001-12-30')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
PREHOOK: Output: default@testdate
POSTHOOK: query: insert into testdate values('0001-12-30')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
POSTHOOK: Output: default@testdate
POSTHOOK: Lineage: testdate.dt SCRIPT []
PREHOOK: query: select * from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
POSTHOOK: query: select * from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
0001-12-30
PREHOOK: query: select unix_timestamp(dt) from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
POSTHOOK: query: select unix_timestamp(dt) from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
-62104205222
PREHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
POSTHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
-62104205222



was (Author: ashish-kumar-sharma):
query - 
create table testdate(dt date);
insert into testdate values('0001-12-30');
select * from testdate;
select unix_timestamp(dt) from testdate;
select unix_timestamp('0001-12-30', 'yyyy-MM-dd');

output - 
PREHOOK: query: create table testdate(dt date)
PREHOOK: type: CREATETABLE
PREHOOK: Output: database:default
PREHOOK: Output: default@testdate
POSTHOOK: query: create table testdate(dt date)
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: database:default
POSTHOOK: Output: default@testdate
PREHOOK: query: insert into testdate values('0001-12-30')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
PREHOOK: Output: default@testdate
POSTHOOK: query: insert into testdate values('0001-12-30')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
POSTHOOK: Output: default@testdate
POSTHOOK: Lineage: testdate.dt SCRIPT []
PREHOOK: query: select * from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
#### A masked pattern was here ####
POSTHOOK: query: select * from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
#### A masked pattern was here ####
0001-12-30
PREHOOK: query: select unix_timestamp(dt) from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
#### A masked pattern was here ####
POSTHOOK: query: select unix_timestamp(dt) from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
#### A masked pattern was here ####
-62104205222
PREHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
#### A masked pattern was here ####
POSTHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
#### A masked pattern was here ####
-62104205222


> select unix_timestamp(dt) from table and select unix_timestamp(constant date) 
>  are different
> 
>
> Key: HIVE-25499
> URL: https://issues.apache.org/jira/browse/HIVE-25499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: zhaolong
>Assignee: Ashish Sharma
>Priority: Major
>
> I found select unix_timestamp(date column) from table and select 
> unix_timestamp(constant date) are different in 3.1.2, for example:
> create table testdate(dt date);
> insert into testdate values('0001-12-30');
> select * from testdate; --> 0001-12-30
> select unix_timestamp(dt) from testdate; --> -62104233600
> select unix_timestamp('0001-12-30', 'yyyy-MM-dd'); --> -62104406400
> the result -62104233600 differs from -62104406400.
>  
> and converting the timestamp values back:
> select from_unixtime(-62104233600); --> 0002-01-01 00:00:00 (-62104233600 is 
> what select unix_timestamp(date column) from table returns when the date is 
> 0001-12-30).
> select from_unixtime(-62104406400); --> 0001-12-30 00:00:00
>  
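
The two-day gap between the two old results is evidently a calendar-system mismatch: the string path parsed "0001-12-30" with the legacy hybrid Julian/Gregorian calendar, while the date path stored the value as a proleptic Gregorian epoch day. A standalone plain-JDK sketch of that skew (not the actual Hive code; UTC is used here, whereas the numbers above include a local-zone offset):

{code:java}
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class CalendarSkewDemo {
    public static void main(String[] args) {
        // Proleptic Gregorian (java.time): Gregorian rules apply to all dates,
        // even before 1582. The DateTimeFormatter-based path sees this.
        long proleptic = LocalDate.of(1, 12, 30)
                .atStartOfDay(ZoneOffset.UTC).toEpochSecond();

        // Hybrid Julian/Gregorian (java.util): dates before 1582 follow Julian
        // rules, which is what SimpleDateFormat/Calendar-based code sees.
        Calendar hybrid = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        hybrid.clear();
        hybrid.set(1, Calendar.DECEMBER, 30);

        System.out.println(proleptic);                        // -62104233600
        System.out.println(hybrid.getTimeInMillis() / 1000L); // -62104406400
    }
}
{code}

The same wall-clock label "0001-12-30" maps to two instants 172800 seconds (two days) apart, which is exactly the gap between -62104233600 and -62104406400 in the report.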



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25499) select unix_timestamp(dt) from table and select unix_timestamp(constant date) are different

2021-09-04 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-25499.
--
Resolution: Fixed

query - 
create table testdate(dt date);
insert into testdate values('0001-12-30');
select * from testdate;
select unix_timestamp(dt) from testdate;
select unix_timestamp('0001-12-30', 'yyyy-MM-dd');

output - 
PREHOOK: query: create table testdate(dt date)
PREHOOK: type: CREATETABLE
PREHOOK: Output: database:default
PREHOOK: Output: default@testdate
POSTHOOK: query: create table testdate(dt date)
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: database:default
POSTHOOK: Output: default@testdate
PREHOOK: query: insert into testdate values('0001-12-30')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
PREHOOK: Output: default@testdate
POSTHOOK: query: insert into testdate values('0001-12-30')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
POSTHOOK: Output: default@testdate
POSTHOOK: Lineage: testdate.dt SCRIPT []
PREHOOK: query: select * from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
#### A masked pattern was here ####
POSTHOOK: query: select * from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
#### A masked pattern was here ####
0001-12-30
PREHOOK: query: select unix_timestamp(dt) from testdate
PREHOOK: type: QUERY
PREHOOK: Input: default@testdate
#### A masked pattern was here ####
POSTHOOK: query: select unix_timestamp(dt) from testdate
POSTHOOK: type: QUERY
POSTHOOK: Input: default@testdate
#### A masked pattern was here ####
-62104205222
PREHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
PREHOOK: type: QUERY
PREHOOK: Input: _dummy_database@_dummy_table
#### A masked pattern was here ####
POSTHOOK: query: select unix_timestamp('0001-12-30', 'yyyy-MM-dd')
POSTHOOK: type: QUERY
POSTHOOK: Input: _dummy_database@_dummy_table
#### A masked pattern was here ####
-62104205222


> select unix_timestamp(dt) from table and select unix_timestamp(constant date) 
>  are different
> 
>
> Key: HIVE-25499
> URL: https://issues.apache.org/jira/browse/HIVE-25499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: zhaolong
>Assignee: Ashish Sharma
>Priority: Major
>
> I found select unix_timestamp(date column) from table and select 
> unix_timestamp(constant date) are different in 3.1.2, for example:
> create table testdate(dt date);
> insert into testdate values('0001-12-30');
> select * from testdate; --> 0001-12-30
> select unix_timestamp(dt) from testdate; --> -62104233600
> select unix_timestamp('0001-12-30', 'yyyy-MM-dd'); --> -62104406400
> the result -62104233600 differs from -62104406400.
>  
> and converting the timestamp values back:
> select from_unixtime(-62104233600); --> 0002-01-01 00:00:00 (-62104233600 is 
> what select unix_timestamp(date column) from table returns when the date is 
> 0001-12-30).
> select from_unixtime(-62104406400); --> 0001-12-30 00:00:00
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25499) select unix_timestamp(dt) from table and select unix_timestamp(constant date) are different

2021-09-04 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25499:


Assignee: Ashish Sharma

> select unix_timestamp(dt) from table and select unix_timestamp(constant date) 
>  are different
> 
>
> Key: HIVE-25499
> URL: https://issues.apache.org/jira/browse/HIVE-25499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: zhaolong
>Assignee: Ashish Sharma
>Priority: Major
>
> I found select unix_timestamp(date column) from table and select 
> unix_timestamp(constant date) are different in 3.1.2, for example:
> create table testdate(dt date);
> insert into testdate values('0001-12-30');
> select * from testdate; --> 0001-12-30
> select unix_timestamp(dt) from testdate; --> -62104233600
> select unix_timestamp('0001-12-30', 'yyyy-MM-dd'); --> -62104406400
> the result -62104233600 differs from -62104406400.
>  
> and converting the timestamp values back:
> select from_unixtime(-62104233600); --> 0002-01-01 00:00:00 (-62104233600 is 
> what select unix_timestamp(date column) from table returns when the date is 
> 0001-12-30).
> select from_unixtime(-62104406400); --> 0001-12-30 00:00:00
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25458) unix_timestamp() with string input gives wrong result

2021-08-19 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25458:
-
Summary: unix_timestamp() with string input gives wrong result  (was: Make 
Date/Timestamp parser from LENIENT to STRICT in unix_timestamp())

> unix_timestamp() with string input gives wrong result
> 
>
> Key: HIVE-25458
> URL: https://issues.apache.org/jira/browse/HIVE-25458
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Description - 
> unix_timestamp() accepts 4 input types: string, date, timestamp and timestamptz. 
> Of these, date/timestamp/timestamptz use DateTimeFormatter, whereas the string 
> type uses the legacy SimpleDateFormat, which causes the difference in values.
> Example - 
> select from_unixtime(unix_timestamp('1800-11-08 01:53:11'));
> 1800-11-08 01:35:15
> select from_unixtime(unix_timestamp(cast('1800-11-08 01:53:11' as 
> timestamp)));
> 1800-11-08 01:53:11
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25458) Make Date/Timestamp parser from LENIENT to STRICT in unix_timestamp()

2021-08-19 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25458:
-
Description: 
Description - 

unix_timestamp() accepts 4 input types: string, date, timestamp and timestamptz. 
Of these, date/timestamp/timestamptz use DateTimeFormatter, whereas the string 
type uses the legacy SimpleDateFormat, which causes the difference in values.

Example - 
select from_unixtime(unix_timestamp('1800-11-08 01:53:11'));
1800-11-08 01:35:15

select from_unixtime(unix_timestamp(cast('1800-11-08 01:53:11' as timestamp)));
1800-11-08 01:53:11
  

  was:
Description - 

set hive.local.time.zone=Asia/Bangkok;

select from_unixtime(unix_timestamp('1400-11-08 01:53:11'));
Result - 1400-11-17 01:35:15
Expected - 1400-11-08 01:53:11

select from_unixtime(unix_timestamp('1800-11-08 01:53:11'));
Result - 1800-11-08 01:35:15
Expected - 1800-11-08 01:53:11

select from_unixtime(unix_timestamp('1400-11-08 08:00:00 ICT', 'yyyy-MM-dd 
HH:mm:ss z'));
Result - 1400-11-17 07:42:04
Expected - 1400-11-08 08:00:00 


select from_unixtime(unix_timestamp('1800-11-08 08:00:00 ICT', 'yyyy-MM-dd 
HH:mm:ss z'));
Result - 1800-11-08 07:42:04
Expected - 1800-11-08 08:00:00 








> Make Date/Timestamp parser from LENIENT to STRICT in unix_timestamp()
> -
>
> Key: HIVE-25458
> URL: https://issues.apache.org/jira/browse/HIVE-25458
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Description - 
> unix_timestamp() accepts 4 input types: string, date, timestamp and timestamptz. 
> Of these, date/timestamp/timestamptz use DateTimeFormatter, whereas the string 
> type uses the legacy SimpleDateFormat, which causes the difference in values.
> Example - 
> select from_unixtime(unix_timestamp('1800-11-08 01:53:11'));
> 1800-11-08 01:35:15
> select from_unixtime(unix_timestamp(cast('1800-11-08 01:53:11' as 
> timestamp)));
> 1800-11-08 01:53:11
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25458) Make Date/Timestamp parser from LENIENT to STRICT in unix_timestamp()

2021-08-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25458:
-
Summary: Make Date/Timestamp parser from LENIENT to STRICT in 
unix_timestamp()  (was: Combination of from_unixtime() and unix_timestamp() is 
giving wrong result)

> Make Date/Timestamp parser from LENIENT to STRICT in unix_timestamp()
> -
>
> Key: HIVE-25458
> URL: https://issues.apache.org/jira/browse/HIVE-25458
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Description - 
> set hive.local.time.zone=Asia/Bangkok;
> select from_unixtime(unix_timestamp('1400-11-08 01:53:11'));
> Result - 1400-11-17 01:35:15
> Expected - 1400-11-08 01:53:11
> select from_unixtime(unix_timestamp('1800-11-08 01:53:11'));
> Result - 1800-11-08 01:35:15
> Expected - 1800-11-08 01:53:11
> select from_unixtime(unix_timestamp('1400-11-08 08:00:00 ICT', 'yyyy-MM-dd 
> HH:mm:ss z'));
> Result - 1400-11-17 07:42:04
> Expected - 1400-11-08 08:00:00 
> select from_unixtime(unix_timestamp('1800-11-08 08:00:00 ICT', 'yyyy-MM-dd 
> HH:mm:ss z'));
> Result - 1800-11-08 07:42:04
> Expected - 1800-11-08 08:00:00 
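
One plausible mechanism for the constant 17m56s shift in these results (a plain-JDK sketch, not the actual Hive code path): the repro zone Asia/Bangkok used local mean time, UTC+6:42:04, before 1920, and a parser that applies the zone's raw UTC+7:00 offset to such old timestamps lands exactly 17 minutes 56 seconds off.

{code:java}
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.TimeZone;

public class OffsetSkewDemo {
    public static void main(String[] args) {
        // Full TZDB history: Bangkok kept local mean time until 1920.
        ZoneId zone = ZoneId.of("Asia/Bangkok");
        LocalDateTime ldt = LocalDateTime.of(1800, 11, 8, 8, 0, 0);
        System.out.println(zone.getRules().getOffset(ldt)); // +06:42:04

        // The zone's raw offset, which offset-naive code applies instead.
        int rawSeconds = TimeZone.getTimeZone("Asia/Bangkok").getRawOffset() / 1000;
        System.out.println(rawSeconds); // 25200, i.e. +07:00
        // 7:00:00 - 6:42:04 = 17m56s, the shift seen in the results above.
    }
}
{code}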



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25458) Combination of from_unixtime() and unix_timestamp() is giving wrong result

2021-08-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25458:
-
Summary: Combination of from_unixtime() and unix_timestamp() is giving 
wrong result  (was: combination of from_unixtime and unix_timestamp is giving 
wrong result)

> Combination of from_unixtime() and unix_timestamp() is giving wrong result
> --
>
> Key: HIVE-25458
> URL: https://issues.apache.org/jira/browse/HIVE-25458
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>
> Description - 
> set hive.local.time.zone=Asia/Bangkok;
> select from_unixtime(unix_timestamp('1400-11-08 01:53:11'));
> Result - 1400-11-17 01:35:15
> Expected - 1400-11-08 01:53:11
> select from_unixtime(unix_timestamp('1800-11-08 01:53:11'));
> Result - 1800-11-08 01:35:15
> Expected - 1800-11-08 01:53:11
> select from_unixtime(unix_timestamp('1400-11-08 08:00:00 ICT', 'yyyy-MM-dd 
> HH:mm:ss z'));
> Result - 1400-11-17 07:42:04
> Expected - 1400-11-08 08:00:00 
> select from_unixtime(unix_timestamp('1800-11-08 08:00:00 ICT', 'yyyy-MM-dd 
> HH:mm:ss z'));
> Result - 1800-11-08 07:42:04
> Expected - 1800-11-08 08:00:00 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25458) combination of from_unixtime and unix_timestamp is giving wrong result

2021-08-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25458:


Assignee: Ashish Sharma

> combination of from_unixtime and unix_timestamp is giving wrong result
> --
>
> Key: HIVE-25458
> URL: https://issues.apache.org/jira/browse/HIVE-25458
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>
> Description - 
> set hive.local.time.zone=Asia/Bangkok;
> select from_unixtime(unix_timestamp('1400-11-08 01:53:11'));
> Result - 1400-11-17 01:35:15
> Expected - 1400-11-08 01:53:11
> select from_unixtime(unix_timestamp('1800-11-08 01:53:11'));
> Result - 1800-11-08 01:35:15
> Expected - 1800-11-08 01:53:11
> select from_unixtime(unix_timestamp('1400-11-08 08:00:00 ICT', 'yyyy-MM-dd 
> HH:mm:ss z'));
> Result - 1400-11-17 07:42:04
> Expected - 1400-11-08 08:00:00 
> select from_unixtime(unix_timestamp('1800-11-08 08:00:00 ICT', 'yyyy-MM-dd 
> HH:mm:ss z'));
> Result - 1800-11-08 07:42:04
> Expected - 1800-11-08 08:00:00 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-25458) combination of from_unixtime and unix_timestamp is giving wrong result

2021-08-17 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25458 started by Ashish Sharma.

> combination of from_unixtime and unix_timestamp is giving wrong result
> --
>
> Key: HIVE-25458
> URL: https://issues.apache.org/jira/browse/HIVE-25458
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>
> Description - 
> set hive.local.time.zone=Asia/Bangkok;
> select from_unixtime(unix_timestamp('1400-11-08 01:53:11'));
> Result - 1400-11-17 01:35:15
> Expected - 1400-11-08 01:53:11
> select from_unixtime(unix_timestamp('1800-11-08 01:53:11'));
> Result - 1800-11-08 01:35:15
> Expected - 1800-11-08 01:53:11
> select from_unixtime(unix_timestamp('1400-11-08 08:00:00 ICT', 'yyyy-MM-dd 
> HH:mm:ss z'));
> Result - 1400-11-17 07:42:04
> Expected - 1400-11-08 08:00:00 
> select from_unixtime(unix_timestamp('1800-11-08 08:00:00 ICT', 'yyyy-MM-dd 
> HH:mm:ss z'));
> Result - 1800-11-08 07:42:04
> Expected - 1800-11-08 08:00:00 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25392) Refactor UDFToInteger to GenericUDFToInteger

2021-07-27 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25392:



> Refactor UDFToInteger to GenericUDFToInteger
> 
>
> Key: HIVE-25392
> URL: https://issues.apache.org/jira/browse/HIVE-25392
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Refactor UDFToInteger to move from UDF to GenericUDF.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-25392) Refactor UDFToInteger to GenericUDFToInteger

2021-07-27 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25392 started by Ashish Sharma.

> Refactor UDFToInteger to GenericUDFToInteger
> 
>
> Key: HIVE-25392
> URL: https://issues.apache.org/jira/browse/HIVE-25392
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Refactor UDFToInteger to move from UDF to GenericUDF.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-25306) Change Date/Timestamp parser from LENIENT to STRICT

2021-07-25 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387038#comment-17387038
 ] 

Ashish Sharma edited comment on HIVE-25306 at 7/26/21, 5:08 AM:


[~zabetak] Updated the linked items. Thank you for the review and for merging 
the code!


was (Author: ashish-kumar-sharma):
[~zabetak] Thank you for the review!

> Change Date/Timestamp parser from LENIENT to STRICT
> ---
>
> Key: HIVE-25306
> URL: https://issues.apache.org/jira/browse/HIVE-25306
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, UDF
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: backwards-compatibility, datetime, 
> pull-request-available, timestamp
> Fix For: 4.0.0
>
> Attachments: DB_compare.JPG
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Description - 
> Currently Date.java and Timestamp.java use DateTimeFormatter to parse and 
> convert dates/timestamps from int, string, char etc. to Date or Timestamp. 
> The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date 
> like "1992-13-12" is converted to "2000-01-12". 
> Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like 
> "1992-13-12" is no longer converted; NULL is returned instead.
> https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing
>  
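
For reference, the LENIENT-versus-STRICT behaviour is easy to reproduce with plain java.time (a standalone sketch, not Hive's Date.java; note that STRICT resolution needs the 'u' year pattern, because 'y' is year-of-era and would require an era field). Plain java.time rolls "1992-13-12" to 1993-01-12; the "2000-01-12" quoted above is what Hive's own wrapper produced:

{code:java}
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

public class ResolverStyleDemo {
    public static void main(String[] args) {
        String bad = "1992-13-12"; // month 13 does not exist

        // LENIENT: out-of-range fields are rolled forward instead of rejected.
        DateTimeFormatter lenient = DateTimeFormatter.ofPattern("yyyy-MM-dd")
                .withResolverStyle(ResolverStyle.LENIENT);
        System.out.println(LocalDate.parse(bad, lenient)); // 1993-01-12

        // STRICT: the same input is rejected at resolution time, so Hive can
        // return NULL instead of a silently shifted date.
        DateTimeFormatter strict = DateTimeFormatter.ofPattern("uuuu-MM-dd")
                .withResolverStyle(ResolverStyle.STRICT);
        try {
            LocalDate.parse(bad, strict);
        } catch (DateTimeParseException e) {
            System.out.println("NULL"); // the parse error surfaces as NULL
        }
    }
}
{code}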



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-2557) Inserting illegal timestamp behaves differently than in MySql

2021-07-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-2557.
-
Resolution: Duplicate

fixed as part of - https://issues.apache.org/jira/browse/HIVE-25306

> Inserting illegal timestamp behaves differently than in MySql
> -
>
> Key: HIVE-2557
> URL: https://issues.apache.org/jira/browse/HIVE-2557
> Project: Hive
>  Issue Type: Bug
>Reporter: Robert Surówka
>Assignee: Ashish Sharma
>Priority: Trivial
>
> In MySql there is:
> Illegal DATETIME, DATE, or TIMESTAMP values are converted to the "zero" value 
> of the appropriate type ('0000-00-00 00:00:00' or '0000-00-00'). ( 
> http://dev.mysql.com/doc/refman/5.1/en/datetime.html ), yet in Hive we have 
> e.g.:
> hive> insert into table rrt select '1970-01-51 00:00:05' from src_copy limit 
> 1;
> hive> select * from rrt;
> 1970-02-20 00:00:05
> It is probably something to be discussed whether to change this, but at least 
> we should be aware of the current inconsistency, especially when 
> documentation for timestamp is added. 
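
The 1970-02-20 row is the same lenient overflow shown in the HIVE-25306 sketch above, this time on the day-of-month field: day 51 resolves to 50 days after January 1. A standalone java.time sketch (not the actual Hive parser) reproduces the reported value:

{code:java}
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.ResolverStyle;

public class LenientOverflowDemo {
    public static void main(String[] args) {
        DateTimeFormatter lenient = DateTimeFormatter
                .ofPattern("yyyy-MM-dd HH:mm:ss")
                .withResolverStyle(ResolverStyle.LENIENT);
        // Day 51 rolls 50 days past Jan 1, i.e. 20 days into February.
        System.out.println(LocalDateTime.parse("1970-01-51 00:00:05", lenient));
        // prints 1970-02-20T00:00:05, matching the row Hive stored
    }
}
{code}

With STRICT resolution the same literal fails to parse, and NULL is arguably closer to MySQL's zero-value convention than a silently shifted date.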



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-23752) Cast as Date for invalid date produce the valid output

2021-07-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-23752.
--
  Assignee: Ashish Sharma
Resolution: Duplicate

fixed as part of https://issues.apache.org/jira/browse/HIVE-25306

> Cast as Date for invalid date produce the valid output
> --
>
> Key: HIVE-23752
> URL: https://issues.apache.org/jira/browse/HIVE-23752
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Rajkumar Singh
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: hive
>
> Hive-3:
> {code:java}
> select cast("0000-00-00" as date) 
> 0002-11-30 
> select cast("2010-27-54" as date)
>  2012-04-23
> select cast("1992-00-74" as date) ;
> 1992-02-12
> {code}
> The reason Hive allows this is that the parser format is set to LENIENT 
> (https://github.com/apache/hive/blob/ae008b79b5d52ed6a38875b73025a505725828eb/common/src/java/org/apache/hadoop/hive/common/type/Date.java#L50);
>  this seems to be an intentional choice, as changing the ResolverStyle to 
> STRICT starts failing the tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-2557) Inserting illegal timestamp behaves differently that in MySql

2021-07-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-2557:
---

Assignee: Ashish Sharma

> Inserting illegal timestamp behaves differently than in MySql
> -
>
> Key: HIVE-2557
> URL: https://issues.apache.org/jira/browse/HIVE-2557
> Project: Hive
>  Issue Type: Bug
>Reporter: Robert Surówka
>Assignee: Ashish Sharma
>Priority: Trivial
>
> In MySql there is:
> Illegal DATETIME, DATE, or TIMESTAMP values are converted to the "zero" value 
> of the appropriate type ('0000-00-00 00:00:00' or '0000-00-00'). ( 
> http://dev.mysql.com/doc/refman/5.1/en/datetime.html ), yet in Hive we have 
> e.g.:
> hive> insert into table rrt select '1970-01-51 00:00:05' from src_copy limit 
> 1;
> hive> select * from rrt;
> 1970-02-20 00:00:05
> It is probably something to be discussed whether to change this, but at least 
> we should be aware of the current inconsistency, especially when 
> documentation for timestamp is added. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-23129) Cast invalid string to date returns incorrect result

2021-07-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-23129.
--
Resolution: Duplicate

fixed as part of https://issues.apache.org/jira/browse/HIVE-25306

> Cast invalid string to date returns incorrect result
> 
>
> Key: HIVE-23129
> URL: https://issues.apache.org/jira/browse/HIVE-23129
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Yuming Wang
>Assignee: wenjun ma
>Priority: Major
>
> {noformat}
> hive> select cast('2020-20-20' as date);
> OK
> 2021-08-20
> Time taken: 4.436 seconds, Fetched: 1 row(s)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-17038) invalid result when CAST-ing to DATE

2021-07-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-17038.
--
Resolution: Duplicate

fixed as part of https://issues.apache.org/jira/browse/HIVE-25306

> invalid result when CAST-ing to DATE
> 
>
> Key: HIVE-17038
> URL: https://issues.apache.org/jira/browse/HIVE-17038
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Hive
>Affects Versions: 1.2.1
>Reporter: Jim Hopper
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When casting incorrect date literals to the DATE data type, Hive returns wrong 
> values instead of NULL.
> {code}
> SELECT CAST('2017-02-31' AS DATE);
> SELECT CAST('2017-04-31' AS DATE);
> {code}
> Some examples below where it really can produce weird results:
> {code}
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = '2017-06-31';
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = cast('2017-06-31' as date);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-17038) invalid result when CAST-ing to DATE

2021-07-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-17038:


Assignee: Ashish Sharma

> invalid result when CAST-ing to DATE
> 
>
> Key: HIVE-17038
> URL: https://issues.apache.org/jira/browse/HIVE-17038
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Hive
>Affects Versions: 1.2.1
>Reporter: Jim Hopper
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When casting incorrect date literals to the DATE data type, Hive returns wrong 
> values instead of NULL.
> {code}
> SELECT CAST('2017-02-31' AS DATE);
> SELECT CAST('2017-04-31' AS DATE);
> {code}
> Some examples below where it really can produce weird results:
> {code}
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = '2017-06-31';
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = cast('2017-06-31' as date);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25056) cast ('0000-00-00 00:00:00' as timestamp/datetime) results in wrong conversion

2021-07-25 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-25056.
--
Resolution: Duplicate

Fixed as part of - https://issues.apache.org/jira/browse/HIVE-25306

> cast ('0000-00-00 00:00:00' as timestamp/datetime) results in wrong conversion 
> --
>
> Key: HIVE-25056
> URL: https://issues.apache.org/jira/browse/HIVE-25056
> Project: Hive
>  Issue Type: Bug
>Reporter: Anurag Shekhar
>Assignee: Anurag Shekhar
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> select cast ('0000-00-00' as date), cast ('0000-00-00 00:00:00' as timestamp) 
> +-------------+------------------------+
> |     _c0     |          _c1           |
> +-------------+------------------------+
> | 0002-11-30  | 0002-11-30 00:00:00.0  |
> +-------------+------------------------+



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25306) Change Date/Timestamp parser from LENIENT to STRICT

2021-07-25 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387038#comment-17387038
 ] 

Ashish Sharma commented on HIVE-25306:
--

[~zabetak] Thank you for the review!

> Change Date/Timestamp parser from LENIENT to STRICT
> ---
>
> Key: HIVE-25306
> URL: https://issues.apache.org/jira/browse/HIVE-25306
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, UDF
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: backwards-compatibility, datetime, 
> pull-request-available, timestamp
> Fix For: 4.0.0
>
> Attachments: DB_compare.JPG
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Description - 
> Currently Date.java and Timestamp.java use DateTimeFormatter to parse and 
> convert dates/timestamps from int, string, char etc. to Date or Timestamp. 
> The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date 
> like "1992-13-12" is converted to "2000-01-12". 
> Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like 
> "1992-13-12" is no longer converted; NULL is returned instead.
> https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT

2021-07-20 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383902#comment-17383902
 ] 

Ashish Sharma commented on HIVE-25306:
--

 !DB_compare.JPG! 

From the comparison it is quite clear that date and timestamp formatting was 
much tighter in older versions of Hive. For most wrong date inputs, NULL 
was the standard response instead of an exception. 

Also, when I went through the code I found that while doing the vectorized 
implementation of some of the date-related UDFs like datediff, MySQL was 
taken as the [gold 
standard|https://issues.apache.org/jira/browse/HIVE-15338?focusedCommentId=15727553=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15727553].
 So it makes sense that we comply with MySQL, as we already treat MySQL as the 
gold standard, and returning NULL for wrong dates in cast is also 
[documented|https://cwiki.apache.org/confluence/display/hive/languagemanual+types#LanguageManualTypes-CastingDates]

So I propose to make NULL the standard response for all parsing errors.

> Move Date and Timestamp parsing from ResolverStyle.LENIENT to 
> ResolverStyle.STRICT
> --
>
> Key: HIVE-25306
> URL: https://issues.apache.org/jira/browse/HIVE-25306
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, UDF
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: DB_compare.JPG
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Description - 
> Currently Date.java and Timestamp.java use DateTimeFormatter to parse and 
> convert dates/timestamps from int, string, char etc. to Date or Timestamp. 
> The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date 
> like "1992-13-12" is converted to "2000-01-12". 
> Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like 
> "1992-13-12" is no longer converted; NULL is returned instead.
> https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT

2021-07-20 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25306:
-
Attachment: DB_compare.JPG

> Move Date and Timestamp parsing from ResolverStyle.LENIENT to 
> ResolverStyle.STRICT
> --
>
> Key: HIVE-25306
> URL: https://issues.apache.org/jira/browse/HIVE-25306
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, UDF
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: DB_compare.JPG
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Description - 
> Currently Date.java and Timestamp.java use DateTimeFormatter to parse and 
> convert dates/timestamps from int, string, char etc. to Date or Timestamp. 
> The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date 
> like "1992-13-12" is converted to "2000-01-12". 
> Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like 
> "1992-13-12" is no longer converted; NULL is returned instead.
> https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25351) stddev(), stddev_pop() with CBO enabled returning null

2021-07-19 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25351:
-
Description: 
*script used to repro*

create table cbo_test (key string, v1 double, v2 decimal(30,2), v3 
decimal(30,2));

insert into cbo_test values ("00140006375905", 10230.72, 
10230.72, 10230.69), ("00140006375905", 10230.72, 10230.72, 
10230.69), ("00140006375905", 10230.72, 10230.72, 10230.69), 
("00140006375905", 10230.72, 10230.72, 10230.69), 
("00140006375905", 10230.72, 10230.72, 10230.69), 
("00140006375905", 10230.72, 10230.72, 10230.69);

select stddev(v1), stddev(v2), stddev(v3) from cbo_test;


*Enable CBO*
++
|  Explain   |
++
| Plan optimized by CBO. |
||
| Vertex dependency in root stage|
| Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)|
||
| Stage-0|
|   Fetch Operator   |
| limit:-1   |
| Stage-1|
|   Reducer 2 vectorized |
|   File Output Operator [FS_13] |
| Select Operator [SEL_12] (rows=1 width=24) |
|   Output:["_col0","_col1","_col2"] |
|   Group By Operator [GBY_11] (rows=1 width=72) |
| 
Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"],aggregations:["sum(VALUE._col0)","sum(VALUE._col1)","count(VALUE._col2)","sum(VALUE._col3)","sum(VALUE._col4)","count(VALUE._col5)","sum(VALUE._col6)","sum(VALUE._col7)","count(VALUE._col8)"]
 |
|   <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized  |
| PARTITION_ONLY_SHUFFLE [RS_10] |
|   Group By Operator [GBY_9] (rows=1 width=72) |
| 
Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"],aggregations:["sum(_col3)","sum(_col0)","count(_col0)","sum(_col5)","sum(_col4)","count(_col1)","sum(_col7)","sum(_col6)","count(_col2)"]
 |
| Select Operator [SEL_8] (rows=6 width=232) |
|   
Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7"] |
|   TableScan [TS_0] (rows=6 width=232) |
| default@cbo_test,cbo_test, ACID 
table,Tbl:COMPLETE,Col:COMPLETE,Output:["v1","v2","v3"] |
||
++

*result* 

_c0 _c1 _c2
0.0 NaN NaN



*Disable CBO*
++
|  Explain   |
++
| Vertex dependency in root stage|
| Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)|
||
| Stage-0|
|   Fetch Operator   |
| limit:-1   |
| Stage-1|
|   Reducer 2 vectorized |
|   File Output Operator [FS_11] |
| Group By Operator [GBY_10] (rows=1 width=24) |
|   
Output:["_col0","_col1","_col2"],aggregations:["stddev(VALUE._col0)","stddev(VALUE._col1)","stddev(VALUE._col2)"]
 |
| <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized|
|   PARTITION_ONLY_SHUFFLE [RS_9]|
| Group By Operator [GBY_8] (rows=1 width=240) |
|   
Output:["_col0","_col1","_col2"],aggregations:["stddev(v1)","stddev(v2)","stddev(v3)"]
 |
|   Select Operator [SEL_7] (rows=6 width=232) |
| Output:["v1","v2","v3"]|
| TableScan [TS_0] (rows=6 width=232) |
|   default@cbo_test,cbo_test, ACID 
table,Tbl:COMPLETE,Col:COMPLETE,Output:["v1","v2","v3"] |
||
++


*result*  

_c0 _c1 _c2
5.42317860890711E-13    5.42317860890711E-13    5.42317860890711E-13
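
A plausible explanation for the 0.0/NaN results, assuming the decomposed CBO plan computes variance with the textbook sum-of-squares rewrite var = (sum(x^2) - sum(x)^2/n) / n (a numerical sketch, not the actual Hive operator code): with six identical inputs both terms are about 6.28e8 and their true difference is 0, so the subtraction is pure cancellation and can come out as rounding noise or even slightly negative, in which case sqrt() yields NaN.

{code:java}
public class CancellationDemo {
    public static void main(String[] args) {
        double v = 10230.72; // the six identical rows from the repro
        int n = 6;
        double sum = 0.0, sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            sum += v;
            sumSq += v * v;
        }
        // Textbook rewrite: var = (sum(x^2) - sum(x)^2 / n) / n.
        double var = (sumSq - sum * sum / n) / n;
        System.out.println(var);            // rounding noise, possibly negative
        System.out.println(Math.sqrt(var)); // 0.0, ~5e-13, or NaN
    }
}
{code}

The non-decomposed stddev aggregator in the disabled-CBO plan evidently avoids the worst of this, returning the tiny 5.42e-13 noise below instead of NaN.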

  was:
*script used to repro*

create table cbo_test (key string, v1 double, v2 decimal(30,2), v3 
decimal(30,2));

insert into cbo_test values ("00140006375905", 10230.72, 
10230.72, 10230.69), ("00140006375905", 10230.72, 10230.72, 
10230.69), ("00140006375905", 10230.72, 10230.72, 10230.69), 
("00140006375905", 10230.72, 

[jira] [Updated] (HIVE-25351) stddev(), stddev_pop() with CBO enabled returning null

2021-07-19 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma updated HIVE-25351:
-
Description: 
*script used to repro*

create table cbo_test (key string, v1 double, v2 decimal(30,2), v3 
decimal(30,2));

insert into cbo_test values ("00140006375905", 10230.72, 
10230.72, 10230.69), ("00140006375905", 10230.72, 10230.72, 
10230.69), ("00140006375905", 10230.72, 10230.72, 10230.69), 
("00140006375905", 10230.72, 10230.72, 10230.69), 
("00140006375905", 10230.72, 10230.72, 10230.69), 
("00140006375905", 10230.72, 10230.72, 10230.69);

select stddev(v1), stddev(v2), stddev(v3) from cbo_test;


*Enable CBO*
++
|  Explain   |
++
| Plan optimized by CBO. |
||
| Vertex dependency in root stage|
| Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)|
||
| Stage-0|
|   Fetch Operator   |
| limit:-1   |
| Stage-1|
|   Reducer 2 vectorized |
|   File Output Operator [FS_13] |
| Select Operator [SEL_12] (rows=1 width=24) |
|   Output:["_col0","_col1","_col2"] |
|   Group By Operator [GBY_11] (rows=1 width=72) |
| 
Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"],aggregations:["sum(VALUE._col0)","sum(VALUE._col1)","count(VALUE._col2)","sum(VALUE._col3)","sum(VALUE._col4)","count(VALUE._col5)","sum(VALUE._col6)","sum(VALUE._col7)","count(VALUE._col8)"]
 |
|   <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized  |
| PARTITION_ONLY_SHUFFLE [RS_10] |
|   Group By Operator [GBY_9] (rows=1 width=72) |
| 
Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"],aggregations:["sum(_col3)","sum(_col0)","count(_col0)","sum(_col5)","sum(_col4)","count(_col1)","sum(_col7)","sum(_col6)","count(_col2)"]
 |
| Select Operator [SEL_8] (rows=6 width=232) |
|   
Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7"] |
|   TableScan [TS_0] (rows=6 width=232) |
| default@cbo_test,cbo_test, ACID 
table,Tbl:COMPLETE,Col:COMPLETE,Output:["v1","v2","v3"] |
||
++

*Query Result* 

_c0 _c1 _c2
0.0 NaN NaN



*Disable CBO*
++
|  Explain   |
++
| Vertex dependency in root stage|
| Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)|
||
| Stage-0|
|   Fetch Operator   |
| limit:-1   |
| Stage-1|
|   Reducer 2 vectorized |
|   File Output Operator [FS_11] |
| Group By Operator [GBY_10] (rows=1 width=24) |
|   
Output:["_col0","_col1","_col2"],aggregations:["stddev(VALUE._col0)","stddev(VALUE._col1)","stddev(VALUE._col2)"]
 |
| <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized|
|   PARTITION_ONLY_SHUFFLE [RS_9]|
| Group By Operator [GBY_8] (rows=1 width=240) |
|   
Output:["_col0","_col1","_col2"],aggregations:["stddev(v1)","stddev(v2)","stddev(v3)"]
 |
|   Select Operator [SEL_7] (rows=6 width=232) |
| Output:["v1","v2","v3"]|
| TableScan [TS_0] (rows=6 width=232) |
|   default@cbo_test,cbo_test, ACID 
table,Tbl:COMPLETE,Col:COMPLETE,Output:["v1","v2","v3"] |
||
++


*Query Result*  

_c0 _c1 _c2
5.42317860890711E-13    5.42317860890711E-13    5.42317860890711E-13

  was:
*script used to repro*

create table cbo_test (key string, v1 double, v2 decimal(30,2), v3 
decimal(30,2));

insert into cbo_test values ("00140006375905", 10230.72, 
10230.72, 10230.69), ("00140006375905", 10230.72, 10230.72, 
10230.69), ("00140006375905", 10230.72, 10230.72, 10230.69), 
("00140006375905", 
