[jira] [Commented] (HIVE-21631) Enhance metastore API to allow bulk-loading materialized views

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823654#comment-16823654
 ] 

Hive QA commented on HIVE-21631:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12966675/HIVE-21631.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 15823 tests 
executed
*Failed tests:*
{noformat}
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestMarkPartition - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestMarkPartitionRemote - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestMetaStoreEventListenerOnlyOnCommit - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestMetaStoreEventListenerWithOldConf - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestMetaStoreInitListener - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestMetaStoreListenersError - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestMetaStoreSchemaInfo - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestObjectStore - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestObjectStoreSchemaMethods - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestRemoteHiveMetaStoreZKBindHost - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestTableIterable - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17008/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17008/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17008/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12966675 - PreCommit-HIVE-Build

> Enhance metastore API to allow bulk-loading materialized views
> --
>
> Key: HIVE-21631
> URL: https://issues.apache.org/jira/browse/HIVE-21631
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Craig Condit
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21631.patch
>
>
> Currently, every query in HS2 results in a metastore call per database to 
> retrieve all materialized views. This causes severe performance degradation 
> on multi-tenant clusters with thousands of databases (very similar to how the 
> old get_function() metastore call didn't scale).
> We should add a metastore call which can retrieve all materialized view 
> definitions at once (for all DBs) so that we don't have to make thousands of 
> metastore calls per query.
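
The difference between the two access patterns in the description can be sketched as follows. This is only an illustration under stated assumptions: FakeMetastore, getMaterializedViews and getAllMaterializedViews are hypothetical names invented for this sketch, not the actual metastore API or the contents of the attached patch. The stand-in simply counts round trips.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BulkViewLoadSketch {

  /** Hypothetical stand-in for a metastore client; it only counts round trips. */
  static class FakeMetastore {
    private final Map<String, List<String>> viewsByDb = new TreeMap<>();
    int calls = 0;

    void addView(String db, String view) {
      viewsByDb.computeIfAbsent(db, k -> new ArrayList<>()).add(view);
    }

    // One round trip to list the databases.
    List<String> getDatabases() {
      calls++;
      return new ArrayList<>(viewsByDb.keySet());
    }

    // Per-database lookup (hypothetical): one round trip per database.
    List<String> getMaterializedViews(String db) {
      calls++;
      return new ArrayList<>(viewsByDb.getOrDefault(db, new ArrayList<>()));
    }

    // Bulk lookup (hypothetical): one round trip covering all databases.
    List<String> getAllMaterializedViews() {
      calls++;
      List<String> all = new ArrayList<>();
      viewsByDb.values().forEach(all::addAll);
      return all;
    }
  }

  // Pattern described as the problem: the number of calls grows with the number of databases.
  static List<String> loadPerDatabase(FakeMetastore ms) {
    List<String> all = new ArrayList<>();
    for (String db : ms.getDatabases()) {
      all.addAll(ms.getMaterializedViews(db));
    }
    return all;
  }

  // Pattern proposed in the issue: a single call regardless of the number of databases.
  static List<String> loadInBulk(FakeMetastore ms) {
    return ms.getAllMaterializedViews();
  }

  public static void main(String[] args) {
    FakeMetastore ms = new FakeMetastore();
    for (int i = 0; i < 1000; i++) {
      ms.addView("db" + i, "mv" + i);
    }
    loadPerDatabase(ms);
    System.out.println("per-database calls: " + ms.calls);
    ms.calls = 0;
    loadInBulk(ms);
    System.out.println("bulk calls: " + ms.calls);
  }
}
```

With thousands of databases, the per-database loop pays one round trip per database on every query, while the bulk variant stays at a single call.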



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21631) Enhance metastore API to allow bulk-loading materialized views

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823646#comment-16823646
 ] 

Hive QA commented on HIVE-21631:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 36s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 13s{color} | {color:blue} standalone-metastore/metastore-server in master has 181 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  1s{color} | {color:blue} ql in master has 2256 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 24s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 23s{color} | {color:red} ql generated 2 new + 2255 unchanged - 1 fixed = 2257 total (was 2256) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m  6s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Exception is caught when Exception is not thrown in org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(List, List, boolean, HiveTxnManager)  At Hive.java:[line 1677] |
|  |  Private method org.apache.hadoop.hive.ql.metadata.Hive.getTableObjects(String, List) is never called  At Hive.java:[lines 1450-1459] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17008/dev-support/hive-personality.sh |
| git revision | master / b58d50c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-17008/yetus/new-findbugs-ql.html |
| modules | C: standalone-metastore/metastore-common standalone-metastore/metastore-server ql itests/hcatalog-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17008/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Enhance metastore API to allow bulk-loading materialized views
> --
>
> Key: HIVE-21631
> URL: https://issues.apache.org/jira/browse/HIVE-21631
> 

[jira] [Updated] (HIVE-21631) Enhance metastore API to allow bulk-loading materialized views

2019-04-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21631:
---
Attachment: HIVE-21631.patch

> Enhance metastore API to allow bulk-loading materialized views
> --
>
> Key: HIVE-21631
> URL: https://issues.apache.org/jira/browse/HIVE-21631
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Craig Condit
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21631.patch
>
>
> Currently, every query in HS2 results in a metastore call per database to 
> retrieve all materialized views. This causes severe performance degradation 
> on multi-tenant clusters with thousands of databases (very similar to how the 
> old get_function() metastore call didn't scale).
> We should add a metastore call which can retrieve all materialized view 
> definitions at once (for all DBs) so that we don't have to make thousands of 
> metastore calls per query.





[jira] [Updated] (HIVE-21631) Enhance metastore API to allow bulk-loading materialized views

2019-04-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21631:
---
Status: Patch Available  (was: In Progress)

> Enhance metastore API to allow bulk-loading materialized views
> --
>
> Key: HIVE-21631
> URL: https://issues.apache.org/jira/browse/HIVE-21631
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.1.1, 3.2.0
>Reporter: Craig Condit
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently, every query in HS2 results in a metastore call per database to 
> retrieve all materialized views. This causes severe performance degradation 
> on multi-tenant clusters with thousands of databases (very similar to how the 
> old get_function() metastore call didn't scale).
> We should add a metastore call which can retrieve all materialized view 
> definitions at once (for all DBs) so that we don't have to make thousands of 
> metastore calls per query.





[jira] [Work started] (HIVE-21631) Enhance metastore API to allow bulk-loading materialized views

2019-04-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-21631 started by Jesus Camacho Rodriguez.
--
> Enhance metastore API to allow bulk-loading materialized views
> --
>
> Key: HIVE-21631
> URL: https://issues.apache.org/jira/browse/HIVE-21631
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, Metastore
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Craig Condit
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently, every query in HS2 results in a metastore call per database to 
> retrieve all materialized views. This causes severe performance degradation 
> on multi-tenant clusters with thousands of databases (very similar to how the 
> old get_function() metastore call didn't scale).
> We should add a metastore call which can retrieve all materialized view 
> definitions at once (for all DBs) so that we don't have to make thousands of 
> metastore calls per query.





[jira] [Updated] (HIVE-21643) Fix Broken support for ISO Time with Zone in Hive UDFs

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-21643:
--
Labels: pull-request-available  (was: )

> Fix Broken support for ISO Time with Zone in Hive UDFs
> --
>
> Key: HIVE-21643
> URL: https://issues.apache.org/jira/browse/HIVE-21643
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.0.0, 3.1.0, 3.1.1
>Reporter: RAJKAMAL
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-21643.1.patch
>
>
> The following UDFs, date_format and to_date, used to support ISO dates with a 
> time zone, but that support has been broken since the Hive 3.x release.
> Example:
> date_format('2017-03-16T00:10:42Z', 'y')
> date_format('2017-03-16T00:10:42+01:00', 'y')
> date_format('2017-03-16T00:10:42-01:00', 'y')
> to_date('2015-04-11T01:30:45Z')
>  
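
For reference, the inputs in the examples above are plain ISO-8601 timestamps carrying either `Z` or a numeric offset. The sketch below shows that the standard java.time API accepts all of them. It is only an illustration of the input format and is not Hive's UDF implementation or the attached patch; the class and method names (IsoZoneParseSketch, format, toDate) are made up for this example, and the sketch assumes the date is taken in the offset given in the string, without any zone conversion.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

public class IsoZoneParseSketch {

  // Parses an ISO-8601 timestamp with 'Z' or a numeric offset and
  // formats it with the given pattern, e.g. "y" for the year.
  static String format(String isoTimestamp, String pattern) {
    OffsetDateTime ts = OffsetDateTime.parse(isoTimestamp);
    return ts.format(DateTimeFormatter.ofPattern(pattern));
  }

  // Returns the calendar date (yyyy-MM-dd) as written in the timestamp's own offset.
  static String toDate(String isoTimestamp) {
    return OffsetDateTime.parse(isoTimestamp).toLocalDate().toString();
  }

  public static void main(String[] args) {
    System.out.println(format("2017-03-16T00:10:42Z", "y"));
    System.out.println(format("2017-03-16T00:10:42+01:00", "y"));
    System.out.println(format("2017-03-16T00:10:42-01:00", "y"));
    System.out.println(toDate("2015-04-11T01:30:45Z"));
  }
}
```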





[jira] [Work logged] (HIVE-21643) Fix Broken support for ISO Time with Zone in Hive UDFs

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21643?focusedWorklogId=230986=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230986
 ]

ASF GitHub Bot logged work on HIVE-21643:
-

Author: ASF GitHub Bot
Created on: 23/Apr/19 01:37
Start Date: 23/Apr/19 01:37
Worklog Time Spent: 10m 
  Work Description: rnatarajan commented on pull request #604: HIVE-21643: 
Fix broken support for zones in ISO Date.
URL: https://github.com/apache/hive/pull/604
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230986)
Time Spent: 10m
Remaining Estimate: 0h

> Fix Broken support for ISO Time with Zone in Hive UDFs
> --
>
> Key: HIVE-21643
> URL: https://issues.apache.org/jira/browse/HIVE-21643
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.0.0, 3.1.0, 3.1.1
>Reporter: RAJKAMAL
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-21643.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following UDFs, date_format and to_date, used to support ISO dates with a 
> time zone, but that support has been broken since the Hive 3.x release.
> Example:
> date_format('2017-03-16T00:10:42Z', 'y')
> date_format('2017-03-16T00:10:42+01:00', 'y')
> date_format('2017-03-16T00:10:42-01:00', 'y')
> to_date('2015-04-11T01:30:45Z')
>  





[jira] [Commented] (HIVE-21643) Fix Broken support for ISO Time with Zone in Hive UDFs

2019-04-22 Thread RAJKAMAL (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823607#comment-16823607
 ] 

RAJKAMAL commented on HIVE-21643:
-

[~jcamachorodriguez] Can you please look at the patch ?

> Fix Broken support for ISO Time with Zone in Hive UDFs
> --
>
> Key: HIVE-21643
> URL: https://issues.apache.org/jira/browse/HIVE-21643
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.0.0, 3.1.0, 3.1.1
>Reporter: RAJKAMAL
>Priority: Minor
> Attachments: HIVE-21643.1.patch
>
>
> The following UDFs, date_format and to_date, used to support ISO dates with a 
> time zone, but that support has been broken since the Hive 3.x release.
> Example:
> date_format('2017-03-16T00:10:42Z', 'y')
> date_format('2017-03-16T00:10:42+01:00', 'y')
> date_format('2017-03-16T00:10:42-01:00', 'y')
> to_date('2015-04-11T01:30:45Z')
>  





[jira] [Updated] (HIVE-21643) Fix Broken support for ISO Time with Zone in Hive UDFs

2019-04-22 Thread RAJKAMAL (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

RAJKAMAL updated HIVE-21643:

Attachment: (was: udf_fix.patch)

> Fix Broken support for ISO Time with Zone in Hive UDFs
> --
>
> Key: HIVE-21643
> URL: https://issues.apache.org/jira/browse/HIVE-21643
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.0.0, 3.1.0, 3.1.1
>Reporter: RAJKAMAL
>Priority: Minor
> Attachments: HIVE-21643.1.patch
>
>
> The following UDFs, date_format and to_date, used to support ISO dates with a 
> time zone, but that support has been broken since the Hive 3.x release.
> Example:
> date_format('2017-03-16T00:10:42Z', 'y')
> date_format('2017-03-16T00:10:42+01:00', 'y')
> date_format('2017-03-16T00:10:42-01:00', 'y')
> to_date('2015-04-11T01:30:45Z')
>  





[jira] [Updated] (HIVE-21643) Fix Broken support for ISO Time with Zone in Hive UDFs

2019-04-22 Thread RAJKAMAL (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

RAJKAMAL updated HIVE-21643:

Attachment: HIVE-21643.1.patch

> Fix Broken support for ISO Time with Zone in Hive UDFs
> --
>
> Key: HIVE-21643
> URL: https://issues.apache.org/jira/browse/HIVE-21643
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.0.0, 3.1.0, 3.1.1
>Reporter: RAJKAMAL
>Priority: Minor
> Attachments: HIVE-21643.1.patch
>
>
> The following UDFs, date_format and to_date, used to support ISO dates with a 
> time zone, but that support has been broken since the Hive 3.x release.
> Example:
> date_format('2017-03-16T00:10:42Z', 'y')
> date_format('2017-03-16T00:10:42+01:00', 'y')
> date_format('2017-03-16T00:10:42-01:00', 'y')
> to_date('2015-04-11T01:30:45Z')
>  





[jira] [Updated] (HIVE-21643) Fix Broken support for ISO Time with Zone in Hive UDFs

2019-04-22 Thread RAJKAMAL (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

RAJKAMAL updated HIVE-21643:

Priority: Minor  (was: Major)

> Fix Broken support for ISO Time with Zone in Hive UDFs
> --
>
> Key: HIVE-21643
> URL: https://issues.apache.org/jira/browse/HIVE-21643
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.0.0, 3.1.0, 3.1.1
>Reporter: RAJKAMAL
>Priority: Minor
> Attachments: udf_fix.patch
>
>
> The following UDFs, date_format and to_date, used to support ISO dates with a 
> time zone, but that support has been broken since the Hive 3.x release.
> Example:
> date_format('2017-03-16T00:10:42Z', 'y')
> date_format('2017-03-16T00:10:42+01:00', 'y')
> date_format('2017-03-16T00:10:42-01:00', 'y')
> to_date('2015-04-11T01:30:45Z')
>  





[jira] [Work logged] (HIVE-21634) Materialized view rewriting over aggregate operators containing with grouping sets

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21634?focusedWorklogId=230968=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230968
 ]

ASF GitHub Bot logged work on HIVE-21634:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 23:55
Start Date: 22/Apr/19 23:55
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on pull request #602: HIVE-21634
URL: https://github.com/apache/hive/pull/602#discussion_r277470769
 
 

 ##
 File path: ql/src/test/queries/clientpositive/perf/mv_query67.q
 ##
 @@ -0,0 +1,154 @@
+set hive.mapred.mode=nonstrict;
+set hive.materializedview.rewriting.time.window=-1;
+
+CREATE MATERIALIZED VIEW `my_materialized_view_n100` AS
+select i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy, 
s_store_id, sum(coalesce(ss_sales_price*ss_quantity,0)) sumsales
+from store_sales, date_dim, store, item
+where  ss_sold_date_sk=d_date_sk
+and ss_item_sk=i_item_sk
+and ss_store_sk = s_store_sk
+and d_month_seq between 1212 and 1212+11
+group by i_category, i_class, i_brand, i_product_name, d_year, d_qoy, 
d_moy,s_store_id;
+
+-- start query 1 in stream 0 using template query67.tpl and seed 1819994127
+explain cbo
+select  *
+from (select i_category
+,i_class
+,i_brand
+,i_product_name
+,d_year
+,d_qoy
+,d_moy
+,s_store_id
+,sumsales
+,rank() over (partition by i_category order by sumsales desc) rk
+  from (select i_category
+  ,i_class
+  ,i_brand
+  ,i_product_name
+  ,d_year
+  ,d_qoy
+  ,d_moy
+  ,s_store_id
+  ,sum(coalesce(ss_sales_price*ss_quantity,0)) sumsales
+from store_sales
+,date_dim
+,store
+,item
+   where  ss_sold_date_sk=d_date_sk
+  and ss_item_sk=i_item_sk
+  and ss_store_sk = s_store_sk
+  and d_month_seq between 1212 and 1212+11
+   group by i_category, i_class, i_brand, i_product_name, d_year, d_qoy, 
d_moy,s_store_id)dw1) dw2
+where rk <= 100
+order by i_category
+,i_class
+,i_brand
+,i_product_name
+,d_year
+,d_qoy
+,d_moy
+,s_store_id
+,sumsales
+,rk
+limit 100;
+
+-- end query 1 in stream 0 using template query67.tpl
+
+explain cbo
+select  *
+from (select i_category
+,i_class
+,i_brand
+,i_product_name
+,d_year
+,d_qoy
+,d_moy
+,s_store_id
+,sumsales
+,rank() over (partition by i_category order by sumsales desc) rk
+  from (select i_category
+  ,i_class
+  ,i_brand
+  ,i_product_name
+  ,d_year
+  ,d_qoy
+  ,d_moy
+  ,s_store_id
+  ,sum(sumsales) sumsales
+from (select i_category
+  ,i_class
+  ,i_brand
+  ,i_product_name
+  ,d_year
+  ,d_qoy
+  ,d_moy
+  ,s_store_id
+  ,sum(coalesce(ss_sales_price*ss_quantity,0)) sumsales
+from store_sales
+  ,date_dim
+  ,store
+  ,item
+where  ss_sold_date_sk=d_date_sk
+  and ss_item_sk=i_item_sk
+  and ss_store_sk = s_store_sk
+  and d_month_seq between 1212 and 1212+11
+group by i_category, i_class, i_brand, i_product_name, d_year, 
d_qoy, d_moy,s_store_id
+) dw0
+   group by rollup(i_category, i_class, i_brand, i_product_name, d_year, 
d_qoy, d_moy,s_store_id))dw1) dw2
+where rk <= 100
+order by i_category
+,i_class
+,i_brand
+,i_product_name
+,d_year
+,d_qoy
+,d_moy
+,s_store_id
+,sumsales
+,rk
+limit 100;
+
+explain cbo
+select  *
+from (select i_category
+,i_class
+,i_brand
+,i_product_name
+,d_year
+,d_qoy
+,d_moy
+,s_store_id
+,sumsales
+,rank() over (partition by i_category order by sumsales desc) rk
+  from (select i_category
+  ,i_class
+  ,i_brand
+  ,i_product_name
+  ,d_year
+  ,d_qoy
+  ,d_moy
+  ,s_store_id
+  ,sum(coalesce(ss_sales_price*ss_quantity,0)) sumsales
+from store_sales
+,date_dim
+,store
+,item
+   where  ss_sold_date_sk=d_date_sk
+  

[jira] [Work logged] (HIVE-21634) Materialized view rewriting over aggregate operators containing with grouping sets

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21634?focusedWorklogId=230967=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230967
 ]

ASF GitHub Bot logged work on HIVE-21634:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 23:55
Start Date: 22/Apr/19 23:55
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on pull request #602: HIVE-21634
URL: https://github.com/apache/hive/pull/602#discussion_r277472371
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateSplitRule.java
 ##
 @@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.optimizer.calcite.rules;
+
+import com.google.common.collect.ImmutableList;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.rel.core.Aggregate;
+import org.apache.calcite.rel.core.Aggregate.Group;
+import org.apache.calcite.rel.core.AggregateCall;
+import org.apache.calcite.sql.SqlAggFunction;
+import org.apache.calcite.tools.RelBuilder;
+import org.apache.calcite.tools.RelBuilderFactory;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelBuilder;
+import org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveAggregate;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveGroupingID;
+
+/**
+ * Rule that matches an aggregate with grouping sets and splits it into an 
aggregate
+ * without grouping sets (bottom) and an aggregate with grouping sets (top).
+ */
+public class HiveAggregateSplitRule extends RelOptRule {
+
+  public static final HiveAggregateSplitRule INSTANCE =
+  new HiveAggregateSplitRule(HiveAggregate.class, 
HiveRelFactories.HIVE_BUILDER);
+
+  private HiveAggregateSplitRule(Class<? extends Aggregate> aggregateClass,
+  RelBuilderFactory relBuilderFactory) {
+super(
+operandJ(aggregateClass, null, agg -> agg.getGroupType() != Group.SIMPLE, any()),
+relBuilderFactory, null);
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+final Aggregate aggregate = call.rel(0);
+final RelBuilder relBuilder = call.builder();
+
+// If any aggregate is distinct, bail out
+// If any aggregate is the grouping id, bail out
+// If any aggregate call has a filter, bail out
+// If any aggregate functions do not support splitting, bail out
+final List<SqlAggFunction> topAggFunctions = new ArrayList<>();
+for (AggregateCall aggregateCall : aggregate.getAggCallList()) {
+  if (aggregateCall.isDistinct()) {
+return;
+  }
+  if (aggregateCall.getAggregation().equals(HiveGroupingID.INSTANCE)) {
+return;
+  }
+  if (aggregateCall.filterArg >= 0) {
+return;
+  }
+  SqlAggFunction aggFunction =
+  HiveRelBuilder.getRollup(aggregateCall.getAggregation());
+  if (aggFunction == null) {
+return;
+  }
+  topAggFunctions.add(aggFunction);
+}
+
+final ImmutableBitSet bottomAggregateGroupSet = aggregate.getGroupSet();
+if (aggregate.getCluster().getMetadataQuery().areColumnsUnique(aggregate.getInput(), bottomAggregateGroupSet)) {
+  // Nothing to do, probably already pushed
+  return;
+}
+
+final ImmutableBitSet topAggregateGroupSet = ImmutableBitSet.range(0, bottomAggregateGroupSet.cardinality());
+
+final Map<Integer, Integer> map = new HashMap<>();
+bottomAggregateGroupSet.forEach(k -> map.put(k, map.size()));
+ImmutableList<ImmutableBitSet> topAggregateGroupSets = ImmutableBitSet.ORDERING.immutableSortedCopy(
+ImmutableBitSet.permute(aggregate.groupSets, map));
+
+final List<AggregateCall> topAggregateCalls = new ArrayList<>();
+for (int i = 0; i < aggregate.getAggCallList().size(); i++) {
+  AggregateCall aggregateCall = aggregate.getAggCallList().get(i);
 
 Review comment:
   Instead of looping over agg call list again I believe you can do this in the 
first 

[jira] [Commented] (HIVE-21642) Hive server leaks memory on data insertion

2019-04-22 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823524#comment-16823524
 ] 

Gopal V commented on HIVE-21642:


This no longer leaks memory on Hive master; I recommend opening a ticket 
with EMR for 2.3.x branch fixes.

> Hive server leaks memory on data insertion
> --
>
> Key: HIVE-21642
> URL: https://issues.apache.org/jira/browse/HIVE-21642
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.4
> Environment: * Amazon Hadoop Distribution emr-5.20.0
>  * Master mode with 4 CPU and 16 GB RAM each
>  * Table files stored in S3 cloud storage
>Reporter: Alexander Knopov
>Priority: Major
>
> We are continuously loading data into a Hive table stored in ORC format by 
> appending data in batches. We have repeatedly seen that, over a span of a few 
> days, the Hive server experiences {{OutOfMemoryError}} exceptions that we believe 
> are caused by memory leaks.
> Comparing heap dumps shows that the most suspicious classes, which show 
> persistent growth and are not reclaimed by GC, are
>  * {{org.apache.hadoop.hive.ql.io.orc.OrcStruct$Field}}
>  * {{org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField}}
>  * {{String}}
> The sample program used for the stress test and heap dumps from 700 to 2500 GB 
> can be uploaded on request. They are too big for the Jira backing store.





[jira] [Comment Edited] (HIVE-21642) Hive server leaks memory on data insertion

2019-04-22 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823524#comment-16823524
 ] 

Gopal V edited comment on HIVE-21642 at 4/22/19 10:38 PM:
--

This no longer leaks memory on Hive master (~5 min expiry); I recommend 
opening a ticket with EMR for 2.3.x branch fixes.


was (Author: gopalv):
This no longer leaks memory on Hive master; I recommend opening a ticket 
with EMR for 2.3.x branch fixes.

> Hive server leaks memory on data insertion
> --
>
> Key: HIVE-21642
> URL: https://issues.apache.org/jira/browse/HIVE-21642
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.4
> Environment: * Amazon Hadoop Distribution emr-5.20.0
>  * Master mode with 4 CPU and 16 GB RAM each
>  * Table files stored in S3 cloud storage
>Reporter: Alexander Knopov
>Priority: Major
>
> We are continuously loading data into a Hive table stored in ORC format by 
> appending data in batches. We have repeatedly seen that, over a span of a few 
> days, the Hive server experiences {{OutOfMemoryError}} exceptions that we believe 
> are caused by memory leaks.
> Comparing heap dumps shows that the most suspicious classes, which show 
> persistent growth and are not reclaimed by GC, are
>  * {{org.apache.hadoop.hive.ql.io.orc.OrcStruct$Field}}
>  * {{org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField}}
>  * {{String}}
> The sample program used for the stress test and heap dumps from 700 to 2500 GB 
> can be uploaded on request. They are too big for the Jira backing store.





[jira] [Commented] (HIVE-21506) Memory based TxnHandler implementation

2019-04-22 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823504#comment-16823504
 ] 

Todd Lipcon commented on HIVE-21506:


bq.  My understanding is that we are not yet blocked by the concurrency checks 
when acquiring locks, but the bottleneck is simply the number of HMS/RDBMS 
calls implementing that.

Agreed with that, and the general idea that we should understand the workload. 
That said, I don't know that we need a specific workload to agree on the 
central observation that most queries against Hive are read-only, given our 
focus on warehousing and datamart applications (Hive isn't an OLTP database by 
any stretch). I did a spot check on the ratio of DML to read-only queries in 
some customer profile datasets I have, and they range from a 300:1 ratio for 
some customers down to about a 1:1 ratio. Average is 7:1. 

> Memory based TxnHandler implementation
> --
>
> Key: HIVE-21506
> URL: https://issues.apache.org/jira/browse/HIVE-21506
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Peter Vary
>Priority: Major
>
> The current TxnHandler implementations use the backend RDBMS to store 
> all Hive lock and transaction data, so multiple TxnHandler instances can 
> run simultaneously and serve requests. The continuous 
> communication/locking done on the RDBMS side puts serious load on the backend 
> databases and also restricts the possible throughput.
> If it is possible to have only a single active TxnHandler instance (with the 
> current design, HMS), then we can provide much better performance by using 
> only Java-based locking. We still have to store the committed write 
> transactions in the RDBMS (or later some other persistent storage), but other 
> lock and transaction operations could remain memory only.
> The most important drawbacks of this solution are that we definitely lose 
> scalability when one instance of TxnHandler is no longer able to serve the 
> requests (see NameNode), and fault tolerance, in the sense that ongoing 
> transactions must be terminated when the TxnHandler fails. If these 
> drawbacks are acceptable in certain situations, then we can provide better 
> throughput for the users.
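
The idea in the description above, keeping lock state in process memory instead of in the RDBMS, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class `InMemoryLockTable`, its method names, and the shared/exclusive model are hypothetical and are not Hive's TxnHandler API. Persisting committed write transactions, which the description still requires, is left out.

```python
import threading
from collections import defaultdict


class InMemoryLockTable:
    """Hypothetical single-process lock table (not Hive's TxnHandler API)."""

    def __init__(self):
        self._mutex = threading.Lock()    # guards the two maps below
        self._shared = defaultdict(set)   # resource -> txn ids holding shared locks
        self._exclusive = {}              # resource -> txn id holding the exclusive lock

    def try_lock(self, txn_id, resource, exclusive=False):
        """Return True if the lock is granted, False on conflict (never blocks)."""
        with self._mutex:
            owner = self._exclusive.get(resource)
            if owner is not None and owner != txn_id:
                return False              # another txn holds the exclusive lock
            if exclusive:
                if self._shared[resource] - {txn_id}:
                    return False          # other readers are still active
                self._exclusive[resource] = txn_id
            else:
                self._shared[resource].add(txn_id)
            return True

    def release_all(self, txn_id):
        """Drop every lock held by txn_id, e.g. on commit or abort."""
        with self._mutex:
            for holders in self._shared.values():
                holders.discard(txn_id)
            for res in [r for r, o in self._exclusive.items() if o == txn_id]:
                del self._exclusive[res]
```

Because all state lives in one process, a restart loses every lock, which matches the fault-tolerance drawback named in the description.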





[jira] [Commented] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823490#comment-16823490
 ] 

Hive QA commented on HIVE-21240:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12966646/HIVE-21240.12.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 15833 tests 
executed
*Failed tests:*
{noformat}
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMarkPartition - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMarkPartitionRemote - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMetaStoreEventListenerOnlyOnCommit - did not produce a TEST-*.xml file 
(likely timed out) (batchId=231)
TestMetaStoreEventListenerWithOldConf - did not produce a TEST-*.xml file 
(likely timed out) (batchId=231)
TestMetaStoreInitListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestMetaStoreListenersError - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestMetaStoreSchemaInfo - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestObjectStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestObjectStoreSchemaMethods - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestRemoteHiveMetaStoreZKBindHost - did not produce a TEST-*.xml file (likely 
timed out) (batchId=231)
TestTableIterable - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17007/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17007/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17007/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12966646 - PreCommit-HIVE-Build

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.12.patch, 
> HIVE-21240.12.patch, HIVE-21240.2.patch, HIVE-21240.3.patch, 
> HIVE-21240.4.patch, HIVE-21240.5.patch, HIVE-21240.6.patch, 
> HIVE-21240.7.patch, HIVE-21240.9.patch, HIVE-24240.8.patch, 
> kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row
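
Three of the items in the list above (the column-name cache, blank-line handling, and base-64 data) can be sketched briefly. This is a rough illustration under stated assumptions, not the patch itself: `RowReader` and `decode_binary` are hypothetical names, and Python's standard `json` module stands in for the Jackson tree parser.

```python
import base64
import json


class RowReader:
    """Hypothetical reader; illustrative only, not the Hive JsonSerDe API."""

    def __init__(self, column_names):
        self.column_names = list(column_names)
        # Build the name -> index map once, instead of scanning the
        # column list for every field of every row.
        self._index = {name: i for i, name in enumerate(self.column_names)}

    def read(self, line):
        row = [None] * len(self.column_names)
        if not line.strip():
            return row                    # blank line: all columns are null
        for key, value in json.loads(line).items():
            i = self._index.get(key)      # average O(1) lookup
            if i is not None:             # fields without a column are ignored
                row[i] = value
        return row


def decode_binary(value):
    """Decode a base-64 encoded string field into raw bytes."""
    return base64.b64decode(value)
```

With n columns, the dictionary replaces an O(n) scan per field with a constant-time lookup on average, which is the saving the last bullet describes.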





[jira] [Commented] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823459#comment-16823459
 ] 

Hive QA commented on HIVE-21240:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
55s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} serde in master has 197 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
4s{color} | {color:blue} ql in master has 2256 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} hcatalog/core in master has 28 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
32s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} serde: The patch generated 0 new + 4 unchanged - 25 
fixed = 4 total (was 29) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} ql: The patch generated 0 new + 6 unchanged - 5 
fixed = 6 total (was 11) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} The patch core passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} serde generated 0 new + 193 unchanged - 4 fixed = 
193 total (was 197) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
19s{color} | {color:green} ql in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17007/dev-support/hive-personality.sh
 |
| git revision | master / b58d50c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: serde ql hcatalog/core U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17007/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
> 

[jira] [Commented] (HIVE-21619) Print timestamp type without precision in SQL explain extended

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823411#comment-16823411
 ] 

Hive QA commented on HIVE-21619:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12966643/HIVE-21619.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 15823 tests 
executed
*Failed tests:*
{noformat}
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMarkPartition - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMarkPartitionRemote - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMetaStoreEventListenerOnlyOnCommit - did not produce a TEST-*.xml file 
(likely timed out) (batchId=231)
TestMetaStoreEventListenerWithOldConf - did not produce a TEST-*.xml file 
(likely timed out) (batchId=231)
TestMetaStoreInitListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestMetaStoreListenersError - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestMetaStoreSchemaInfo - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestObjectStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestObjectStoreSchemaMethods - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestRemoteHiveMetaStoreZKBindHost - did not produce a TEST-*.xml file (likely 
timed out) (batchId=231)
TestTableIterable - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17006/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17006/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17006/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12966643 - PreCommit-HIVE-Build

> Print timestamp type without precision in SQL explain extended
> --
>
> Key: HIVE-21619
> URL: https://issues.apache.org/jira/browse/HIVE-21619
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21619.01.patch, HIVE-21619.01.patch, 
> HIVE-21619.01.patch, HIVE-21619.01.patch, HIVE-21619.01.patch, 
> HIVE-21619.02.patch, HIVE-21619.patch
>
>
> Hive dialect should print timestamp type without precision in generated SQL, 
> since currently Hive does not support user-defined precision.





[jira] [Commented] (HIVE-21619) Print timestamp type without precision in SQL explain extended

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823364#comment-16823364
 ] 

Hive QA commented on HIVE-21619:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
8s{color} | {color:blue} ql in master has 2256 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17006/dev-support/hive-personality.sh
 |
| git revision | master / b58d50c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17006/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Print timestamp type without precision in SQL explain extended
> --
>
> Key: HIVE-21619
> URL: https://issues.apache.org/jira/browse/HIVE-21619
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21619.01.patch, HIVE-21619.01.patch, 
> HIVE-21619.01.patch, HIVE-21619.01.patch, HIVE-21619.01.patch, 
> HIVE-21619.02.patch, HIVE-21619.patch
>
>
> Hive dialect should print timestamp type without precision in generated SQL, 
> since currently Hive does not support user-defined precision.





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread David Mollitor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21240:
--
Status: Open  (was: Patch Available)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1, 4.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.12.patch, 
> HIVE-21240.12.patch, HIVE-21240.2.patch, HIVE-21240.3.patch, 
> HIVE-21240.4.patch, HIVE-21240.5.patch, HIVE-21240.6.patch, 
> HIVE-21240.7.patch, HIVE-21240.9.patch, HIVE-24240.8.patch, 
> kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread David Mollitor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21240:
--
Status: Patch Available  (was: Open)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1, 4.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.12.patch, 
> HIVE-21240.12.patch, HIVE-21240.2.patch, HIVE-21240.3.patch, 
> HIVE-21240.4.patch, HIVE-21240.5.patch, HIVE-21240.6.patch, 
> HIVE-21240.7.patch, HIVE-21240.9.patch, HIVE-24240.8.patch, 
> kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread David Mollitor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21240:
--
Attachment: HIVE-21240.12.patch

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.12.patch, 
> HIVE-21240.12.patch, HIVE-21240.2.patch, HIVE-21240.3.patch, 
> HIVE-21240.4.patch, HIVE-21240.5.patch, HIVE-21240.6.patch, 
> HIVE-21240.7.patch, HIVE-21240.9.patch, HIVE-24240.8.patch, 
> kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21619) Print timestamp type without precision in SQL explain extended

2019-04-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21619:
---
Attachment: HIVE-21619.02.patch

> Print timestamp type without precision in SQL explain extended
> --
>
> Key: HIVE-21619
> URL: https://issues.apache.org/jira/browse/HIVE-21619
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21619.01.patch, HIVE-21619.01.patch, 
> HIVE-21619.01.patch, HIVE-21619.01.patch, HIVE-21619.01.patch, 
> HIVE-21619.02.patch, HIVE-21619.patch
>
>
> Hive dialect should print timestamp type without precision in generated SQL, 
> since currently Hive does not support user-defined precision.





[jira] [Commented] (HIVE-21633) Estimate range for value generated by aggregate function in statistics annotation

2019-04-22 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823324#comment-16823324
 ] 

Jesus Camacho Rodriguez commented on HIVE-21633:


[~gopalv], [~kgyrtkirk], can you review this PR?
https://github.com/apache/hive/pull/603

Thanks

> Estimate range for value generated by aggregate function in statistics 
> annotation
> -
>
> Key: HIVE-21633
> URL: https://issues.apache.org/jira/browse/HIVE-21633
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21633.02.patch, HIVE-21633.03.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some cases, we can infer the estimate of the range for a value generated 
> by an aggregate function during statistics annotation. For instance, we can 
> estimate the min of the sum of a column with positive min value as that same 
> min value.





[jira] [Updated] (HIVE-21633) Estimate range for value generated by aggregate function in statistics annotation

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-21633:
--
Labels: pull-request-available  (was: )

> Estimate range for value generated by aggregate function in statistics 
> annotation
> -
>
> Key: HIVE-21633
> URL: https://issues.apache.org/jira/browse/HIVE-21633
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21633.02.patch, HIVE-21633.03.patch
>
>
> In some cases, we can infer the estimate of the range for a value generated 
> by an aggregate function during statistics annotation. For instance, we can 
> estimate the min of the sum of a column with positive min value as that same 
> min value.





[jira] [Work logged] (HIVE-21633) Estimate range for value generated by aggregate function in statistics annotation

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21633?focusedWorklogId=230823=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230823
 ]

ASF GitHub Bot logged work on HIVE-21633:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 18:28
Start Date: 22/Apr/19 18:28
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #603: HIVE-21633
URL: https://github.com/apache/hive/pull/603
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230823)
Time Spent: 10m
Remaining Estimate: 0h

> Estimate range for value generated by aggregate function in statistics 
> annotation
> -
>
> Key: HIVE-21633
> URL: https://issues.apache.org/jira/browse/HIVE-21633
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21633.02.patch, HIVE-21633.03.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some cases, we can infer the estimate of the range for a value generated 
> by an aggregate function during statistics annotation. For instance, we can 
> estimate the min of the sum of a column with positive min value as that same 
> min value.





[jira] [Work logged] (HIVE-21634) Materialized view rewriting over aggregate operators containing with grouping sets

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21634?focusedWorklogId=230821=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230821
 ]

ASF GitHub Bot logged work on HIVE-21634:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 18:26
Start Date: 22/Apr/19 18:26
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #602: HIVE-21634
URL: https://github.com/apache/hive/pull/602
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 230821)
Time Spent: 10m
Remaining Estimate: 0h

> Materialized view rewriting over aggregate operators containing with grouping 
> sets
> --
>
> Key: HIVE-21634
> URL: https://issues.apache.org/jira/browse/HIVE-21634
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21634.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A possible approach to support rewriting queries with an aggregate with 
> grouping sets is implementing a rule that splits the aggregate in the query 
> into an aggregate without grouping sets (bottom) and an aggregate with 
> grouping sets (top). Then the materialized view rewriting rule will trigger 
> on the former.
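
The split described above can be checked on toy data. The sketch below is an illustration under assumptions, not the planner rule: rows are plain dictionaries, the helper names are hypothetical, and only SUM is covered. It shows that aggregating with grouping sets on top of a plain GROUP BY over all keys gives the same result as applying the grouping sets to the raw rows.

```python
from collections import defaultdict


def aggregate(rows, keys, value):
    """SUM(value) GROUP BY keys, with rows given as dicts."""
    sums = defaultdict(int)
    for row in rows:
        sums[tuple(row[k] for k in keys)] += row[value]
    return [dict(zip(keys, k), **{value: v}) for k, v in sums.items()]


def aggregate_grouping_sets(rows, all_keys, grouping_sets, value):
    """SUM(value) per grouping set; keys outside the set are reported as None."""
    result = []
    for grouping_set in grouping_sets:
        for agg_row in aggregate(rows, grouping_set, value):
            out = {k: agg_row.get(k) for k in all_keys}
            out[value] = agg_row[value]
            result.append(out)
    return result


rows = [
    {"a": "x", "b": 1, "amt": 10},
    {"a": "x", "b": 2, "amt": 5},
    {"a": "y", "b": 1, "amt": 7},
    {"a": "x", "b": 1, "amt": 3},
]
keys = ["a", "b"]
sets = [("a", "b"), ("a",), ()]

direct = aggregate_grouping_sets(rows, keys, sets, "amt")
bottom = aggregate(rows, keys, "amt")                      # no grouping sets
split = aggregate_grouping_sets(bottom, keys, sets, "amt")  # grouping sets on top
```

The bottom aggregate is an ordinary GROUP BY, which is the part the description expects the materialized view rewriting rule to match. The equivalence holds here because SUM can be re-aggregated from partial sums; a function such as COUNT would need its partial counts summed in the top aggregate instead.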





[jira] [Commented] (HIVE-21634) Materialized view rewriting over aggregate operators containing with grouping sets

2019-04-22 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823322#comment-16823322
 ] 

Jesus Camacho Rodriguez commented on HIVE-21634:


[~ashutoshc], [~vgarg], can you review this PR?
https://github.com/apache/hive/pull/602

Thanks

> Materialized view rewriting over aggregate operators containing with grouping 
> sets
> --
>
> Key: HIVE-21634
> URL: https://issues.apache.org/jira/browse/HIVE-21634
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21634.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A possible approach to support rewriting queries with an aggregate with 
> grouping sets is implementing a rule that splits the aggregate in the query 
> into an aggregate without grouping sets (bottom) and an aggregate with 
> grouping sets (top). Then the materialized view rewriting rule will trigger 
> on the former.





[jira] [Updated] (HIVE-21634) Materialized view rewriting over aggregate operators containing with grouping sets

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-21634:
--
Labels: pull-request-available  (was: )

> Materialized view rewriting over aggregate operators containing with grouping 
> sets
> --
>
> Key: HIVE-21634
> URL: https://issues.apache.org/jira/browse/HIVE-21634
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21634.patch
>
>
> A possible approach to support rewriting queries with an aggregate with 
> grouping sets is implementing a rule that splits the aggregate in the query 
> into an aggregate without grouping sets (bottom) and an aggregate with 
> grouping sets (top). Then the materialized view rewriting rule will trigger 
> on the former.





[jira] [Commented] (HIVE-21641) Llap external client returns decimal columns in different precision/scale as compared to beeline

2019-04-22 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823238#comment-16823238
 ] 

Gopal V commented on HIVE-21641:


[~ShubhamChaurasia]: is there some way to add an inline test for this?

> Llap external client returns decimal columns in different precision/scale as 
> compared to beeline
> 
>
> Key: HIVE-21641
> URL: https://issues.apache.org/jira/browse/HIVE-21641
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21641.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The LLAP external client returns a different precision/scale than when the 
> query is executed through beeline. Consider the following results:
> Query:
> {code} 
> select avg(ss_ext_sales_price) my_avg from store_sales;
> {code} 
> Result from Beeline
> {code} 
> +----------------------------+
> |           my_avg           |
> +----------------------------+
> | 37.8923531030581611189434  |
> +----------------------------+
> {code} 
> Result from Llap external client
> {code}
> +---------+
> |   my_avg|
> +---------+
> |37.892353|
> +---------+
> {code}
>  
> This happens because the Driver (beeline path) calls 
> [analyzeInternal()|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L328]
>  to get the result set schema, which initializes 
> [resultSchema|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L333]
>  after some additional transformations, whereas llap-ext-client calls 
> [genLogicalPlan()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L561].
> Replacing {{genLogicalPlan()}} with {{analyze()}} resolves this.
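
As a rough illustration of why the declared scale matters, the sketch below renders the same decimal value under two different scales. It does not use any Hive code: the class and method names are hypothetical, and the HALF_UP rounding mode is an assumption, since the issue does not say how the client shortens the value.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Illustrative sketch only; not Hive's result-schema logic. */
final class DecimalScaleSketch {

  /** Renders a value the way a result schema declaring the given scale might. */
  static String render(BigDecimal value, int declaredScale) {
    // Assumption: the value is rounded HALF_UP to the declared scale.
    return value.setScale(declaredScale, RoundingMode.HALF_UP).toPlainString();
  }
}
```

With the value from the report, `render(new BigDecimal("37.8923531030581611189434"), 6)` yields `37.892353`, while a declared scale of 22 keeps all the digits shown by beeline.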





[jira] [Comment Edited] (HIVE-21637) Synchronized metastore cache

2019-04-22 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822643#comment-16822643
 ] 

Daniel Dai edited comment on HIVE-21637 at 4/22/19 4:44 PM:


HIVE-21637-1.patch is not the full patch. It implements only some of the flows:
1. create table
2. describe table
3. commit transaction
4. prewarm
5. event based cache update


was (Author: daijy):
HIVE-21637-1.patch is not the full patch. It implements only some of the flows:
1. create table
2. get table
3. commit transaction
4. prewarm
5. event based cache update

> Synchronized metastore cache
> 
>
> Key: HIVE-21637
> URL: https://issues.apache.org/jira/browse/HIVE-21637
> Project: Hive
>  Issue Type: New Feature
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-21637-1.patch
>
>
> Currently, HMS has a cache implemented by CachedStore. The cache is 
> asynchronous, so in an HMS HA setup we only get eventual consistency. 
> This Jira aims to make it synchronized.





[jira] [Commented] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823196#comment-16823196
 ] 

Hive QA commented on HIVE-21240:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12966630/HIVE-21240.12.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 15833 tests 
executed
*Failed tests:*
{noformat}
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMarkPartition - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMarkPartitionRemote - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMetaStoreEventListenerOnlyOnCommit - did not produce a TEST-*.xml file 
(likely timed out) (batchId=231)
TestMetaStoreEventListenerWithOldConf - did not produce a TEST-*.xml file 
(likely timed out) (batchId=231)
TestMetaStoreInitListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestMetaStoreListenersError - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestMetaStoreSchemaInfo - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestObjectStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestObjectStoreSchemaMethods - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestRemoteHiveMetaStoreZKBindHost - did not produce a TEST-*.xml file (likely 
timed out) (batchId=231)
TestTableIterable - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17005/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17005/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17005/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12966630 - PreCommit-HIVE-Build

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.12.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch, kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row
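
The last bullet can be sketched as follows. This is not the code from the patch: the class and method names are hypothetical, and matching column names case-insensitively is an assumption made for the example. The first lookup of a name does the linear scan; later lookups of the same name are served from a map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the HIVE-21240 implementation. */
final class ColumnIndexCacheSketch {
  private final List<String> columnNames;
  private final Map<String, Integer> indexByName = new HashMap<>();

  ColumnIndexCacheSketch(List<String> columnNames) {
    this.columnNames = columnNames;
  }

  /** Linear scan over the column names: O(n) per lookup, -1 if absent. */
  int linearIndexOf(String name) {
    for (int i = 0; i < columnNames.size(); i++) {
      if (columnNames.get(i).equalsIgnoreCase(name)) {
        return i;
      }
    }
    return -1;
  }

  /** Cached lookup: scans once per distinct name, then answers from the map. */
  int cachedIndexOf(String name) {
    return indexByName.computeIfAbsent(name, this::linearIndexOf);
  }
}
```

Because the same field names recur on every row, the scan cost is paid once per distinct spelling of a name rather than once per field per row.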





[jira] [Updated] (HIVE-21612) Upgrade druid to 0.14.0-incubating

2019-04-22 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-21612:

Status: Patch Available  (was: Open)

> Upgrade druid to 0.14.0-incubating
> --
>
> Key: HIVE-21612
> URL: https://issues.apache.org/jira/browse/HIVE-21612
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21612.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Druid 0.14.0-incubating is released. 
> This task is to upgrade hive to use 0.14.0-incubating version of druid. 





[jira] [Updated] (HIVE-21612) Upgrade druid to 0.14.0-incubating

2019-04-22 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-21612:

Status: Open  (was: Patch Available)

> Upgrade druid to 0.14.0-incubating
> --
>
> Key: HIVE-21612
> URL: https://issues.apache.org/jira/browse/HIVE-21612
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21612.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Druid 0.14.0-incubating is released. 
> This task is to upgrade hive to use 0.14.0-incubating version of druid. 





[jira] [Commented] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823175#comment-16823175
 ] 

Hive QA commented on HIVE-21240:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} serde in master has 197 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
5s{color} | {color:blue} ql in master has 2256 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} hcatalog/core in master has 28 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} serde: The patch generated 0 new + 4 unchanged - 25 
fixed = 4 total (was 29) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} ql: The patch generated 0 new + 6 unchanged - 5 
fixed = 6 total (was 11) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} The patch core passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} serde generated 0 new + 193 unchanged - 4 fixed = 
193 total (was 197) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
26s{color} | {color:green} ql in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17005/dev-support/hive-personality.sh
 |
| git revision | master / b58d50c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: serde ql hcatalog/core U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17005/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
> 

[jira] [Commented] (HIVE-21570) Convert llap iomem servlets output to json format

2019-04-22 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823144#comment-16823144
 ] 

slim bouguerra commented on HIVE-21570:
---

[~asinkovits] Thanks for doing this, and sorry for the late response. Can you please 
submit a pull request where I can add comments?
 Here are three high-level comments:
 * There is no need to wrap numeric values with String.valueOf; the JSON writer 
can handle them, and it keeps the typing correct.
 * A lot of the HashMap creation could perhaps be avoided by using EnumMaps, since 
we know all the keys up front.
 * Documentation for the query parameter is missing, and probably some 
tests too. Can you please provide examples of how it might be used?
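
The first two review points can be sketched roughly as below. This is not the servlet code from the patch: the metric names and the class are hypothetical, and the hand-written JSON output only stands in for whatever JSON writer the servlet uses, so that the example has no dependencies. It shows an EnumMap keyed by a fixed set of names and numeric values written as JSON numbers rather than strings.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; metric names are hypothetical. */
final class EnumMapStatsSketch {

  /** All keys are known up front, so an enum can describe them. */
  enum Stat { USED_BYTES, MAX_BYTES, EVICTIONS }

  /** Collects the values in an EnumMap instead of a HashMap with string keys. */
  static Map<Stat, Long> snapshot(long usedBytes, long maxBytes, long evictions) {
    Map<Stat, Long> stats = new EnumMap<>(Stat.class);
    stats.put(Stat.USED_BYTES, usedBytes);
    stats.put(Stat.MAX_BYTES, maxBytes);
    stats.put(Stat.EVICTIONS, evictions);
    return stats;
  }

  /** Writes the values as JSON numbers (no quotes), keeping their type. */
  static String toJson(Map<Stat, Long> stats) {
    StringBuilder sb = new StringBuilder("{");
    for (Map.Entry<Stat, Long> e : stats.entrySet()) {
      if (sb.length() > 1) {
        sb.append(',');
      }
      sb.append('"').append(e.getKey().name().toLowerCase(Locale.ROOT)).append("\":");
      sb.append(e.getValue());
    }
    return sb.append('}').toString();
  }
}
```

An EnumMap iterates in the order the enum constants are declared, so the output field order is stable without extra sorting.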

Thanks

> Convert llap iomem servlets output to json format
> -
>
> Key: HIVE-21570
> URL: https://issues.apache.org/jira/browse/HIVE-21570
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Minor
> Attachments: HIVE-21570.01.patch, HIVE-21570.02.patch, 
> HIVE-21570.03.patch, HIVE-21570.04.patch
>
>






[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread David Mollitor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21240:
--
Status: Patch Available  (was: Open)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1, 4.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.12.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch, kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row
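
For the base-64 bullet, a minimal sketch of reading and writing a binary column as base-64 text is shown below. It uses only `java.util.Base64` from the JDK; the class and method names are hypothetical and do not come from the patch.

```java
import java.util.Base64;

/** Illustrative sketch only; not the HIVE-21240 implementation. */
final class Base64FieldSketch {

  /** Decodes the text of a JSON string field into the bytes of a binary column. */
  static byte[] decodeBinaryField(String jsonText) {
    return Base64.getDecoder().decode(jsonText);
  }

  /** Encodes the bytes of a binary column as text suitable for a JSON string. */
  static String encodeBinaryField(byte[] value) {
    return Base64.getEncoder().encodeToString(value);
  }
}
```

For example, the JSON text `aGl2ZQ==` decodes to the UTF-8 bytes of `hive`, and encoding those bytes gives the same text back.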





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread David Mollitor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21240:
--
Attachment: HIVE-21240.12.patch

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.12.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch, kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread David Mollitor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21240:
--
Attachment: (was: HIVE-21240.11.patch)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.12.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch, kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread David Mollitor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21240:
--
Status: Open  (was: Patch Available)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.1.1, 4.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.12.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch, kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread David Mollitor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21240:
--
Attachment: (was: HIVE-21240.11.patch)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.11.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch, kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Updated] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread David Mollitor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21240:
--
Attachment: (was: HIVE-21240.11.patch)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.11.patch, 
> HIVE-21240.2.patch, HIVE-21240.3.patch, HIVE-21240.4.patch, 
> HIVE-21240.5.patch, HIVE-21240.6.patch, HIVE-21240.7.patch, 
> HIVE-21240.9.patch, HIVE-24240.8.patch, kafka_storage_handler.diff
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row





[jira] [Commented] (HIVE-21641) Llap external client returns decimal columns in different precision/scale as compared to beeline

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823127#comment-16823127
 ] 

Hive QA commented on HIVE-21641:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12966620/HIVE-21641.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 15823 tests 
executed
*Failed tests:*
{noformat}
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMarkPartition - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMarkPartitionRemote - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestMetaStoreEventListenerOnlyOnCommit - did not produce a TEST-*.xml file 
(likely timed out) (batchId=231)
TestMetaStoreEventListenerWithOldConf - did not produce a TEST-*.xml file 
(likely timed out) (batchId=231)
TestMetaStoreInitListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestMetaStoreListenersError - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestMetaStoreSchemaInfo - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestObjectStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestObjectStoreSchemaMethods - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestRemoteHiveMetaStoreZKBindHost - did not produce a TEST-*.xml file (likely 
timed out) (batchId=231)
TestTableIterable - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=87)
org.apache.hive.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable[2]
 (batchId=210)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17004/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17004/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17004/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12966620 - PreCommit-HIVE-Build

> Llap external client returns decimal columns in different precision/scale as 
> compared to beeline
> 
>
> Key: HIVE-21641
> URL: https://issues.apache.org/jira/browse/HIVE-21641
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21641.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The LLAP external client returns a different precision/scale than when the 
> query is executed through beeline. Consider the following results:
> Query:
> {code} 
> select avg(ss_ext_sales_price) my_avg from store_sales;
> {code} 
> Result from Beeline
> {code} 
> +----------------------------+
> |           my_avg           |
> +----------------------------+
> | 37.8923531030581611189434  |
> +----------------------------+
> {code} 
> Result from Llap external client
> {code}
> +---------+
> |   my_avg|
> +---------+
> |37.892353|
> +---------+
> {code}
>  
> This happens because the Driver (beeline path) calls 
> [analyzeInternal()|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L328]
>  to get the result set schema, which initializes 
> [resultSchema|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L333]
>  after some additional transformations, whereas llap-ext-client calls 
> [genLogicalPlan()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L561].
> Replacing 

[jira] [Work logged] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?focusedWorklogId=230711=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230711
 ]

ASF GitHub Bot logged work on HIVE-21240:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 13:59
Start Date: 22/Apr/19 13:59
Worklog Time Spent: 10m 
  Work Description: BELUGABEHR commented on pull request #530: HIVE-21240: 
JSON SerDe Deserialize Re-Write
URL: https://github.com/apache/hive/pull/530#discussion_r277298161
 
 

 ##
 File path: 
serde/src/java/org/apache/hadoop/hive/serde2/json/HiveJsonReader.java
 ##
 @@ -0,0 +1,532 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.serde2.json;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+import java.time.ZoneId;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.EnumSet;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Map.Entry;
+import java.util.Set;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+import org.apache.commons.lang3.tuple.ImmutablePair;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.hadoop.hive.common.type.Date;
+import org.apache.hadoop.hive.common.type.HiveChar;
+import org.apache.hadoop.hive.common.type.HiveDecimal;
+import org.apache.hadoop.hive.common.type.HiveVarchar;
+import org.apache.hadoop.hive.common.type.Timestamp;
+import org.apache.hadoop.hive.common.type.TimestampTZ;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.MapObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+import org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TimestampLocalTZTypeInfo;
+import org.apache.hive.common.util.TimestampParser;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.databind.node.JsonNodeType;
+import com.fasterxml.jackson.databind.node.TextNode;
+import com.google.common.base.Preconditions;
+
+/**
+ * This class converts JSON strings into Java or Hive Primitive objects.
+ *
+ * Support types are:
+ * 
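+ * <table>
+ * <tr>
+ * <th>JSON Type</th>
+ * <th>Java Type</th>
+ * <th>Notes</th>
+ * </tr>
+ * <tr>
+ * <td>Object</td>
+ * <td>java.util.List</td>
+ * <td>Each element may be different type</td>
+ * </tr>
+ * <tr>
+ * <td>Array</td>
+ * <td>java.util.List</td>
+ * <td>Each element is same type</td>
+ * </tr>
+ * <tr>
+ * <td>Map</td>
+ * <td>java.util.Map</td>
+ * <td>Keys must be same primitive type; every value is the same type</td>
+ * </tr>
+ * </table>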
+ */
+public class HiveJsonReader {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(HiveJsonReader.class);
+
+  private final Map<Pair<StructObjectInspector, String>, StructField> 
discoveredFields =
+  new HashMap<>();
+
+  private final Set<Pair<StructObjectInspector, String>> 
discoveredUnknownFields =
+  new HashSet<>();
+
+  private final EnumSet<Feature> features = EnumSet.noneOf(Feature.class);
+
+  private final ObjectMapper objectMapper;
+
+  private final TimestampParser tsParser;
+  private BinaryEncoding binaryEncoding;
+  private final ObjectInspector oi;
+
+  /**
+   * Enumeration that defines all on/off features for this reader.
+   */
+  public enum Feature {
+COL_INDEX_PARSING, PRIMITIVE_TO_WRITABLE, IGNORE_UKNOWN_FIELDS
+  }
+
+  /**
+   * Constructor with default the Hive default timestamp parser.
+   *
+   * @param oi ObjectInspector for all the fields in the JSON object
+   */
+  public 

[jira] [Work logged] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?focusedWorklogId=230702=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230702
 ]

ASF GitHub Bot logged work on HIVE-21240:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 13:54
Start Date: 22/Apr/19 13:54
Worklog Time Spent: 10m 
  Work Description: BELUGABEHR commented on pull request #530: HIVE-21240: 
JSON SerDe Deserialize Re-Write
URL: https://github.com/apache/hive/pull/530#discussion_r277296726
 
 

 ##
 File path: 
ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFJsonRead.java
 ##
 @@ -156,10 +158,8 @@ public void testUndeclaredStructField() throws Exception {
   ObjectInspector[] arguments = buildArguments("struct");
   udf.initialize(arguments);
 
-  Object res = udf.evaluate(evalArgs("{\"b\":null}"));
-  assertTrue(res instanceof Object[]);
-  Object o[] = (Object[]) res;
-  assertEquals(null, o[0]);
+  // Invalid - should throw Exception
+  udf.evaluate(evalArgs("{\"b\":null}"));
 
 Review comment:
   Good question.
   
   If you check out the test method, you will see it is annotated with 
`@Test(expected = HiveException.class)`, which means the test fails unless it 
throws a HiveException. The call to `udf.evaluate(evalArgs("{\"b\":null}"))` 
is what throws the exception, so everything that comes after it is dead code, 
and I simply removed it.
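
The point about dead code can be illustrated without JUnit. The sketch below is only an approximation of the `expected =` semantics in plain Java: `SketchException`, `evaluate` and `passesWithExpected` are hypothetical stand-ins invented for this illustration, not Hive or JUnit APIs. It shows that a test body "passes" only when the expected exception type is thrown, and that any statement placed after the throwing call never runs.

```java
import java.util.concurrent.Callable;

final class ExpectedExceptionSketch {

  /** Hypothetical stand-in for HiveException; not the real Hive class. */
  static final class SketchException extends Exception {
    SketchException(String message) {
      super(message);
    }
  }

  /** Counts statements that run after the throwing call; stays 0. */
  static int reachedAfterThrow = 0;

  /** Hypothetical stand-in for udf.evaluate(...): always rejects its input. */
  static Object evaluate(String json) throws SketchException {
    throw new SketchException("undeclared field in " + json);
  }

  /** Mimics expected= semantics: true only if body throws the expected type. */
  static boolean passesWithExpected(Class<? extends Throwable> expected,
      Callable<Object> body) {
    try {
      body.call();
      return false; // nothing was thrown, so the test would fail
    } catch (Throwable t) {
      return expected.isInstance(t);
    }
  }

  public static void main(String[] args) {
    boolean passed = passesWithExpected(SketchException.class, () -> {
      Object res = evaluate("{\"b\":null}");
      reachedAfterThrow++; // dead at run time: evaluate() has already thrown
      return res;
    });
    System.out.println("passed=" + passed
        + ", reachedAfterThrow=" + reachedAfterThrow);
  }
}
```

Under these assumptions, assertions written after the throwing call would add nothing, which matches the reasoning for removing them from the test.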
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230702)
Time Spent: 2h  (was: 1h 50m)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.11.patch, 
> HIVE-21240.11.patch, HIVE-21240.11.patch, HIVE-21240.2.patch, 
> HIVE-21240.3.patch, HIVE-21240.4.patch, HIVE-21240.5.patch, 
> HIVE-21240.6.patch, HIVE-21240.7.patch, HIVE-21240.9.patch, 
> HIVE-24240.8.patch, kafka_storage_handler.diff
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index lookups, which are currently 
> O\(n\) for each column of each row processed
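
The last bullet, caching column-name to column-index lookups, can be sketched as follows. This is a minimal illustration under assumptions, not the code from the patch: the class name `ColumnIndexCacheSketch` and its fields are hypothetical, and case handling of column names is deliberately ignored. The idea is that the linear scan over the column list happens at most once per distinct name, and later rows hit the map.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ColumnIndexCacheSketch {

  private final List<String> columnNames;

  /** Cache of name -> index; -1 is cached for names that are not columns. */
  private final Map<String, Integer> indexByName = new HashMap<>();

  ColumnIndexCacheSketch(List<String> columnNames) {
    this.columnNames = columnNames;
  }

  /** Returns the column index, or -1 if the name is not a known column. */
  int indexOf(String name) {
    // List.indexOf is the O(n) scan; it runs only on the first lookup of a name.
    return indexByName.computeIfAbsent(name, columnNames::indexOf);
  }

  public static void main(String[] args) {
    ColumnIndexCacheSketch cache =
        new ColumnIndexCacheSketch(Arrays.asList("name", "age"));
    System.out.println(cache.indexOf("age"));     // scans the list once
    System.out.println(cache.indexOf("age"));     // served from the map
    System.out.println(cache.indexOf("missing")); // unknown names are cached too
  }
}
```

Caching misses as well as hits is a design choice of this sketch; it keeps repeated unknown fields from triggering a scan on every row.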



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?focusedWorklogId=230699=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230699
 ]

ASF GitHub Bot logged work on HIVE-21240:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 13:49
Start Date: 22/Apr/19 13:49
Worklog Time Spent: 10m 
  Work Description: BELUGABEHR commented on pull request #530: HIVE-21240: 
JSON SerDe Deserialize Re-Write
URL: https://github.com/apache/hive/pull/530#discussion_r277295408
 
 

 ##
 File path: serde/src/java/org/apache/hadoop/hive/serde2/JsonSerDe.java
 ##
 @@ -63,76 +43,151 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-@SerDeSpec(schemaProps = {serdeConstants.LIST_COLUMNS,
-serdeConstants.LIST_COLUMN_TYPES,
-serdeConstants.TIMESTAMP_FORMATS })
-
+/**
+ * Hive SerDe for processing JSON formatted data. This is typically paired with
+ * the TextInputFormat and therefore each line provided to this SerDe must be a
+ * single, and complete JSON object.
+ * Example
+ * 
+ * {"name="john","age"=30}
+ * {"name="sue","age"=32}
+ * 
+ */
+@SerDeSpec(schemaProps = { serdeConstants.LIST_COLUMNS,
+serdeConstants.LIST_COLUMN_TYPES, serdeConstants.TIMESTAMP_FORMATS,
+JsonSerDe.BINARY_FORMAT, JsonSerDe.IGNORE_EXTRA })
 public class JsonSerDe extends AbstractSerDe {
 
   private static final Logger LOG = LoggerFactory.getLogger(JsonSerDe.class);
+
+  public static final String BINARY_FORMAT = "json.binary.format";
+  public static final String IGNORE_EXTRA = "text.ignore.extra.fields";
+  public static final String NULL_EMPTY_LINES = "text.null.empty.line";
+
   private List<String> columnNames;
 
-  private HiveJsonStructReader structReader;
+  private BinaryEncoding binaryEncoding;
+  private boolean nullEmptyLines;
+
+  private HiveJsonReader jsonReader;
+  private HiveJsonWriter jsonWriter;
   private StructTypeInfo rowTypeInfo;
+  private StructObjectInspector soi;
 
+  /**
+   * Initialize the SerDe. By default, items being deserialized are expected to
+   * be wrapped in Hadoop Writable objects and objects being serialized are
+   * expected to be Java primitive objects.
+   */
   @Override
-  public void initialize(Configuration conf, Properties tbl)
-throws SerDeException {
-List<TypeInfo> columnTypes;
+  public void initialize(final Configuration conf, final Properties tbl)
+  throws SerDeException {
+initialize(conf, tbl, true);
+  }
+
+  /**
+   * Initialize the SerDe.
+   *
+   * @param conf System properties; can be null in compile time
+   * @param tbl table properties
+   * @param writeablePrimitivesDeserialize true if outputs are Hadoop Writable
+   */
+  public void initialize(final Configuration conf, final Properties tbl,
+  final boolean writeablePrimitivesDeserialize) {
+
 LOG.debug("Initializing JsonSerDe: {}", tbl.entrySet());
 
 // Get column names
-String columnNameProperty = tbl.getProperty(serdeConstants.LIST_COLUMNS);
-final String columnNameDelimiter = 
tbl.containsKey(serdeConstants.COLUMN_NAME_DELIMITER) ? tbl
-.getProperty(serdeConstants.COLUMN_NAME_DELIMITER)
-  : String.valueOf(SerDeUtils.COMMA);
-// all table column names
-if (columnNameProperty.isEmpty()) {
-  columnNames = Collections.emptyList();
-} else {
-  columnNames = 
Arrays.asList(columnNameProperty.split(columnNameDelimiter));
-}
+final String columnNameProperty =
+tbl.getProperty(serdeConstants.LIST_COLUMNS);
+final String columnNameDelimiter = tbl.getProperty(
+serdeConstants.COLUMN_NAME_DELIMITER, 
String.valueOf(SerDeUtils.COMMA));
+
+this.columnNames = columnNameProperty.isEmpty() ? Collections.emptyList()
+: Arrays.asList(columnNameProperty.split(columnNameDelimiter));
 
 // all column types
-String columnTypeProperty = 
tbl.getProperty(serdeConstants.LIST_COLUMN_TYPES);
-if (columnTypeProperty.isEmpty()) {
-  columnTypes = Collections.emptyList();
-} else {
-  columnTypes = 
TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
-}
+final String columnTypeProperty =
+tbl.getProperty(serdeConstants.LIST_COLUMN_TYPES);
+
+final List<TypeInfo> columnTypes =
+columnTypeProperty.isEmpty() ? Collections.emptyList()
+: TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
 
 LOG.debug("columns: {}, {}", columnNameProperty, columnNames);
 LOG.debug("types: {}, {} ", columnTypeProperty, columnTypes);
 
 assert (columnNames.size() == columnTypes.size());
 
-rowTypeInfo = (StructTypeInfo) 
TypeInfoFactory.getStructTypeInfo(columnNames, columnTypes);
+final String nullEmpty = tbl.getProperty(NULL_EMPTY_LINES, "false");
+this.nullEmptyLines = Boolean.parseBoolean(nullEmpty);
+
+this.rowTypeInfo = (StructTypeInfo) TypeInfoFactory
+.getStructTypeInfo(columnNames, columnTypes);
+
+this.soi 

[jira] [Work logged] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?focusedWorklogId=230698=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230698
 ]

ASF GitHub Bot logged work on HIVE-21240:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 13:46
Start Date: 22/Apr/19 13:46
Worklog Time Spent: 10m 
  Work Description: BELUGABEHR commented on pull request #530: HIVE-21240: 
JSON SerDe Deserialize Re-Write
URL: https://github.com/apache/hive/pull/530#discussion_r277294622
 
 

 ##
 File path: serde/src/java/org/apache/hadoop/hive/serde2/JsonSerDe.java
 ##
 @@ -63,76 +43,151 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-@SerDeSpec(schemaProps = {serdeConstants.LIST_COLUMNS,
-serdeConstants.LIST_COLUMN_TYPES,
-serdeConstants.TIMESTAMP_FORMATS })
-
+/**
+ * Hive SerDe for processing JSON formatted data. This is typically paired with
+ * the TextInputFormat and therefore each line provided to this SerDe must be a
+ * single, and complete JSON object.
+ * Example
+ * 
+ * {"name="john","age"=30}
+ * {"name="sue","age"=32}
+ * 
+ */
+@SerDeSpec(schemaProps = { serdeConstants.LIST_COLUMNS,
+serdeConstants.LIST_COLUMN_TYPES, serdeConstants.TIMESTAMP_FORMATS,
+JsonSerDe.BINARY_FORMAT, JsonSerDe.IGNORE_EXTRA })
 public class JsonSerDe extends AbstractSerDe {
 
   private static final Logger LOG = LoggerFactory.getLogger(JsonSerDe.class);
+
+  public static final String BINARY_FORMAT = "json.binary.format";
+  public static final String IGNORE_EXTRA = "text.ignore.extra.fields";
+  public static final String NULL_EMPTY_LINES = "text.null.empty.line";
+
   private List<String> columnNames;
 
-  private HiveJsonStructReader structReader;
+  private BinaryEncoding binaryEncoding;
+  private boolean nullEmptyLines;
+
+  private HiveJsonReader jsonReader;
+  private HiveJsonWriter jsonWriter;
   private StructTypeInfo rowTypeInfo;
+  private StructObjectInspector soi;
 
+  /**
+   * Initialize the SerDe. By default, items being deserialized are expected to
+   * be wrapped in Hadoop Writable objects and objects being serialized are
+   * expected to be Java primitive objects.
+   */
   @Override
-  public void initialize(Configuration conf, Properties tbl)
-throws SerDeException {
-List<TypeInfo> columnTypes;
+  public void initialize(final Configuration conf, final Properties tbl)
+  throws SerDeException {
+initialize(conf, tbl, true);
+  }
+
+  /**
+   * Initialize the SerDe.
+   *
+   * @param conf System properties; can be null in compile time
+   * @param tbl table properties
+   * @param writeablePrimitivesDeserialize true if outputs are Hadoop Writable
+   */
+  public void initialize(final Configuration conf, final Properties tbl,
+  final boolean writeablePrimitivesDeserialize) {
+
 LOG.debug("Initializing JsonSerDe: {}", tbl.entrySet());
 
 // Get column names
-String columnNameProperty = tbl.getProperty(serdeConstants.LIST_COLUMNS);
-final String columnNameDelimiter = 
tbl.containsKey(serdeConstants.COLUMN_NAME_DELIMITER) ? tbl
-.getProperty(serdeConstants.COLUMN_NAME_DELIMITER)
-  : String.valueOf(SerDeUtils.COMMA);
-// all table column names
-if (columnNameProperty.isEmpty()) {
-  columnNames = Collections.emptyList();
-} else {
-  columnNames = 
Arrays.asList(columnNameProperty.split(columnNameDelimiter));
-}
+final String columnNameProperty =
+tbl.getProperty(serdeConstants.LIST_COLUMNS);
+final String columnNameDelimiter = tbl.getProperty(
+serdeConstants.COLUMN_NAME_DELIMITER, 
String.valueOf(SerDeUtils.COMMA));
+
+this.columnNames = columnNameProperty.isEmpty() ? Collections.emptyList()
+: Arrays.asList(columnNameProperty.split(columnNameDelimiter));
 
 // all column types
-String columnTypeProperty = 
tbl.getProperty(serdeConstants.LIST_COLUMN_TYPES);
-if (columnTypeProperty.isEmpty()) {
-  columnTypes = Collections.emptyList();
-} else {
-  columnTypes = 
TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
-}
+final String columnTypeProperty =
+tbl.getProperty(serdeConstants.LIST_COLUMN_TYPES);
+
+final List<TypeInfo> columnTypes =
+columnTypeProperty.isEmpty() ? Collections.emptyList()
+: TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
 
 LOG.debug("columns: {}, {}", columnNameProperty, columnNames);
 LOG.debug("types: {}, {} ", columnTypeProperty, columnTypes);
 
 assert (columnNames.size() == columnTypes.size());
 
-rowTypeInfo = (StructTypeInfo) 
TypeInfoFactory.getStructTypeInfo(columnNames, columnTypes);
+final String nullEmpty = tbl.getProperty(NULL_EMPTY_LINES, "false");
+this.nullEmptyLines = Boolean.parseBoolean(nullEmpty);
+
+this.rowTypeInfo = (StructTypeInfo) TypeInfoFactory
+.getStructTypeInfo(columnNames, columnTypes);
+
+this.soi 

[jira] [Work logged] (HIVE-21240) JSON SerDe Re-Write

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21240?focusedWorklogId=230697=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230697
 ]

ASF GitHub Bot logged work on HIVE-21240:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 13:45
Start Date: 22/Apr/19 13:45
Worklog Time Spent: 10m 
  Work Description: BELUGABEHR commented on pull request #530: HIVE-21240: 
JSON SerDe Deserialize Re-Write
URL: https://github.com/apache/hive/pull/530#discussion_r277294129
 
 

 ##
 File path: serde/src/java/org/apache/hadoop/hive/serde2/JsonSerDe.java
 ##
 @@ -142,227 +197,21 @@ public Object deserialize(Writable blob) throws 
SerDeException {
* and generate a Text representation of the object.
*/
   @Override
-  public Writable serialize(Object obj, ObjectInspector objInspector)
-throws SerDeException {
-StringBuilder sb = new StringBuilder();
-try {
+  public Writable serialize(final Object obj,
 
 Review comment:
   I just checked this: it is the correct formatting under the Hadoop 
checkstyle rules.
 



Issue Time Tracking
---

Worklog Id: (was: 230697)
Time Spent: 1.5h  (was: 1h 20m)

> JSON SerDe Re-Write
> ---
>
> Key: HIVE-21240
> URL: https://issues.apache.org/jira/browse/HIVE-21240
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0, 3.1.1
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.11.patch, HIVE-21240.11.patch, 
> HIVE-21240.11.patch, HIVE-21240.11.patch, HIVE-21240.2.patch, 
> HIVE-21240.3.patch, HIVE-21240.4.patch, HIVE-21240.5.patch, 
> HIVE-21240.6.patch, HIVE-21240.7.patch, HIVE-21240.9.patch, 
> HIVE-24240.8.patch, kafka_storage_handler.diff
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index lookups, which are currently 
> O\(n\) for each column of each row processed
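
The base-64 bullet refers to binary values carried as text inside a JSON string. A minimal sketch of that encoding step, using only `java.util.Base64` from the JDK, is shown below; the class and method names are hypothetical and the patch's own handling (for example how `json.binary.format` selects an encoding) is not reproduced here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64FieldSketch {

  /** Decodes the text of a JSON string field into raw bytes. */
  static byte[] decodeBinaryField(String jsonText) {
    return Base64.getDecoder().decode(jsonText);
  }

  /** Encodes raw bytes into text that is safe to place in a JSON string. */
  static String encodeBinaryField(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  public static void main(String[] args) {
    byte[] raw = "hive".getBytes(StandardCharsets.UTF_8);
    String text = encodeBinaryField(raw);
    byte[] roundTrip = decodeBinaryField(text);
    System.out.println(text);
    System.out.println(new String(roundTrip, StandardCharsets.UTF_8));
  }
}
```

Note that `Base64.getDecoder()` rejects input that is not valid base-64 with an `IllegalArgumentException`, so a real reader would need to decide how to surface that for malformed rows.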





[jira] [Commented] (HIVE-21641) Llap external client returns decimal columns in different precision/scale as compared to beeline

2019-04-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823101#comment-16823101
 ] 

Hive QA commented on HIVE-21641:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 43s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  3s{color} | {color:blue} ql in master has 2256 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 30s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17004/dev-support/hive-personality.sh |
| git revision | master / b58d50c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17004/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Llap external client returns decimal columns in different precision/scale as 
> compared to beeline
> 
>
> Key: HIVE-21641
> URL: https://issues.apache.org/jira/browse/HIVE-21641
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21641.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Llap external client returns a different precision/scale than beeline does 
> for the same query. Consider the following results:
> Query:
> {code} 
> select avg(ss_ext_sales_price) my_avg from store_sales;
> {code} 
> Result from Beeline
> {code} 
> +----------------------------+
> |           my_avg           |
> +----------------------------+
> | 37.8923531030581611189434  |
> +----------------------------+
> {code} 
> Result from Llap external client
> {code}
> +---------+
> |   my_avg|
> +---------+
> |37.892353|
> +---------+
> {code}
>  
> This happens because the Driver (beeline path) calls 
> [analyzeInternal()|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L328]
>  to get the result set schema, which initializes 
> [resultSchema|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L333]
>  after some more transformations, whereas llap-ext-client calls 
> 

[jira] [Updated] (HIVE-21641) Llap external client returns decimal columns in different precision/scale as compared to beeline

2019-04-22 Thread Shubham Chaurasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chaurasia updated HIVE-21641:
-
Attachment: HIVE-21641.1.patch
Status: Patch Available  (was: Open)

> Llap external client returns decimal columns in different precision/scale as 
> compared to beeline
> 
>
> Key: HIVE-21641
> URL: https://issues.apache.org/jira/browse/HIVE-21641
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21641.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Llap external client returns a different precision/scale than beeline does 
> for the same query. Consider the following results:
> Query:
> {code} 
> select avg(ss_ext_sales_price) my_avg from store_sales;
> {code} 
> Result from Beeline
> {code} 
> +----------------------------+
> |           my_avg           |
> +----------------------------+
> | 37.8923531030581611189434  |
> +----------------------------+
> {code} 
> Result from Llap external client
> {code}
> +---------+
> |   my_avg|
> +---------+
> |37.892353|
> +---------+
> {code}
>  
> This happens because the Driver (beeline path) calls 
> [analyzeInternal()|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L328]
>  to get the result set schema, which initializes 
> [resultSchema|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L333]
>  after some more transformations, whereas llap-ext-client calls 
> [genLogicalPlan()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L561].
> Replacing {{genLogicalPlan()}} with {{analyze()}} resolves this.
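
The two results above are the same number rendered at two different scales. The sketch below only illustrates that arithmetic with `java.math.BigDecimal`; it does not reproduce how Hive derives the result schema, and the choice of scale 6 and of `RoundingMode.HALF_UP` are assumptions made for the illustration (for this value, truncating gives the same digits). The class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class DecimalScaleSketch {

  /** Re-expresses a decimal value at the given scale (digits after the point). */
  static BigDecimal withScale(String value, int scale) {
    return new BigDecimal(value).setScale(scale, RoundingMode.HALF_UP);
  }

  public static void main(String[] args) {
    String beelineValue = "37.8923531030581611189434";
    // Scale reported by the wider schema: all fractional digits are kept.
    System.out.println(new BigDecimal(beelineValue).toPlainString());
    // Assumed narrower scale of 6, as seen in the LLAP external client output.
    System.out.println(withScale(beelineValue, 6).toPlainString());
  }
}
```

So a client that reads the column with a narrower declared scale would show the shorter value even though the underlying average is the same, which is consistent with the schema difference described in the issue.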





[jira] [Work logged] (HIVE-21641) Llap external client returns decimal columns in different precision/scale as compared to beeline

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21641?focusedWorklogId=230676=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230676
 ]

ASF GitHub Bot logged work on HIVE-21641:
-

Author: ASF GitHub Bot
Created on: 22/Apr/19 12:36
Start Date: 22/Apr/19 12:36
Worklog Time Spent: 10m 
  Work Description: ShubhamChaurasia commented on pull request #601: 
HIVE-21641: Llap external client returns decimal precision bug
URL: https://github.com/apache/hive/pull/601
 
 
   Llap external client returns decimal columns in different precision/scale as 
compared to beeline
 



Issue Time Tracking
---

Worklog Id: (was: 230676)
Time Spent: 10m
Remaining Estimate: 0h

> Llap external client returns decimal columns in different precision/scale as 
> compared to beeline
> 
>
> Key: HIVE-21641
> URL: https://issues.apache.org/jira/browse/HIVE-21641
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Llap external client returns a different precision/scale than beeline does 
> for the same query. Consider the following results:
> Query:
> {code} 
> select avg(ss_ext_sales_price) my_avg from store_sales;
> {code} 
> Result from Beeline
> {code} 
> +----------------------------+
> |           my_avg           |
> +----------------------------+
> | 37.8923531030581611189434  |
> +----------------------------+
> {code} 
> Result from Llap external client
> {code}
> +---------+
> |   my_avg|
> +---------+
> |37.892353|
> +---------+
> {code}
>  
> This happens because the Driver (beeline path) calls 
> [analyzeInternal()|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L328]
>  to get the result set schema, which initializes 
> [resultSchema|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L333]
>  after some more transformations, whereas llap-ext-client calls 
> [genLogicalPlan()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L561].
> Replacing {{genLogicalPlan()}} with {{analyze()}} resolves this.





[jira] [Updated] (HIVE-21641) Llap external client returns decimal columns in different precision/scale as compared to beeline

2019-04-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-21641:
--
Labels: pull-request-available  (was: )

> Llap external client returns decimal columns in different precision/scale as 
> compared to beeline
> 
>
> Key: HIVE-21641
> URL: https://issues.apache.org/jira/browse/HIVE-21641
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Llap external client returns a different precision/scale than beeline does 
> for the same query. Consider the following results:
> Query:
> {code} 
> select avg(ss_ext_sales_price) my_avg from store_sales;
> {code} 
> Result from Beeline
> {code} 
> +----------------------------+
> |           my_avg           |
> +----------------------------+
> | 37.8923531030581611189434  |
> +----------------------------+
> {code} 
> Result from Llap external client
> {code}
> +---------+
> |   my_avg|
> +---------+
> |37.892353|
> +---------+
> {code}
>  
> This happens because the Driver (beeline path) calls 
> [analyzeInternal()|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L328]
>  to get the result set schema, which initializes 
> [resultSchema|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L333]
>  after some more transformations, whereas llap-ext-client calls 
> [genLogicalPlan()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L561].
> Replacing {{genLogicalPlan()}} with {{analyze()}} resolves this.





[jira] [Updated] (HIVE-21641) Llap external client returns decimal columns in different precision/scale as compared to beeline

2019-04-22 Thread Shubham Chaurasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chaurasia updated HIVE-21641:
-
Fix Version/s: 4.0.0

> Llap external client returns decimal columns in different precision/scale as 
> compared to beeline
> 
>
> Key: HIVE-21641
> URL: https://issues.apache.org/jira/browse/HIVE-21641
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
> Fix For: 4.0.0
>
>
> Llap external client gives different precision/scale as compared to when the 
> query is executed beeline. Consider the following results:
> Query:
> {code} 
> select avg(ss_ext_sales_price) my_avg from store_sales;
> {code} 
> Result from Beeline
> {code} 
> ++
> |   my_avg   |
> ++
> | 37.8923531030581611189434  |
> ++
> {code} 
> Result from Llap external client
> {code}
> +-+
> |   my_avg|
> +-+
> |37.892353|
> +-+
> {code}
>  
> This happens because the Driver (Beeline path) calls 
> [analyzeInternal()|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L328]
>  to get the result set schema, which initializes 
> [resultSchema|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L333]
>  after some additional transformations, whereas llap-ext-client calls 
> [genLogicalPlan()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L561].
> Replacing {{genLogicalPlan()}} with {{analyze()}} resolves this.
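
The two results above look like the same value rendered under result schemas with different decimal scales. A minimal sketch of that effect in plain Java, not Hive code: the class {{DecimalScaleDemo}}, the helper {{render}}, the scales 22 and 6 (read off the two outputs shown), and the rounding mode are all assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalScaleDemo {
    // Hypothetical helper: renders a decimal value at the scale a result
    // schema declares. HALF_UP is an assumed rounding mode, not taken from Hive.
    public static String render(String value, int scale) {
        return new BigDecimal(value)
                .setScale(scale, RoundingMode.HALF_UP)
                .toPlainString();
    }

    public static void main(String[] args) {
        String avg = "37.8923531030581611189434";
        // Assumed scale 22: matches the number of fractional digits Beeline printed.
        System.out.println(render(avg, 22));
        // Assumed scale 6: matches the number of fractional digits the LLAP client printed.
        System.out.println(render(avg, 6));
    }
}
```

The underlying value is unchanged in both calls; only the scale attached to the column differs, which is consistent with the description attributing the mismatch to how each path derives the result schema.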





[jira] [Assigned] (HIVE-21641) Llap external client returns decimal columns in different precision/scale as compared to beeline

2019-04-22 Thread Shubham Chaurasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chaurasia reassigned HIVE-21641:



> Llap external client returns decimal columns in different precision/scale as 
> compared to beeline
> 
>
> Key: HIVE-21641
> URL: https://issues.apache.org/jira/browse/HIVE-21641
> Project: Hive
> Issue Type: Bug
> Components: llap
> Reporter: Shubham Chaurasia
> Assignee: Shubham Chaurasia
> Priority: Major
>
> The LLAP external client returns decimal columns with a different 
> precision/scale than the same query executed through Beeline. Consider the 
> following results:
> Query:
> {code} 
> select avg(ss_ext_sales_price) my_avg from store_sales;
> {code} 
> Result from Beeline
> {code} 
> +----------------------------+
> |           my_avg           |
> +----------------------------+
> | 37.8923531030581611189434  |
> +----------------------------+
> {code} 
> Result from Llap external client
> {code}
> +---------+
> | my_avg  |
> +---------+
> |37.892353|
> +---------+
> {code}
>  
> This happens because the Driver (Beeline path) calls 
> [analyzeInternal()|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L328]
>  to get the result set schema, which initializes 
> [resultSchema|https://github.com/apache/hive/blob/rel/release-3.1.1/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L333]
>  after some additional transformations, whereas llap-ext-client calls 
> [genLogicalPlan()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L561].
> Replacing {{genLogicalPlan()}} with {{analyze()}} resolves this.


