[jira] [Commented] (HIVE-13567) Enable auto-gather column stats by default

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264985#comment-16264985
 ] 

Hive QA commented on HIVE-13567:


| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 32s | Maven dependency ordering for branch |
| +1 | mvninstall | 5m 3s | master passed |
| +1 | compile | 6m 41s | master passed |
| +1 | checkstyle | 3m 23s | master passed |
| +1 | javadoc | 7m 49s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 7m 54s | the patch passed |
| +1 | compile | 6m 53s | the patch passed |
| +1 | javac | 6m 53s | the patch passed |
| -1 | checkstyle | 0m 19s | standalone-metastore: The patch generated 3 new + 582 unchanged - 2 fixed = 585 total (was 584) |
| -1 | checkstyle | 1m 39s | root: The patch generated 3 new + 1531 unchanged - 2 fixed = 1534 total (was 1533) |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | xml | 0m 4s | The patch has no ill-formed XML file. |
| +1 | javadoc | 7m 26s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 11s | The patch does not generate ASF License warnings. |
| | | 53m 18s | |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile xml |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / d9924ab |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-7992/yetus/diff-checkstyle-standalone-metastore.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-7992/yetus/diff-checkstyle-root.txt |
| modules | C: common standalone-metastore ql accumulo-handler contrib hbase-handler . itests/hive-blobstore U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-7992/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Enable auto-gather column stats by default
> --
>
> Key: HIVE-13567
> URL: https://issues.apache.org/jira/browse/HIVE-13567
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13567.01.patch, HIVE-13567.02.patch, 
> HIVE-13567.03.patch, HIVE-13567.04.patch, HIVE-13567.05.patch, 
> HIVE-13567.06.patch, HIVE-13567.07.patch, HIVE-13567.08.patch, 
> HIVE-13567.09.patch, HIVE-13567.10.patch, HIVE-13567.11.patch, 
> HIVE-13567.12.patch, HIVE-13567.13.patch, HIVE-13567.14.patch, 
> HIVE-13567.15.patch, HIVE-13567.16.patch, HIVE-13567.17.patch, 
> HIVE-13567.18.patch, HIVE-13567.19.patch, HIVE-13567.20.patch, 
> HIVE-13567.21.patch, HIVE-13567.22.patch, HIVE-13567.23wip01.patch, 
> 

[jira] [Updated] (HIVE-17954) Implement pool, user, group and trigger to pool management API's.

2017-11-23 Thread Harish Jaiprakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Jaiprakash updated HIVE-17954:
-
Description: 
Implement the following commands:

-- Pool management.
CREATE POOL `resource_plan`.`pool_path` WITH
  ALLOC_FRACTION=`fraction`,
  QUERY_PARALLELISM=`parallelism`,
  SCHEDULING_POLICY=`policy`;

ALTER POOL `resource_plan`.`pool_path` SET
  PATH = `new_path`,
  ALLOC_FRACTION = `fraction`,
  QUERY_PARALLELISM = `parallelism`,
  SCHEDULING_POLICY = `policy`;

DROP POOL `resource_plan`.`pool_path`;

-- Adding triggers to pools.
ALTER POOL `resource_plan`.`pool_path` ADD TRIGGER `trigger_name`;

ALTER POOL `resource_plan`.`pool_path` DROP TRIGGER `trigger_name`;

-- User/Group to pool mappings.
CREATE USER|GROUP MAPPING `resource_plan`.`group_or_user_name`
  TO `pool_path` WITH ORDERING `order_no`;

DROP USER|GROUP MAPPING `resource_plan`.`group_or_user_name`;

  was:
Implement the following commands:

-- Pool management.
CREATE POOL `resource_plan`.`pool_path` WITH
  ALLOC_FRACTION `fraction`
  QUERY_PARALLELISM `parallelism`
  SCHEDULING_POLICY `policy`;

ALTER POOL `resource_plan`.`pool_path` SET
  PATH = `new_path`,
  ALLOC_FRACTION = `fraction`,
  QUERY_PARALLELISM = `parallelism`,
  SCHEDULING_POLICY = `policy`;

DROP POOL `resource_plan`.`pool_path`;

-- Trigger to pool mappings.
ALTER RESOURCE PLAN `resource_plan`
  ADD TRIGGER `trigger_name` TO `pool_path`;

ALTER RESOURCE PLAN `resource_plan`
  DROP TRIGGER `trigger_name` TO `pool_path`;

-- User/Group to pool mappings.
CREATE USER|GROUP MAPPING `resource_plan`.`group_or_user_name`
  TO `pool_path` WITH ORDERING `order_no`;

DROP USER|GROUP MAPPING `resource_plan`.`group_or_user_name`;



> Implement pool, user, group and trigger to pool management API's.
> -
>
> Key: HIVE-17954
> URL: https://issues.apache.org/jira/browse/HIVE-17954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17954.01.patch, HIVE-17954.02.patch, 
> HIVE-17954.03.patch, HIVE-17954.04.patch, HIVE-17954.05.patch, 
> HIVE-17954.06.patch, HIVE-17954.07.patch, HIVE-17954.08.patch, 
> HIVE-17954.09.patch



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264901#comment-16264901
 ] 

Gopal V commented on HIVE-17994:


The TypeInfo is a singleton object (from the factory object), so it looks 
like I've got class-loader sharing race conditions with this (i.e. thread 1's 
class loader showing up for thread 2).

Is there some way to restrict this patch to just the vectorization code on the 
task side (like holding the PrimitiveType[] from initialize in the operator), so 
that it does not accidentally kick in on the HS2 side, because the HS2 planner 
hasn't got locks (& separate class loaders for each query)?
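
The restriction suggested above can be pictured roughly as follows: resolve each column's primitive category once, when the operator is initialized, and keep the result in an array owned by that operator instance, so the per-row loop never touches the shared lookup. This is only a minimal sketch under stated assumptions. `Category`, `SharedTypeRegistry` and `ColumnCategoryCache` are hypothetical names invented for illustration and are not Hive classes, and the lookup counter exists only to make the effect visible.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a primitive category; not Hive's own enum.
enum Category { INT, LONG, STRING }

// Hypothetical stand-in for a shared, static type lookup (the "hot" hashmap).
final class SharedTypeRegistry {
  private static final Map<String, Category> BY_NAME = new ConcurrentHashMap<>();
  // Counts lookups so the example can show how often the shared map is hit.
  static final AtomicInteger LOOKUPS = new AtomicInteger();

  static {
    BY_NAME.put("int", Category.INT);
    BY_NAME.put("bigint", Category.LONG);
    BY_NAME.put("string", Category.STRING);
  }

  static Category lookup(String typeName) {
    LOOKUPS.incrementAndGet();
    Category c = BY_NAME.get(typeName);
    if (c == null) {
      throw new IllegalArgumentException("unknown type: " + typeName);
    }
    return c;
  }
}

// Per-operator cache: one shared lookup per column at initialization,
// then plain array reads for every row.
final class ColumnCategoryCache {
  private final Category[] categories;

  ColumnCategoryCache(String[] columnTypeNames) {
    categories = new Category[columnTypeNames.length];
    for (int i = 0; i < columnTypeNames.length; i++) {
      categories[i] = SharedTypeRegistry.lookup(columnTypeNames[i]);
    }
  }

  Category categoryOf(int columnIndex) {
    return categories[columnIndex];
  }
}
```

Because the array belongs to one operator instance, nothing resolved for one query is visible to another, which is the property the comment asks for.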

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, HIVE-17994.04.patch, HIVE-17994.05.patch, 
> vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Updated] (HIVE-13665) HS2 memory leak When multiple queries are running with get_json_object

2017-11-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

靳峥 updated HIVE-13665:
--
Affects Version/s: (was: 2.0.0)
   2.2.0
Fix Version/s: 2.3.0
  Component/s: UDF

> HS2 memory leak When multiple queries are running with get_json_object
> --
>
> Key: HIVE-13665
> URL: https://issues.apache.org/jira/browse/HIVE-13665
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.1.0, 2.2.0
>Reporter: JinsuKim
>Assignee: 靳峥
> Fix For: 2.3.0
>
> Attachments: patch.lst.txt
>
>
> The extractObjectCache in UDFJson grows beyond its limit (CACHE_SIZE = 16) 
> when multiple queries run concurrently on HS2 in local mode (not mr/tez) 
> with get_json_object or get_json_tuple.
> {code:java|title=HS2 heap_dump}
> Object at 0x515ab18f8
> instance of org.apache.hadoop.hive.ql.udf.UDFJson$HashCache@0x515ab18f8 (77 bytes)
> Class:
> class org.apache.hadoop.hive.ql.udf.UDFJson$HashCache
> Instance data members:
> accessOrder (Z) : false
> entrySet (L) : 
> hashSeed (I) : 0
> header (L) : java.util.LinkedHashMap$Entry@0x515a577d0 (60 bytes) 
> keySet (L) : 
> loadFactor (F) : 0.6
> modCount (I) : 4741146
> size (I) : 2733158   <== here!!
> table (L) : [Ljava.util.HashMap$Entry;@0x7163d8b70 (67108880 bytes) 
> threshold (I) : 5033165
> values (L) : 
> References to this object:
> {code}
> I think this problem is caused by the LinkedHashMap object not being 
> thread-safe:
> {code}
> * Note that this implementation is not synchronized.
>  * If multiple threads access a linked hash map concurrently, and at least
>  * one of the threads modifies the map structurally, it must be
>  * synchronized externally.  This is typically accomplished by
>  * synchronizing on some object that naturally encapsulates the map.
> {code}
> To reproduce:
> # Run multiple queries with get_json_object on small input data (so that they 
> execute in HS2 local mode)
> # Take a JVM heap dump and analyze it
> {code:title=test scenario}
> Multiple queries are running with get_json_object and small input data(for 
> execute on hs2 local mode)
> 1.hql :
> SELECT get_json_object(body, '$.fileSize'), get_json_object(body, 
> '$.ps_totalTimeSeconds'), get_json_object(body, '$.totalTimeSeconds') FROM 
> xxx. WHERE part_hour='2016040105' 
> 2.hql :
> SELECT get_json_object(body, '$.fileSize'), get_json_object(body, 
> '$.ps_totalTimeSeconds'), get_json_object(body, '$.totalTimeSeconds') FROM 
> xxx. WHERE part_hour='2016040106'
> 3.hql :
> SELECT get_json_object(body, '$.fileSize'), get_json_object(body, 
> '$.ps_totalTimeSeconds'), get_json_object(body, '$.totalTimeSeconds') FROM 
> xxx. WHERE part_hour='2016040107'
> 4.hql :
> SELECT get_json_object(body, '$.fileSize'), get_json_object(body, 
> '$.ps_totalTimeSeconds'), get_json_object(body, '$.totalTimeSeconds') FROM 
> xxx. WHERE part_hour='2016040108'
>  
> run.sh :
> t_cnt=0
> while true
> do
>   echo "query executing..."
>   for i in 1 2 3 4
>   do
>     # Run the four queries concurrently, each logging to its own file.
>     beeline -u jdbc:hive2://localhost:1 -n hive --silent=true -f $i.hql > $i.log 2>&1 &
>   done
>   # Wait for all four background queries before starting the next round.
>   wait
>   t_cnt=`expr $t_cnt + 1`
>   echo "query count : $t_cnt"
>   sleep 2
> done
> jvm heap dump & analyze :
> jmap -dump:format=b,file=hive.dmp $PID
> jhat -J-mx48000m -port 8080 hive.dmp &
> {code}
> Finally I have attached our patch.
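
The description above attributes the runaway size to unsynchronized concurrent access to a LinkedHashMap-based cache. The sketch below shows one common way to keep such a cache bounded when several threads use it: a LinkedHashMap that evicts its eldest entry once it exceeds the limit, wrapped in `Collections.synchronizedMap`. It is an illustration only. It is not the attached patch and not necessarily how HIVE-16196 fixed the issue, and `BoundedCache` is a hypothetical name. `CACHE_SIZE = 16` and the load factor of 0.6 are taken from the report.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not the UDFJson implementation.
final class BoundedCache {
  static final int CACHE_SIZE = 16;

  private BoundedCache() {}

  // Returns a map that never holds more than CACHE_SIZE entries and whose
  // individual operations are safe to call from multiple threads.
  static <K, V> Map<K, V> create() {
    LinkedHashMap<K, V> bounded = new LinkedHashMap<K, V>(CACHE_SIZE, 0.6f, false) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put(); returning true evicts the oldest inserted entry.
        return size() > CACHE_SIZE;
      }
    };
    // Every call goes through one lock, so the eviction check cannot be
    // interleaved with another thread's structural modification.
    return Collections.synchronizedMap(bounded);
  }
}
```

A single lock serializes all cache access, which is simple but can become a contention point under heavy concurrency; a per-thread or per-query cache is an alternative design with a different trade-off.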





[jira] [Comment Edited] (HIVE-13665) HS2 memory leak When multiple queries are running with get_json_object

2017-11-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264874#comment-16264874
 ] 

靳峥 edited comment on HIVE-13665 at 11/24/17 2:04 AM:
-

Already fixed by HIVE-16196, thanks Jürgen Thomann.


was (Author: jinzheng):
Already fixed by HIVE-16196

> HS2 memory leak When multiple queries are running with get_json_object
> --
>
> Key: HIVE-13665
> URL: https://issues.apache.org/jira/browse/HIVE-13665
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.0.0
>Reporter: JinsuKim
>Assignee: 靳峥
> Attachments: patch.lst.txt


[jira] [Resolved] (HIVE-13665) HS2 memory leak When multiple queries are running with get_json_object

2017-11-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

靳峥 resolved HIVE-13665.
---
Resolution: Duplicate

Already fixed by HIVE-16196

> HS2 memory leak When multiple queries are running with get_json_object
> --
>
> Key: HIVE-13665
> URL: https://issues.apache.org/jira/browse/HIVE-13665
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.0.0
>Reporter: JinsuKim
>Assignee: 靳峥
> Attachments: patch.lst.txt


[jira] [Assigned] (HIVE-13665) HS2 memory leak When multiple queries are running with get_json_object

2017-11-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

靳峥 reassigned HIVE-13665:
-

Assignee: 靳峥

> HS2 memory leak When multiple queries are running with get_json_object
> --
>
> Key: HIVE-13665
> URL: https://issues.apache.org/jira/browse/HIVE-13665
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.0.0
>Reporter: JinsuKim
>Assignee: 靳峥
> Attachments: patch.lst.txt


[jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264854#comment-16264854
 ] 

Matt McCline commented on HIVE-17994:
-

It is not clear that any of the test failures are related.

TestTriggersWorkloadManager timed out but works on my laptop.
TestCliDriver windowing_range_multiorder.q succeeds on my laptop.



[jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264842#comment-16264842
 ] 

Hive QA commented on HIVE-17994:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12899107/HIVE-17994.05.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 11410 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_range_multiorder] (batchId=7)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=162)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query64] (batchId=244)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=224)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=230)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=230)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=230)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers1 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitions (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsMultiInsert (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsUnionAll (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7991/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7991/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7991/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12899107 - PreCommit-HIVE-Build



[jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264815#comment-16264815
 ] 

Hive QA commented on HIVE-17994:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  8m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / d9924ab |
| Default Java | 1.8.0_111 |
| modules | C: serde U: serde |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7991/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, HIVE-17994.04.patch, HIVE-17994.05.patch, 
> vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!
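
The description above amounts to hoisting a per-column type lookup out of the per-row loop. Below is a minimal sketch of that idea only. It is not Hive's vectorized serialization code and not the attached patch; the names `Category`, `lookupCategory`, `serializeBatch` and the `lookups` counter are hypothetical and exist purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: resolve each column's category once per batch, not once per row. */
final class HoistedCategoryLookupSketch {
  enum Category { INT, STRING }

  // Stand-in for a shared static map that every thread would hit on each lookup.
  private static final Map<String, Category> TYPE_NAME_TO_CATEGORY = new HashMap<>();
  static {
    TYPE_NAME_TO_CATEGORY.put("int", Category.INT);
    TYPE_NAME_TO_CATEGORY.put("string", Category.STRING);
  }

  // Counts map lookups; for illustration only.
  static int lookups = 0;

  static Category lookupCategory(String typeName) {
    lookups++;
    return TYPE_NAME_TO_CATEGORY.get(typeName);
  }

  static String serializeBatch(String[] columnTypeNames, Object[][] rows) {
    // The column types cannot change within a batch, so look them up once up front.
    Category[] categories = new Category[columnTypeNames.length];
    for (int c = 0; c < columnTypeNames.length; c++) {
      categories[c] = lookupCategory(columnTypeNames[c]);
    }

    StringBuilder out = new StringBuilder();
    for (Object[] row : rows) {
      for (int c = 0; c < categories.length; c++) {
        // The hot loop only reads a local array; no hashmap access happens here.
        switch (categories[c]) {
          case INT:
            out.append("i:").append(row[c]);
            break;
          case STRING:
            out.append("s:").append(row[c]);
            break;
        }
        out.append(';');
      }
      out.append('\n');
    }
    return out.toString();
  }
}
```

With two columns and three rows, this sketch performs two lookups in total instead of six.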



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17994:

Status: Patch Available  (was: In Progress)

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, HIVE-17994.04.patch, HIVE-17994.05.patch, 
> vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Updated] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17994:

Attachment: HIVE-17994.05.patch

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, HIVE-17994.04.patch, HIVE-17994.05.patch, 
> vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Updated] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17994:

Status: In Progress  (was: Patch Available)

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, HIVE-17994.04.patch, HIVE-17994.05.patch, 
> vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Commented] (HIVE-18140) Partitioned tables statistics can go wrong in basic stats mixed case

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264771#comment-16264771
 ] 

Hive QA commented on HIVE-18140:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12899061/HIVE-18140.01wip01.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11412 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_table_null_partition]
 (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_part] 
(batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl_dp] 
(batchId=50)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_reordering_no_stats]
 (batchId=162)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[stats8] 
(batchId=134)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=224)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=230)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=230)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7990/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7990/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7990/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12899061 - PreCommit-HIVE-Build

> Partitioned tables statistics can go wrong in basic stats mixed case
> 
>
> Key: HIVE-18140
> URL: https://issues.apache.org/jira/browse/HIVE-18140
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18140.01wip01.patch
>
>
> Suppose the following scenario:
> * part1 has basic stats {{RC=10,DS=1K}}
> * all other partitions have no basic stats (and a bunch of rows)
> then 
> [this|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L378]
>  condition would be false, which in turn produces estimates for the whole 
> partitioned table of {{RC=10,DS=1K}}
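
The scenario above can be illustrated with a small sketch. It is not the code behind the linked `StatsUtils` line; `PartitionStats`, `naiveRowCount`, `perPartitionRowCount`, the `MISSING` marker and the assumption that each partition's size in bytes is always known are all hypothetical. The point is only that a single partition with stats can mask the missing stats of every other partition when the fallback is decided on the total.

```java
import java.util.List;

/** Hypothetical sketch of row-count aggregation over partitions with mixed basic stats. */
final class MixedPartitionStatsSketch {
  static final long MISSING = -1L;

  static final class PartitionStats {
    final long rowCount;   // MISSING when basic stats were never gathered
    final long sizeBytes;  // assumed to be always known in this sketch

    PartitionStats(long rowCount, long sizeBytes) {
      this.rowCount = rowCount;
      this.sizeBytes = sizeBytes;
    }
  }

  /** Falls back to a size-based estimate only when the summed row count is not positive. */
  static long naiveRowCount(List<PartitionStats> parts, long avgRowSize) {
    long rows = 0;
    long bytes = 0;
    for (PartitionStats p : parts) {
      if (p.rowCount > 0) {
        rows += p.rowCount;
      }
      bytes += p.sizeBytes;
    }
    // One partition with stats makes rows > 0, so the estimate below is never used.
    return rows > 0 ? rows : Math.max(1, bytes / avgRowSize);
  }

  /** Decides per partition whether to trust its stats or to estimate from its size. */
  static long perPartitionRowCount(List<PartitionStats> parts, long avgRowSize) {
    long rows = 0;
    for (PartitionStats p : parts) {
      rows += p.rowCount > 0 ? p.rowCount : Math.max(1, p.sizeBytes / avgRowSize);
    }
    return rows;
  }
}
```

In this sketch the naive variant reports only the rows of the one partition that has stats, while the per-partition variant also accounts for the partitions without stats.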





[jira] [Commented] (HIVE-18140) Partitioned tables statistics can go wrong in basic stats mixed case

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264747#comment-16264747
 ] 

Hive QA commented on HIVE-18140:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
32s{color} | {color:red} ql: The patch generated 15 new + 126 unchanged - 5 
fixed = 141 total (was 131) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
50s{color} | {color:red} ql generated 1 new + 99 unchanged - 1 fixed = 100 
total (was 100) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / d9924ab |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7990/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7990/yetus/whitespace-eol.txt 
|
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7990/yetus/diff-javadoc-javadoc-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7990/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Partitioned tables statistics can go wrong in basic stats mixed case
> 
>
> Key: HIVE-18140
> URL: https://issues.apache.org/jira/browse/HIVE-18140
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18140.01wip01.patch
>
>
> Suppose the following scenario:
> * part1 has basic stats {{RC=10,DS=1K}}
> * all other partitions have no basic stats (and a bunch of rows)
> then 
> [this|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L378]
>  condition would be false, which in turn produces estimates for the whole 
> partitioned table of {{RC=10,DS=1K}}





[jira] [Commented] (HIVE-13567) Enable auto-gather column stats by default

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264739#comment-16264739
 ] 

Hive QA commented on HIVE-13567:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12899036/HIVE-13567.23wip07.patch

{color:green}SUCCESS:{color} +1 due to 44 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 150 failed/errored test(s), 11411 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_stats] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23]
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23]
 (batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col]
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition_authorization]
 (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_6] 
(batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_explain] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_2] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_3]
 (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join11] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_comments] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_date] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_partitioned] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ba_table3] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark2] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin10] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin11] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin12] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin13] 
(batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin8] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin9] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_1]
 (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compute_stats_decimal] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constGby] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_view] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_or_replace_view] 
(batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_with_constraints] 
(batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_mat_5] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_drop] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_whole_partition] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[display_colstats_tbllvl] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic1] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic3] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_topn] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[equal_ns] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_04_evolved_parts] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[extract] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[filter_cond_pushdown] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby2_limit] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby4_noskew] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_noskew_multi_single_reducer]
 (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_sets_grouping]
 (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[hook_order] 

[jira] [Resolved] (HIVE-18137) Schema evolution: newly inserted column value in pre-existing partition is masked to null

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-18137.
-
Resolution: Invalid

Thank you [~ashutoshc] for the clarification; now I understand how this feature 
works :)
In this case I think there is no problem with it, so I am closing this as invalid.

> Schema evolution: newly inserted column value in pre-existing partition is 
> masked to null
> -
>
> Key: HIVE-18137
> URL: https://issues.apache.org/jira/browse/HIVE-18137
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> {code}
> set hive.explain.user=false;
> set hive.fetch.task.conversion=none;
> set hive.mapred.mode=nonstrict;
> set hive.cli.print.header=true;
> SET hive.exec.schema.evolution=true;
> SET hive.vectorized.use.vectorized.input.format=true;
> SET hive.vectorized.use.vector.serde.deserialize=false;
> SET hive.vectorized.use.row.serde.deserialize=false;
> SET hive.vectorized.execution.enabled=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.metastore.disallow.incompatible.col.type.changes=true;
> set hive.default.fileformat=textfile;
> set hive.llap.io.enabled=false;
> CREATE TABLE part_add_int_permute_select(insert_num int, a INT, b STRING) 
> PARTITIONED BY(part INT);
> insert into table part_add_int_permute_select partition(part=1) VALUES (1, 
> , 'new');
> alter table part_add_int_permute_select add columns(c int);
> insert into table part_add_int_permute_select partition(part=1) VALUES (2, 
> , 'new', );
> select insert_num,part,a,b,c from part_add_int_permute_select;
> {code}
> results for the last select:
> {code}
> 1  1   new NULL
> 2  1   new NULL
> {code}
> I think the following result should be expected:
> {code}
> 1  1   new NULL
> 2  1   new 
> {code}





[jira] [Updated] (HIVE-18141) Fix StatsUtils.combineRange to combine intervals

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18141:

Attachment: HIVE-18141.02.patch

After submitting the patch I kept thinking about it and suspected that 
something was still wrong, so I added more test cases and found a different 
issue; patch #2 should be better :) 

> Fix StatsUtils.combineRange to combine intervals
> 
>
> Key: HIVE-18141
> URL: https://issues.apache.org/jira/browse/HIVE-18141
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18141.01.patch, HIVE-18141.02.patch
>
>
> The current [combineRange 
> implementation|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1984]
>  "combines" only ranges where one contains the other,
> but the comments suggest that the intention was to handle the case where the 
> 2 intervals overlap. This can be checked with the following test case:
> {code}
>   @Test
>   public void test11() {
>     Range r1 = new Range(0, 1);
>     Range r2 = new Range(1, 11);
>     Range r3 = StatsUtils.combineRange(r1, r2);
>     assertNotNull(r3);
>   }
> {code}
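
As a rough illustration of what combining overlapping intervals could look like, here is a minimal sketch. It is not the code in the attached patches: the `Range` class below is a simplified stand-in that uses `long` bounds, and returning `null` for disjoint ranges is an assumption made for this example.

```java
/** Hypothetical sketch: merge two closed intervals when they overlap or touch. */
final class RangeCombineSketch {
  static final class Range {
    final long min;
    final long max;

    Range(long min, long max) {
      this.min = min;
      this.max = max;
    }
  }

  /** Returns the smallest range covering both inputs, or null when they are disjoint. */
  static Range combine(Range a, Range b) {
    // Disjoint: one range ends strictly before the other begins.
    if (a.max < b.min || b.max < a.min) {
      return null;
    }
    // Overlap, touching endpoints and containment are all handled the same way.
    return new Range(Math.min(a.min, b.min), Math.max(a.max, b.max));
  }
}
```

With this sketch, combining `[0, 1]` and `[1, 11]` gives `[0, 11]` rather than `null`, which is the behavior the test case in the description checks for.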





[jira] [Commented] (HIVE-18137) Schema evolution: newly inserted column value in pre-existing partition is masked to null

2017-11-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264656#comment-16264656
 ] 

Ashutosh Chauhan commented on HIVE-18137:
-

The following rules are supposed to be followed for schema evolution. 
* Partitions get the current table schema as their schema when they are created.
* There is no way to alter a partition's schema.
* The exception is {{cascade}}, which is supposed to alter the schema of all 
partitions so that they get the same schema as the current table schema.
* At query time, data is read per the table schema. Partitions are read with 
their own schema and then coerced into the table schema.

Keeping the above in mind, in your example the partition schema is not altered, 
so the partition is read per its old schema. Even if you insert new columns, 
the partition schema doesn't know about them, so they are ignored while reading 
the partition. But since the table schema contains the new column, we add NULL 
for it after the partition has been read, while coercing the row to match the 
table schema. So the current behavior is considered correct.
On the other hand, if you had altered the table schema using {{cascade}}, the 
existing partition schema would also be updated and the partition would be read 
per this new schema, so the new column would be read and the result set would 
match your second result set, with one row with null and the other with .

This is how it is *supposed* to work, but since we have different code paths for 
self-describing file formats like orc vs. others like text, different behavior 
in some corner cases would be considered a bug.
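
The read-then-coerce behavior described above can be sketched as follows. This is only an illustration, not Hive's reader code; `readRow` and its parameters are hypothetical, columns are matched by name for simplicity, and the field values used in any call are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: read a row with the partition schema, then coerce it to the table schema. */
final class PartitionSchemaCoercionSketch {
  static List<String> readRow(
      String[] rawFields, List<String> partitionCols, List<String> tableCols) {
    List<String> result = new ArrayList<>();
    for (String col : tableCols) {
      int idx = partitionCols.indexOf(col);
      // A column unknown to the partition schema is ignored in the stored data
      // and surfaces as NULL once the row is coerced to the table schema.
      result.add(idx >= 0 && idx < rawFields.length ? rawFields[idx] : null);
    }
    return result;
  }
}
```

If the partition schema is `(insert_num, a, b)` and the table schema is `(insert_num, a, b, c)`, a stored value for `c` is dropped and replaced by NULL. Once the partition schema also lists `c`, as it would after a {{cascade}} alter, the stored value is returned.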

> Schema evolution: newly inserted column value in pre-existing partition is 
> masked to null
> -
>
> Key: HIVE-18137
> URL: https://issues.apache.org/jira/browse/HIVE-18137
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> {code}
> set hive.explain.user=false;
> set hive.fetch.task.conversion=none;
> set hive.mapred.mode=nonstrict;
> set hive.cli.print.header=true;
> SET hive.exec.schema.evolution=true;
> SET hive.vectorized.use.vectorized.input.format=true;
> SET hive.vectorized.use.vector.serde.deserialize=false;
> SET hive.vectorized.use.row.serde.deserialize=false;
> SET hive.vectorized.execution.enabled=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.metastore.disallow.incompatible.col.type.changes=true;
> set hive.default.fileformat=textfile;
> set hive.llap.io.enabled=false;
> CREATE TABLE part_add_int_permute_select(insert_num int, a INT, b STRING) 
> PARTITIONED BY(part INT);
> insert into table part_add_int_permute_select partition(part=1) VALUES (1, 
> , 'new');
> alter table part_add_int_permute_select add columns(c int);
> insert into table part_add_int_permute_select partition(part=1) VALUES (2, 
> , 'new', );
> select insert_num,part,a,b,c from part_add_int_permute_select;
> {code}
> results for the last select:
> {code}
> 1  1   new NULL
> 2  1   new NULL
> {code}
> I think the following result should be expected:
> {code}
> 1  1   new NULL
> 2  1   new 
> {code}





[jira] [Commented] (HIVE-18108) in case basic stats are missing; rowcount estimation depends on the selected columns size

2017-11-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264640#comment-16264640
 ] 

Ashutosh Chauhan commented on HIVE-18108:
-

+1

> in case basic stats are missing; rowcount estimation depends on the selected 
> columns size
> -
>
> Key: HIVE-18108
> URL: https://issues.apache.org/jira/browse/HIVE-18108
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18108.01.patch
>
>
> In case basic stats are not available (especially rowcount):
> {code}
> set hive.stats.autogather=false;
> create table t (a integer, b string);
> insert into t values (1,'asd1');
> insert into t values (2,'asd2');
> insert into t values (3,'asd3');
> insert into t values (4,'asd4');
> insert into t values (5,'asd5');
> explain select a,count(1) from t group by a;
> -- estimated to read 8 rows from table t
> explain select b,count(1) from t group by b;
> -- estimated: 1 rows
> explain select a,b,count(1) from t group by a,b;
> -- estimated: 1 rows
> {code}
> The estimate should not depend on the set of columns actually selected.
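
One way such a dependency can arise is when the row count is derived as data size divided by an average row size, and that average is computed from the selected columns only. The sketch below illustrates this under stated assumptions. It is not Hive's estimator; `avgRowSize`, `estimateRows`, the 4-byte int and 100-byte string sizes and the 35-byte table size are illustrative values chosen for this example.

```java
import java.util.List;

/** Hypothetical sketch: size-based row-count estimation when no row count is stored. */
final class RowCountEstimateSketch {
  /** Assumed per-type sizes: 4 bytes for an int, 100 bytes for anything else. */
  static long avgRowSize(List<String> columnTypes) {
    long size = 0;
    for (String type : columnTypes) {
      size += "int".equals(type) ? 4 : 100;
    }
    return Math.max(1, size);
  }

  /** The estimate changes with whichever column list is used to size a row. */
  static long estimateRows(long dataSizeBytes, List<String> typesUsedForRowSize) {
    return Math.max(1, dataSizeBytes / avgRowSize(typesUsedForRowSize));
  }
}
```

Sizing the row from the full table schema in every case would give one estimate per table, independent of the query's projection.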





[jira] [Commented] (HIVE-18141) Fix StatsUtils.combineRange to combine intervals

2017-11-23 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264614#comment-16264614
 ] 

Ashutosh Chauhan commented on HIVE-18141:
-

+1

> Fix StatsUtils.combineRange to combine intervals
> 
>
> Key: HIVE-18141
> URL: https://issues.apache.org/jira/browse/HIVE-18141
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18141.01.patch
>
>
> The current [combineRange 
> implementation|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1984]
>  "combines" only ranges where one contains the other,
> but the comments suggest that the intention was to handle the case where the 
> 2 intervals overlap. This can be checked with the following test case:
> {code}
>   @Test
>   public void test11() {
>     Range r1 = new Range(0, 1);
>     Range r2 = new Range(1, 11);
>     Range r3 = StatsUtils.combineRange(r1, r2);
>     assertNotNull(r3);
>   }
> {code}





[jira] [Updated] (HIVE-18141) Fix StatsUtils.combineRange to combine intervals

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18141:

Status: Patch Available  (was: Open)

> Fix StatsUtils.combineRange to combine intervals
> 
>
> Key: HIVE-18141
> URL: https://issues.apache.org/jira/browse/HIVE-18141
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18141.01.patch
>
>
> The current [combineRange 
> implementation|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1984]
>  "combines" only ranges where one contains the other,
> but the comments suggest that the intention was to handle the case where the 
> 2 intervals overlap. This can be checked with the following test case:
> {code}
>   @Test
>   public void test11() {
>     Range r1 = new Range(0, 1);
>     Range r2 = new Range(1, 11);
>     Range r3 = StatsUtils.combineRange(r1, r2);
>     assertNotNull(r3);
>   }
> {code}





[jira] [Updated] (HIVE-18141) Fix StatsUtils.combineRange to combine intervals

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18141:

Attachment: HIVE-18141.01.patch

#1)

* fix method
* add test

Not sure whether any qtests will react to this.

> Fix StatsUtils.combineRange to combine intervals
> 
>
> Key: HIVE-18141
> URL: https://issues.apache.org/jira/browse/HIVE-18141
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18141.01.patch
>
>
> The current [combineRange 
> implementation|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1984]
>  "combines" only ranges where one contains the other,
> but the comments suggest that the intention was to handle the case where the 
> 2 intervals overlap. This can be checked with the following test case:
> {code}
>   @Test
>   public void test11() {
>     Range r1 = new Range(0, 1);
>     Range r2 = new Range(1, 11);
>     Range r3 = StatsUtils.combineRange(r1, r2);
>     assertNotNull(r3);
>   }
> {code}





[jira] [Assigned] (HIVE-18141) Fix StatsUtils.combineRange to combine intervals

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-18141:
---


> Fix StatsUtils.combineRange to combine intervals
> 
>
> Key: HIVE-18141
> URL: https://issues.apache.org/jira/browse/HIVE-18141
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>
> The current [combineRange 
> implementation|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1984]
>  "combines" only ranges where one contains the other,
> but the comments suggest that the intention was to handle the case where the 
> 2 intervals overlap. This can be checked with the following test case:
> {code}
>   @Test
>   public void test11() {
>     Range r1 = new Range(0, 1);
>     Range r2 = new Range(1, 11);
>     Range r3 = StatsUtils.combineRange(r1, r2);
>     assertNotNull(r3);
>   }
> {code}





[jira] [Updated] (HIVE-18100) Some tests time out

2017-11-23 Thread LOKASHIS RANA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LOKASHIS RANA updated HIVE-18100:
-
Issue Type: Bug  (was: Test)

> Some tests time out
> ---
>
> Key: HIVE-18100
> URL: https://issues.apache.org/jira/browse/HIVE-18100
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 3.0.0
>
> Attachments: HIVE-18100.1.patch, HIVE-18100.2.patch, 
> HIVE-18100.3.patch, HIVE-18100.patch
>
>
> Some tests had 100s of queries in a single query which times out resulting in 
> Hive QA failures.





[jira] [Updated] (HIVE-18100) Some tests time out

2017-11-23 Thread LOKASHIS RANA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LOKASHIS RANA updated HIVE-18100:
-
Issue Type: Test  (was: Bug)

> Some tests time out
> ---
>
> Key: HIVE-18100
> URL: https://issues.apache.org/jira/browse/HIVE-18100
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 3.0.0
>
> Attachments: HIVE-18100.1.patch, HIVE-18100.2.patch, 
> HIVE-18100.3.patch, HIVE-18100.patch
>
>
> Some tests had 100s of queries in a single query which times out resulting in 
> Hive QA failures.





[jira] [Commented] (HIVE-13567) Enable auto-gather column stats by default

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264474#comment-16264474
 ] 

Hive QA commented on HIVE-13567:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 50s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 20s{color} | {color:red} standalone-metastore: The patch generated 3 new + 582 unchanged - 2 fixed = 585 total (was 584) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 44s{color} | {color:red} root: The patch generated 3 new + 1531 unchanged - 2 fixed = 1534 total (was 1533) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  5s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m  2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / d9924ab |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-7989/yetus/diff-checkstyle-standalone-metastore.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-7989/yetus/diff-checkstyle-root.txt |
| modules | C: common standalone-metastore ql accumulo-handler contrib hbase-handler . itests/hive-blobstore U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-7989/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Enable auto-gather column stats by default
> --
>
> Key: HIVE-13567
> URL: https://issues.apache.org/jira/browse/HIVE-13567
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13567.01.patch, HIVE-13567.02.patch, 
> HIVE-13567.03.patch, HIVE-13567.04.patch, HIVE-13567.05.patch, 
> HIVE-13567.06.patch, HIVE-13567.07.patch, HIVE-13567.08.patch, 
> HIVE-13567.09.patch, HIVE-13567.10.patch, HIVE-13567.11.patch, 
> HIVE-13567.12.patch, HIVE-13567.13.patch, HIVE-13567.14.patch, 
> HIVE-13567.15.patch, HIVE-13567.16.patch, HIVE-13567.17.patch, 
> HIVE-13567.18.patch, HIVE-13567.19.patch, HIVE-13567.20.patch, 
> HIVE-13567.21.patch, HIVE-13567.22.patch, HIVE-13567.23wip01.patch, 
> 

[jira] [Updated] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17994:

Status: Patch Available  (was: In Progress)

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, HIVE-17994.04.patch, vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!
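
The idea of resolving a column's type once instead of once per value can be sketched as follows. This is a simplified illustration, not the vectorized serializer from the patch: `CategoryLookupSketch`, `Category`, `CATEGORY_BY_TYPE_NAME`, `resolveCategories` and `serializeBatch` are all hypothetical names, and the shared map only stands in for whatever lookup sits behind TypeInfo::getPrimitiveCategory.

```java
import java.util.HashMap;
import java.util.Map;

final class CategoryLookupSketch {
  enum Category { INT, STRING }

  // Hypothetical shared lookup table standing in for a static type-to-category map.
  private static final Map<String, Category> CATEGORY_BY_TYPE_NAME = new HashMap<>();
  static {
    CATEGORY_BY_TYPE_NAME.put("int", Category.INT);
    CATEGORY_BY_TYPE_NAME.put("string", Category.STRING);
  }

  // Resolves each column's category once, before any rows are processed.
  static Category[] resolveCategories(String[] columnTypeNames) {
    Category[] categories = new Category[columnTypeNames.length];
    for (int c = 0; c < columnTypeNames.length; c++) {
      Category category = CATEGORY_BY_TYPE_NAME.get(columnTypeNames[c]);
      if (category == null) {
        throw new IllegalArgumentException("unknown type: " + columnTypeNames[c]);
      }
      categories[c] = category;
    }
    return categories;
  }

  // Serializes rows as comma-separated lines using the pre-resolved categories;
  // the per-row, per-column loop reads a local array and performs no map lookups.
  static String serializeBatch(Object[][] rows, Category[] categories) {
    StringBuilder out = new StringBuilder();
    for (Object[] row : rows) {
      for (int c = 0; c < categories.length; c++) {
        switch (categories[c]) {
          case INT:
            out.append(((Integer) row[c]).intValue());
            break;
          case STRING:
            out.append((String) row[c]);
            break;
        }
        out.append(c + 1 < categories.length ? ',' : '\n');
      }
    }
    return out.toString();
  }
}
```

Because the column types cannot change within a vectorized pipeline, the array returned by `resolveCategories` can be computed once and reused for every batch.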





[jira] [Updated] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17994:

Attachment: HIVE-17994.04.patch

4th time is a charm.

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, HIVE-17994.04.patch, vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Updated] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17994:

Status: In Progress  (was: Patch Available)

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Updated] (HIVE-17954) Implement pool, user, group and trigger to pool management API's.

2017-11-23 Thread Harish Jaiprakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Jaiprakash updated HIVE-17954:
-
Attachment: HIVE-17954.09.patch

Adding more tests.

> Implement pool, user, group and trigger to pool management API's.
> -
>
> Key: HIVE-17954
> URL: https://issues.apache.org/jira/browse/HIVE-17954
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17954.01.patch, HIVE-17954.02.patch, 
> HIVE-17954.03.patch, HIVE-17954.04.patch, HIVE-17954.05.patch, 
> HIVE-17954.06.patch, HIVE-17954.07.patch, HIVE-17954.08.patch, 
> HIVE-17954.09.patch
>
>
> Implement the following commands:
> -- Pool management.
> CREATE POOL `resource_plan`.`pool_path` WITH
>   ALLOC_FRACTION `fraction`
>   QUERY_PARALLELISM `parallelism`
>   SCHEDULING_POLICY `policy`;
> ALTER POOL `resource_plan`.`pool_path` SET
>   PATH = `new_path`,
>   ALLOC_FRACTION = `fraction`,
>   QUERY_PARALLELISM = `parallelism`,
>   SCHEDULING_POLICY = `policy`;
> DROP POOL `resource_plan`.`pool_path`;
> -- Trigger to pool mappings.
> ALTER RESOURCE PLAN `resource_plan`
>   ADD TRIGGER `trigger_name` TO `pool_path`;
> ALTER RESOURCE PLAN `resource_plan`
>   DROP TRIGGER `trigger_name` TO `pool_path`;
> -- User/Group to pool mappings.
> CREATE USER|GROUP MAPPING `resource_plan`.`group_or_user_name`
>   TO `pool_path` WITH ORDERING `order_no`;
> DROP USER|GROUP MAPPING `resource_plan`.`group_or_user_name`;





[jira] [Commented] (HIVE-18138) Fix columnstats problem in case schema evolution

2017-11-23 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264403#comment-16264403
 ] 

Zoltan Haindrich commented on HIVE-18138:
-

Reattaching the patch; all tests passed locally, so I'm not sure why they failed here.

> Fix columnstats problem in case schema evolution
> 
>
> Key: HIVE-18138
> URL: https://issues.apache.org/jira/browse/HIVE-18138
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18138.01.patch, HIVE-18138.01.patch
>
>
> Column stats are kept when the main table schema is altered, and this 
> causes all kinds of problems.





[jira] [Updated] (HIVE-18138) Fix columnstats problem in case schema evolution

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18138:

Attachment: HIVE-18138.01.patch

> Fix columnstats problem in case schema evolution
> 
>
> Key: HIVE-18138
> URL: https://issues.apache.org/jira/browse/HIVE-18138
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18138.01.patch, HIVE-18138.01.patch
>
>
> Column stats are kept when the main table schema is altered, and this 
> causes all kinds of problems.





[jira] [Updated] (HIVE-18108) in case basic stats are missing; rowcount estimation depends on the selected columns size

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18108:

Attachment: HIVE-18108.01.patch

#1)

* fix the issue
* known side effect: estimations turn the basic stats to complete... 
* run the stats-related qtests...results look...update some q.out-s; possibly there 
will be others

> in case basic stats are missing; rowcount estimation depends on the selected 
> columns size
> -
>
> Key: HIVE-18108
> URL: https://issues.apache.org/jira/browse/HIVE-18108
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18108.01.patch
>
>
> in case basic stats are not available (especially rowcount):
> {code}
> set hive.stats.autogather=false;
> create table t (a integer, b string);
> insert into t values (1,'asd1');
> insert into t values (2,'asd2');
> insert into t values (3,'asd3');
> insert into t values (4,'asd4');
> insert into t values (5,'asd5');
> explain select a,count(1) from t group by a;
> -- estimated to read 8 rows from table t
> explain select b,count(1) from t group by b;
> -- estimated: 1 rows
> explain select a,b,count(1) from t group by a,b;
> -- estimated: 1 rows
> {code}
> The estimate should not depend on the set of columns actually selected.
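
One way an estimate can end up depending on the projection is sketched below. This is an illustration under stated assumptions, not Hive's estimator: `RowCountEstimateSketch`, `estimateRows` and `width` are hypothetical names, and the column widths and data size are made-up numbers that are not meant to reproduce the estimates quoted above.

```java
final class RowCountEstimateSketch {
  // Estimates the row count as dataSize / rowWidth, never returning less than 1.
  static long estimateRows(long dataSizeBytes, long rowWidthBytes) {
    if (rowWidthBytes <= 0) {
      throw new IllegalArgumentException("row width must be positive");
    }
    return Math.max(1L, dataSizeBytes / rowWidthBytes);
  }

  // Sums the assumed per-column widths for the given column indexes.
  static long width(long[] columnWidths, int... columns) {
    long total = 0;
    for (int c : columns) {
      total += columnWidths[c];
    }
    return total;
  }
}
```

Assuming a 4-byte column `a`, a 100-byte column `b` and 1040 bytes of data, dividing by the width of only the selected columns gives 260 rows when selecting `a` but 10 rows when selecting `b`. Dividing by the full row width (104 bytes) gives 10 rows regardless of which columns the query selects.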





[jira] [Updated] (HIVE-18108) in case basic stats are missing; rowcount estimation depends on the selected columns size

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18108:

Status: Patch Available  (was: Open)

> in case basic stats are missing; rowcount estimation depends on the selected 
> columns size
> -
>
> Key: HIVE-18108
> URL: https://issues.apache.org/jira/browse/HIVE-18108
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18108.01.patch
>
>
> in case basic stats are not available (especially rowcount):
> {code}
> set hive.stats.autogather=false;
> create table t (a integer, b string);
> insert into t values (1,'asd1');
> insert into t values (2,'asd2');
> insert into t values (3,'asd3');
> insert into t values (4,'asd4');
> insert into t values (5,'asd5');
> explain select a,count(1) from t group by a;
> -- estimated to read 8 rows from table t
> explain select b,count(1) from t group by b;
> -- estimated: 1 rows
> explain select a,b,count(1) from t group by a,b;
> -- estimated: 1 rows
> {code}
> The estimate should not depend on the set of columns actually selected.





[jira] [Assigned] (HIVE-18108) in case basic stats are missing; rowcount estimation depends on the selected columns size

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-18108:
---

Assignee: Zoltan Haindrich

> in case basic stats are missing; rowcount estimation depends on the selected 
> columns size
> -
>
> Key: HIVE-18108
> URL: https://issues.apache.org/jira/browse/HIVE-18108
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>
> in case basic stats are not available (especially rowcount):
> {code}
> set hive.stats.autogather=false;
> create table t (a integer, b string);
> insert into t values (1,'asd1');
> insert into t values (2,'asd2');
> insert into t values (3,'asd3');
> insert into t values (4,'asd4');
> insert into t values (5,'asd5');
> explain select a,count(1) from t group by a;
> -- estimated to read 8 rows from table t
> explain select b,count(1) from t group by b;
> -- estimated: 1 rows
> explain select a,b,count(1) from t group by a,b;
> -- estimated: 1 rows
> {code}
> The estimate should not depend on the set of columns actually selected.





[jira] [Commented] (HIVE-18138) Fix columnstats problem in case schema evolution

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264297#comment-16264297
 ] 

Hive QA commented on HIVE-18138:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12899033/HIVE-18138.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 133 failed/errored test(s), 11411 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_stats] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23]
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23]
 (batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col]
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition_authorization]
 (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_6] 
(batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_explain] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_2] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_3]
 (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join11] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_comments] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_date] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_partitioned] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ba_table3] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark2] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin10] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin11] 
(batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin12] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin13] 
(batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin8] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin9] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_1]
 (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compute_stats_decimal] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constGby] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_view] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_or_replace_view] 
(batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_with_constraints] 
(batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_mat_5] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_drop] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_whole_partition] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[display_colstats_tbllvl] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic1] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic3] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_topn] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[equal_ns] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_04_evolved_parts] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[extract] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[filter_cond_pushdown] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby2_limit] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby4_noskew] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_noskew_multi_single_reducer]
 (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_sets_grouping]
 (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[hook_order] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[implicit_cast1] 
(batchId=59)

[jira] [Updated] (HIVE-18140) Partitioned tables statistics can go wrong in basic stats mixed case

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18140:

Assignee: Zoltan Haindrich
  Status: Patch Available  (was: Open)

> Partitioned tables statistics can go wrong in basic stats mixed case
> 
>
> Key: HIVE-18140
> URL: https://issues.apache.org/jira/browse/HIVE-18140
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18140.01wip01.patch
>
>
> suppose the following scenario:
> * part1 has basic stats {{RC=10,DS=1K}}
> * all other partitions have no basic stats (and a bunch of rows)
> then 
> [this|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L378]
>  condition would be false; which in turn produces estimations for the whole 
> partitioned table: {{RC=10,DS=1K}}
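
The scenario can be illustrated with a small sketch. It is not the code behind the linked condition: `PartitionStatsSketch`, `UNKNOWN`, `sumKnownOnly` and `sumOrUnknown` are hypothetical names, and using -1 to mark a missing row count is an assumption made only for this example.

```java
final class PartitionStatsSketch {
  // Marker for a partition whose row count is not recorded (an assumption of this sketch).
  static final long UNKNOWN = -1L;

  // Sums the recorded row counts and silently skips partitions without stats,
  // so the result undercounts the table when any partition is missing stats.
  static long sumKnownOnly(long[] partitionRowCounts) {
    long total = 0;
    for (long rc : partitionRowCounts) {
      if (rc != UNKNOWN) {
        total += rc;
      }
    }
    return total;
  }

  // Returns UNKNOWN as soon as any partition lacks a row count, so the caller
  // can fall back to estimating instead of trusting a partial sum.
  static long sumOrUnknown(long[] partitionRowCounts) {
    long total = 0;
    for (long rc : partitionRowCounts) {
      if (rc == UNKNOWN) {
        return UNKNOWN;
      }
      total += rc;
    }
    return total;
  }
}
```

For one partition with a row count of 10 and two partitions without stats, `sumKnownOnly` reports 10 for the whole table, which matches the misleading RC=10 described above, while `sumOrUnknown` signals that the total is not known.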





[jira] [Updated] (HIVE-18140) Partitioned tables statistics can go wrong in basic stats mixed case

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18140:

Attachment: HIVE-18140.01wip01.patch

#1)

* refactor {{BasicStats}} to make it easier to compute statistics
* it's a little rough around the edges :)

> Partitioned tables statistics can go wrong in basic stats mixed case
> 
>
> Key: HIVE-18140
> URL: https://issues.apache.org/jira/browse/HIVE-18140
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
> Attachments: HIVE-18140.01wip01.patch
>
>
> suppose the following scenario:
> * part1 has basic stats {{RC=10,DS=1K}}
> * all other partitions have no basic stats (and a bunch of rows)
> then 
> [this|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L378]
>  condition would be false; which in turn produces estimations for the whole 
> partitioned table: {{RC=10,DS=1K}}





[jira] [Commented] (HIVE-18140) Partitioned tables statistics can go wrong in basic stats mixed case

2017-11-23 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264276#comment-16264276
 ] 

Zoltan Haindrich commented on HIVE-18140:
-

HIVE-18108 also contributes to messing this up. I'll attach the refactored 
fix here, which is designed not to break many cases. It ended up fixing this 
issue as a natural consequence of the refactor. I started it as HIVE-18015, but it 
looks like it will be safer to separate the refactor from fixing all the issues.

> Partitioned tables statistics can go wrong in basic stats mixed case
> 
>
> Key: HIVE-18140
> URL: https://issues.apache.org/jira/browse/HIVE-18140
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>
> suppose the following scenario:
> * part1 has basic stats {{RC=10,DS=1K}}
> * all other partitions have no basic stats (and a bunch of rows)
> then 
> [this|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L378]
>  condition would be false; which in turn produces estimations for the whole 
> partitioned table: {{RC=10,DS=1K}}





[jira] [Commented] (HIVE-18138) Fix columnstats problem in case schema evolution

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264253#comment-16264253
 ] 

Hive QA commented on HIVE-18138:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 52s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 33s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 18s{color} | {color:red} standalone-metastore: The patch generated 3 new + 577 unchanged - 2 fixed = 580 total (was 579) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 10s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / d9924ab |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-7988/yetus/diff-checkstyle-standalone-metastore.txt |
| modules | C: standalone-metastore U: standalone-metastore |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-7988/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix columnstats problem in case schema evolution
> 
>
> Key: HIVE-18138
> URL: https://issues.apache.org/jira/browse/HIVE-18138
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18138.01.patch
>
>
> Column stats are kept when the main table schema is altered, and this 
> causes all kinds of problems.





[jira] [Commented] (HIVE-18052) Run p-tests on mm tables

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264239#comment-16264239
 ] 

Hive QA commented on HIVE-18052:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12899030/HIVE-18052.5.patch

{color:green}SUCCESS:{color} +1 due to 90 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1307 failed/errored test(s), 9723 tests 
executed
*Failed tests:*
{noformat}
TestJdbcNonKrbSASLWithMiniKdc - did not produce a TEST-*.xml file (likely timed 
out) (batchId=245)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert]
 (batchId=236)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_10] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_11] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_12] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_16] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_1] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_2] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_3] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] 
(batchId=244)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[buckets] 
(batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[create_like] 
(batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[explain] 
(batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore_nonpart]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse_nonpart]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_empty_into_blobstore]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[join] 
(batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[orc_buckets] 
(batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[orc_format_part]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[orc_nonstd_partitions_loc]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_buckets] 
(batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_format_part]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[rcfile_nonstd_partitions_loc]
 (batchId=247)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore]
 (batchId=247)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStatsPart] 
(batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alterColumnStats] 
(batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_concatenate_indexed_table]
 (batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_file_format] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge] (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_2] 
(batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_2_orc] 
(batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_3] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_stats] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23]
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23]
 (batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col]
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_clusterby_sortby]
 (batchId=39)

[jira] [Commented] (HIVE-18052) Run p-tests on mm tables

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264233#comment-16264233
 ] 

Hive QA commented on HIVE-18052:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
18s{color} | {color:red} common: The patch generated 2 new + 934 unchanged - 1 
fixed = 936 total (was 935) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
46s{color} | {color:red} ql: The patch generated 41 new + 1755 unchanged - 7 
fixed = 1796 total (was 1762) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 9 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 6 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / d9924ab |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7987/yetus/diff-checkstyle-common.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7987/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7987/yetus/whitespace-eol.txt 
|
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7987/yetus/whitespace-tabs.txt
 |
| modules | C: common standalone-metastore metastore ql service hcatalog/core 
hcatalog/hcatalog-pig-adapter hcatalog/server-extensions 
hcatalog/webhcat/java-client hcatalog/streaming itests/hcatalog-unit 
itests/hive-minikdc itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7987/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
> Attachments: HIVE-18052.1.patch, HIVE-18052.2.patch, 
> HIVE-18052.3.patch, HIVE-18052.4.patch, HIVE-18052.5.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-18115) Fix schema version info for Hive-2.3.2

2017-11-23 Thread Oleksiy Sayankin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264178#comment-16264178
 ] 

Oleksiy Sayankin edited comment on HIVE-18115 at 11/23/17 11:17 AM:


{quote}
Oleksiy Sayankin, did you build the source code from release-2.3.2, or did you 
just download the binaries?
{quote}

I built the source code from release-2.3.2. This is my last 
[commit|https://github.com/apache/hive/commit/857a9fd8ad725a53bd95c1b2d6612f9b1155f44d].
It worked fine after I applied the patch, so I filed an issue here.


was (Author: osayankin):
{quote}
Oleksiy Sayankin, did you build the source code from release-2.3.2, or did you 
just download the binaries?
{quote}

I built the source code from release-2.3.2. This is my last 
[commit|https://github.com/apache/hive/commit/857a9fd8ad725a53bd95c1b2d6612f9b1155f44d].

> Fix schema version info for Hive-2.3.2
> --
>
> Key: HIVE-18115
> URL: https://issues.apache.org/jira/browse/HIVE-18115
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.1, 2.3.2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Fix For: 2.3.3
>
> Attachments: HIVE-18115.02-branch-2.patch, HIVE-18115.1.patch
>
>
> Error while starting HiveMeta
> {code}
> Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Hive Schema 
> version 2.3.2 does not match metastore's schema version 2.3.0 Metastore is 
> not upgraded or corrupt
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7600)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7563)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_141]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) 
> ~[hive-exec-2.3.2.jar:2.3.2]
> at com.sun.proxy.$Proxy23.verifySchema(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:591)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:584)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:651)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:427)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_141]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> {code}
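
The exception above reports that Hive schema version 2.3.2 does not match the metastore's schema version 2.3.0. As a minimal sketch only, the following contrasts a strict string comparison with a comparison on the major and minor components. The class and method names are hypothetical; this is not the check performed in ObjectStore, and nothing in this thread says the attached patches work this way.

```java
// Hypothetical sketch: these names are illustrative and are not Hive APIs.
final class SchemaVersionSketch {

    // Strict comparison: any difference, even in the patch digit, is a mismatch.
    static boolean strictMatch(String hiveVersion, String metastoreVersion) {
        return hiveVersion.equals(metastoreVersion);
    }

    // Relaxed comparison: only the major and minor components must agree.
    static boolean sameMajorMinor(String hiveVersion, String metastoreVersion) {
        String[] a = hiveVersion.split("\\.");
        String[] b = metastoreVersion.split("\\.");
        if (a.length < 2 || b.length < 2) {
            return false; // not enough components to compare
        }
        return a[0].equals(b[0]) && a[1].equals(b[1]);
    }

    public static void main(String[] args) {
        System.out.println(strictMatch("2.3.2", "2.3.0"));    // false
        System.out.println(sameMajorMinor("2.3.2", "2.3.0")); // true
    }
}
```

Under the strict rule the two versions in the error message differ; under the relaxed rule they would be treated as the same schema line.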





[jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264181#comment-16264181
 ] 

Hive QA commented on HIVE-17994:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12899027/HIVE-17994.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 11410 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[windowing_range_multiorder]
 (batchId=7)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=157)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression1 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression10 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression2 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression3 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression4 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression5 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression7 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression9 
(batchId=264)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=224)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=230)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=230)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=230)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testMultipleConditions_noTranslation
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testMultipleConditions_withTranslation
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testSimpleCondition_noTranslation
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testSimpleCondition_withDateType
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testSimpleCondition_withTranslation
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testSimpleCondition_withVariedCaseMappings
 (batchId=205)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7986/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7986/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7986/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 22 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12899027 - PreCommit-HIVE-Build

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!
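
The description above says the type lookup runs for every column of every row even though the type cannot change within a vectorized batch. A minimal sketch of the general remedy, hoisting the lookup out of the row loop, might look like the following. All names here are hypothetical stand-ins (including the map that plays the role of the shared static lookup), and the sketch does not claim to reflect what the attached patches do.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; none of these names come from Hive.
final class HoistedCategoryLookup {
    enum Category { INT, STRING }

    // Stand-in for a shared static map that every thread consults.
    static final Map<String, Category> CATEGORY_BY_TYPE = new HashMap<>();
    static {
        CATEGORY_BY_TYPE.put("int", Category.INT);
        CATEGORY_BY_TYPE.put("string", Category.STRING);
    }

    // Counts map lookups so the two loops can be compared.
    static long mapLookups = 0;

    static Category lookup(String typeName) {
        mapLookups++;
        return CATEGORY_BY_TYPE.get(typeName);
    }

    // Naive loop: one map lookup per column per row.
    static long serializeNaive(String[] columnTypes, int rowCount) {
        long bytesWritten = 0;
        for (int row = 0; row < rowCount; row++) {
            for (int col = 0; col < columnTypes.length; col++) {
                Category category = lookup(columnTypes[col]);
                bytesWritten += (category == Category.INT) ? 4 : 1; // placeholder work
            }
        }
        return bytesWritten;
    }

    // Hoisted loop: resolve each column's category once, then reuse it.
    static long serializeHoisted(String[] columnTypes, int rowCount) {
        Category[] categories = new Category[columnTypes.length];
        for (int col = 0; col < columnTypes.length; col++) {
            categories[col] = lookup(columnTypes[col]);
        }
        long bytesWritten = 0;
        for (int row = 0; row < rowCount; row++) {
            for (int col = 0; col < categories.length; col++) {
                bytesWritten += (categories[col] == Category.INT) ? 4 : 1; // placeholder work
            }
        }
        return bytesWritten;
    }
}
```

Both methods produce the same result; the hoisted version performs one lookup per column instead of one per column per row.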





[jira] [Comment Edited] (HIVE-18115) Fix schema version info for Hive-2.3.2

2017-11-23 Thread Oleksiy Sayankin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264178#comment-16264178
 ] 

Oleksiy Sayankin edited comment on HIVE-18115 at 11/23/17 11:06 AM:


{quote}
Oleksiy Sayankin, did you build the source code from release-2.3.2, or did you 
just download the binaries?
{quote}

I built the source code from release-2.3.2. This is my last 
[commit|https://github.com/apache/hive/commit/857a9fd8ad725a53bd95c1b2d6612f9b1155f44d].


was (Author: osayankin):
{quote}
Oleksiy Sayankin, did you build the source code from release-2.3.2, or did you 
just download the binaries?
{quote}

I built the source code from release-2.3.2.

> Fix schema version info for Hive-2.3.2
> --
>
> Key: HIVE-18115
> URL: https://issues.apache.org/jira/browse/HIVE-18115
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.1, 2.3.2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Fix For: 2.3.3
>
> Attachments: HIVE-18115.02-branch-2.patch, HIVE-18115.1.patch
>
>
> Error while starting HiveMeta
> {code}
> Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Hive Schema 
> version 2.3.2 does not match metastore's schema version 2.3.0 Metastore is 
> not upgraded or corrupt
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7600)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7563)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_141]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) 
> ~[hive-exec-2.3.2.jar:2.3.2]
> at com.sun.proxy.$Proxy23.verifySchema(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:591)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:584)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:651)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:427)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_141]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> {code}





[jira] [Commented] (HIVE-18115) Fix schema version info for Hive-2.3.2

2017-11-23 Thread Oleksiy Sayankin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264178#comment-16264178
 ] 

Oleksiy Sayankin commented on HIVE-18115:
-

{quote}
Oleksiy Sayankin, did you build the source code from release-2.3.2, or did you 
just download the binaries?
{quote}

I built the source code from release-2.3.2.

> Fix schema version info for Hive-2.3.2
> --
>
> Key: HIVE-18115
> URL: https://issues.apache.org/jira/browse/HIVE-18115
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.1, 2.3.2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Minor
> Fix For: 2.3.3
>
> Attachments: HIVE-18115.02-branch-2.patch, HIVE-18115.1.patch
>
>
> Error while starting HiveMeta
> {code}
> Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Hive Schema 
> version 2.3.2 does not match metastore's schema version 2.3.0 Metastore is 
> not upgraded or corrupt
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7600)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7563)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_141]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) 
> ~[hive-exec-2.3.2.jar:2.3.2]
> at com.sun.proxy.$Proxy23.verifySchema(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:591)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:584)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:651)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:427)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_141]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_141]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_141]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
>  ~[hive-exec-2.3.2.jar:2.3.2]
> {code}





[jira] [Updated] (HIVE-18139) spark may miss results in case column stats are gathered

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18139:

Attachment: missing_results.patch

See [^missing_results.patch], near the end, for the lost result sets.

> spark may miss results in case column stats are gathered
> 
>
> Key: HIVE-18139
> URL: https://issues.apache.org/jira/browse/HIVE-18139
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
> Attachments: missing_results.patch
>
>
> To reproduce, add {{set hive.stats.column.autogather=true;}} at the beginning of 
> {{ql/src/test/queries/clientpositive/auto_sortmerge_join_13.q}}.





[jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264120#comment-16264120
 ] 

Hive QA commented on HIVE-17994:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
11s{color} | {color:red} serde: The patch generated 1 new + 4 unchanged - 1 
fixed = 5 total (was 5) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / d9924ab |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7986/yetus/diff-checkstyle-serde.txt
 |
| modules | C: serde ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7986/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Updated] (HIVE-18108) in case basic stats are missing; rowcount estimation depends on the selected columns size

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18108:

Summary: in case basic stats are missing; rowcount estimation depends on 
the selected columns size  (was: in case basic stats are missing; rowcount 
estimation depends on the select columns size)

> in case basic stats are missing; rowcount estimation depends on the selected 
> columns size
> -
>
> Key: HIVE-18108
> URL: https://issues.apache.org/jira/browse/HIVE-18108
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>
> When basic stats are not available (especially rowcount):
> {code}
> set hive.stats.autogather=false;
> create table t (a integer, b string);
> insert into t values (1,'asd1');
> insert into t values (2,'asd2');
> insert into t values (3,'asd3');
> insert into t values (4,'asd4');
> insert into t values (5,'asd5');
> explain select a,count(1) from t group by a;
> -- estimated to read 8 rows from table t
> explain select b,count(1) from t group by b;
> -- estimated: 1 rows
> explain select a,b,count(1) from t group by a,b;
> -- estimated: 1 rows
> {code}
> The estimate should not depend on which columns are actually selected.
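
One plausible reading of the numbers in the report is that, without a stored row count, the estimate is derived by dividing the table's data size by the width of the selected columns. The sketch below illustrates that arithmetic only. The formula, the 35-byte data size, and the 4-byte and 100-byte column widths are assumptions chosen to reproduce the figures quoted above; they are not taken from Hive's code or from this thread.

```java
// Hypothetical sketch of a size-based fallback; not Hive's actual estimator.
final class RowCountEstimateSketch {

    // Assumed fallback: rows = dataSize / width of the selected columns, at least 1.
    static long estimateRows(long dataSizeBytes, long selectedRowWidthBytes) {
        return Math.max(1L, dataSizeBytes / selectedRowWidthBytes);
    }

    public static void main(String[] args) {
        long dataSize = 35;      // assumed on-disk size of table t
        long intWidth = 4;       // assumed width of column a
        long stringWidth = 100;  // assumed default width of column b

        System.out.println(estimateRows(dataSize, intWidth));               // 8
        System.out.println(estimateRows(dataSize, stringWidth));            // 1
        System.out.println(estimateRows(dataSize, intWidth + stringWidth)); // 1
    }
}
```

Under these assumptions the same table yields 8 rows or 1 row depending only on the projection, which is the inconsistency the report describes.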





[jira] [Updated] (HIVE-13567) Enable auto-gather column stats by default

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13567:

Attachment: HIVE-13567.23wip07.patch

> Enable auto-gather column stats by default
> --
>
> Key: HIVE-13567
> URL: https://issues.apache.org/jira/browse/HIVE-13567
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13567.01.patch, HIVE-13567.02.patch, 
> HIVE-13567.03.patch, HIVE-13567.04.patch, HIVE-13567.05.patch, 
> HIVE-13567.06.patch, HIVE-13567.07.patch, HIVE-13567.08.patch, 
> HIVE-13567.09.patch, HIVE-13567.10.patch, HIVE-13567.11.patch, 
> HIVE-13567.12.patch, HIVE-13567.13.patch, HIVE-13567.14.patch, 
> HIVE-13567.15.patch, HIVE-13567.16.patch, HIVE-13567.17.patch, 
> HIVE-13567.18.patch, HIVE-13567.19.patch, HIVE-13567.20.patch, 
> HIVE-13567.21.patch, HIVE-13567.22.patch, HIVE-13567.23wip01.patch, 
> HIVE-13567.23wip02.patch, HIVE-13567.23wip03.patch, HIVE-13567.23wip04.patch, 
> HIVE-13567.23wip05.patch, HIVE-13567.23wip06.patch, HIVE-13567.23wip07.patch
>
>
> In phase 2, we are going to turn on auto-gathering of column stats by default. 
> This requires updating the golden files.





[jira] [Updated] (HIVE-18138) Fix columnstats problem in case schema evolution

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18138:

Attachment: HIVE-18138.01.patch

#1)

* metastore drops all the columnstats it doesn't know about anymore
* fix an NPE...not sure what consequences it had :)
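
The first bullet above, dropping column stats for columns the metastore no longer knows about, can be sketched roughly as follows. This is an illustration under assumptions: the map of per-column stats and the method name are hypothetical, and the real patch operates on metastore objects that are not shown in this message.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the types and names are not taken from the patch.
final class ColumnStatsPruneSketch {

    // Removes stats whose column is absent from the current schema.
    // Returns the number of stale entries that were dropped.
    static int dropStaleStats(Map<String, Long> statsByColumn, List<String> currentColumns) {
        Set<String> known = new HashSet<>(currentColumns);
        int before = statsByColumn.size();
        // The key-set view supports removal, so this prunes the map in place.
        statsByColumn.keySet().retainAll(known);
        return before - statsByColumn.size();
    }
}
```

For example, with stats recorded for columns a, b and c and a schema that now contains a, c and d, only the entry for b is removed.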

> Fix columnstats problem in case schema evolution
> 
>
> Key: HIVE-18138
> URL: https://issues.apache.org/jira/browse/HIVE-18138
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18138.01.patch
>
>
> Column stats are kept when the main table schema is altered, and this 
> causes all kinds of problems.





[jira] [Updated] (HIVE-18138) Fix columnstats problem in case schema evolution

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18138:

Status: Patch Available  (was: Open)

> Fix columnstats problem in case schema evolution
> 
>
> Key: HIVE-18138
> URL: https://issues.apache.org/jira/browse/HIVE-18138
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18138.01.patch
>
>
> Column stats are kept when the main table schema is altered, and this 
> causes all kinds of problems.





[jira] [Commented] (HIVE-18043) Vectorization: Support List type in MapWork

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264087#comment-16264087
 ] 

Hive QA commented on HIVE-18043:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12899017/HIVE-18043.004.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11412 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=224)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=230)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=230)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=230)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent 
(batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7985/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7985/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7985/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12899017 - PreCommit-HIVE-Build

> Vectorization: Support List type in MapWork
> ---
>
> Key: HIVE-18043
> URL: https://issues.apache.org/jira/browse/HIVE-18043
> Project: Hive
>  Issue Type: Improvement
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-18043.001.patch, HIVE-18043.002.patch, 
> HIVE-18043.003.patch, HIVE-18043.004.patch
>
>
> Support for complex types in vectorization was finished in HIVE-16589, but the
> List type is still not supported in MapWork. It should be supported to improve
> performance when vectorization is enabled.





[jira] [Assigned] (HIVE-18138) Fix columnstats problem in case schema evolution

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-18138:
---


> Fix columnstats problem in case schema evolution
> 
>
> Key: HIVE-18138
> URL: https://issues.apache.org/jira/browse/HIVE-18138
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>
> Column stats are kept when the main table schema is altered, and this
> causes all kinds of problems.





[jira] [Updated] (HIVE-18136) WorkloadManagerMxBean is missing the Apache license header

2017-11-23 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18136:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

I've added an extra '*' in the header...looks like checkstyle is picky about
that...
Pushed to master. Thank you [~asherman] for taking care of this!
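
For readers unfamiliar with the license check, here is a minimal sketch of the kind of test such an audit performs. It is an assumption-laden simplification, not the Yetus or checkstyle implementation: `has_asf_header` and the `max_lines` cutoff are hypothetical, and the only fact relied on is the opening phrase of the standard ASF source header.

```python
# Opening phrase of the standard Apache source file header.
ASF_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def has_asf_header(source_text, max_lines=20):
    """Return True if the ASF marker appears near the top of the file.

    Hypothetical helper: real audit tools match the full header text
    and, as noted above, checkstyle may also care about comment style.
    """
    head = "\n".join(source_text.splitlines()[:max_lines])
    return ASF_MARKER in head


with_header = "/*\n * Licensed to the Apache Software Foundation (ASF) under one\n */\nclass A {}"
without_header = "package x;\n\nclass A {}"
print(has_asf_header(with_header), has_asf_header(without_header))
```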

> WorkloadManagerMxBean is missing the Apache license header
> --
>
> Key: HIVE-18136
> URL: https://issues.apache.org/jira/browse/HIVE-18136
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Fix For: 3.0.0
>
> Attachments: HIVE-18136.1.patch
>
>
> This causes warnings in the yetus check:
> {quote}Lines that start with ? in the ASF License  report indicate files 
> that do not have an Apache license header:
>  !? 
> /data/hiveptest/working/yetus/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManagerMxBean.java{quote}





[jira] [Commented] (HIVE-18136) WorkloadManagerMxBean is missing the Apache license header

2017-11-23 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264080#comment-16264080
 ] 

Zoltan Haindrich commented on HIVE-18136:
-

+1

> WorkloadManagerMxBean is missing the Apache license header
> --
>
> Key: HIVE-18136
> URL: https://issues.apache.org/jira/browse/HIVE-18136
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-18136.1.patch
>
>
> This causes warnings in the yetus check:
> {quote}Lines that start with ? in the ASF License  report indicate files 
> that do not have an Apache license header:
>  !? 
> /data/hiveptest/working/yetus/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManagerMxBean.java{quote}





[jira] [Commented] (HIVE-18043) Vectorization: Support List type in MapWork

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264065#comment-16264065
 ] 

Hive QA commented on HIVE-18043:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
12s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 7 new + 684 unchanged - 0 
fixed = 691 total (was 684) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
34s{color} | {color:red} root: The patch generated 7 new + 690 unchanged - 0 
fixed = 697 total (was 690) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
11s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / c5c2986 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7985/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7985/yetus/diff-checkstyle-root.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7985/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql . itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7985/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Support List type in MapWork
> ---
>
> Key: HIVE-18043
> URL: https://issues.apache.org/jira/browse/HIVE-18043
> Project: Hive
>  Issue Type: Improvement
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-18043.001.patch, HIVE-18043.002.patch, 
> HIVE-18043.003.patch, HIVE-18043.004.patch
>
>
> Support for complex types in vectorization was finished in HIVE-16589, but the
> List type is still not supported in MapWork. It should be supported to improve
> performance when vectorization is enabled.





[jira] [Commented] (HIVE-18137) Schema evolution: newly inserted column value in pre-existing partition is masked to null

2017-11-23 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264054#comment-16264054
 ] 

Zoltan Haindrich commented on HIVE-18137:
-

I've found a bug in my changes, and the results are now going back to the old
version.
I'm starting to convince myself that the old result is fine, since that column
doesn't exist in that particular partition...
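
The reasoning in this comment can be modeled with a short sketch. It assumes, as the comment suggests, that rows are read through the partition's own recorded column list and that table columns missing from that list come back as NULL. `read_partition` is a hypothetical helper, and the numeric values are placeholders (the values in the original report are elided).

```python
def read_partition(rows, partition_columns, table_columns):
    """Project stored rows onto the table schema.

    Columns the partition schema does not know about are returned
    as None, which models the NULL seen in the query result.
    """
    result = []
    for row in rows:
        # zip stops at the shorter input, so a trailing value that the
        # partition schema has no column for is dropped here.
        visible = dict(zip(partition_columns, row))
        result.append(tuple(visible.get(col) for col in table_columns))
    return result


partition_columns = ["insert_num", "a", "b"]    # recorded when part=1 was created
table_columns = ["insert_num", "a", "b", "c"]   # after ADD COLUMNS (c int)
rows = [(1, 10, "new"), (2, 20, "new", 30)]     # placeholder values

for row in read_partition(rows, partition_columns, table_columns):
    print(row)
```

Under this model both rows show `c` as NULL even though the second insert supplied a value, which matches the "old" result discussed here.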

> Schema evolution: newly inserted column value in pre-existing partition is 
> masked to null
> -
>
> Key: HIVE-18137
> URL: https://issues.apache.org/jira/browse/HIVE-18137
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> {code}
> set hive.explain.user=false;
> set hive.fetch.task.conversion=none;
> set hive.mapred.mode=nonstrict;
> set hive.cli.print.header=true;
> SET hive.exec.schema.evolution=true;
> SET hive.vectorized.use.vectorized.input.format=true;
> SET hive.vectorized.use.vector.serde.deserialize=false;
> SET hive.vectorized.use.row.serde.deserialize=false;
> SET hive.vectorized.execution.enabled=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.metastore.disallow.incompatible.col.type.changes=true;
> set hive.default.fileformat=textfile;
> set hive.llap.io.enabled=false;
> CREATE TABLE part_add_int_permute_select(insert_num int, a INT, b STRING) 
> PARTITIONED BY(part INT);
> insert into table part_add_int_permute_select partition(part=1) VALUES (1, 
> , 'new');
> alter table part_add_int_permute_select add columns(c int);
> insert into table part_add_int_permute_select partition(part=1) VALUES (2, 
> , 'new', );
> select insert_num,part,a,b,c from part_add_int_permute_select;
> {code}
> results for the last select:
> {code}
> 1  1   new NULL
> 2  1   new NULL
> {code}
> I think the following result should be expected:
> {code}
> 1  1   new NULL
> 2  1   new 
> {code}





[jira] [Updated] (HIVE-18052) Run p-tests on mm tables

2017-11-23 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-18052:
--
Attachment: HIVE-18052.5.patch

> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
> Attachments: HIVE-18052.1.patch, HIVE-18052.2.patch, 
> HIVE-18052.3.patch, HIVE-18052.4.patch, HIVE-18052.5.patch
>
>






[jira] [Updated] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17994:

Status: Patch Available  (was: In Progress)

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!
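
The optimization implied by the description is to resolve each column's category once, outside the per-row loop. The sketch below illustrates that idea only; it is not the Hive patch, and `serialize_rows`, `category_of`, and the type names are hypothetical stand-ins for the Java structures.

```python
def serialize_rows(rows, column_types, category_of):
    """Tag every value with its column category.

    The map lookup happens once per column, before the row loop,
    because the column type cannot change while a batch is processed.
    """
    categories = [category_of[t] for t in column_types]  # hoisted lookups
    out = []
    for row in rows:
        out.append([(cat, value) for cat, value in zip(categories, row)])
    return out


category_of = {"int": "INT", "string": "STRING"}  # illustrative mapping
rows = [(1, "x"), (2, "y")]
print(serialize_rows(rows, ["int", "string"], category_of))
```

With the lookups hoisted, the inner loop touches only a small local list rather than a shared map for every column of every row.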





[jira] [Updated] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17994:

Attachment: HIVE-17994.03.patch

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> HIVE-17994.03.patch, vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Updated] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17994:

Status: In Progress  (was: Patch Available)

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264026#comment-16264026
 ] 

Matt McCline commented on HIVE-17994:
-

Ok, with patch #3 -- trying out transients and no new non-transient members.

> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Matt McCline
>Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!





[jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264023#comment-16264023
 ] 

Hive QA commented on HIVE-17994:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12899009/HIVE-17994.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 43 failed/errored test(s), 11410 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=157)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.TestBigintSarg 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.TestBooleanSarg 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.TestCharSarg 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.TestDateSarg 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.TestDecimalSarg 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.TestDoubleSarg 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.TestFloatSarg 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.TestTimestampSarg 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.TestVarcharSarg 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression1 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression10 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression2 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression3 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression4 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression5 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression7 
(batchId=264)
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression9 
(batchId=264)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=224)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testMultiPartColsInData 
(batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=230)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=230)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=230)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers1 
(batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 
(batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsMultiInsert
 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles 
(batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent 
(batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 (batchId=231)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerTotalTasks 
(batchId=231)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testMultipleConditions_noTranslation
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testMultipleConditions_withTranslation
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testSimpleCondition_noTranslation
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testSimpleCondition_withDateType
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testSimpleCondition_withTranslation
 (batchId=205)
org.apache.hive.storage.jdbc.TestQueryConditionBuilder.testSimpleCondition_withVariedCaseMappings
 (batchId=205)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7984/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7984/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7984/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing 

[jira] [Commented] (HIVE-18137) Schema evolution: newly inserted column value in pre-existing partition is masked to null

2017-11-23 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263985#comment-16263985
 ] 

Zoltan Haindrich commented on HIVE-18137:
-

I've done a deep dive into the history, but it's pretty hard to uncover when this
changed...there are many renames...multiple different versions of the same
test (vec/nonvec;part/table;orc/text,x/y) - this is kind of like a matrix test...

> Schema evolution: newly inserted column value in pre-existing partition is 
> masked to null
> -
>
> Key: HIVE-18137
> URL: https://issues.apache.org/jira/browse/HIVE-18137
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> {code}
> set hive.explain.user=false;
> set hive.fetch.task.conversion=none;
> set hive.mapred.mode=nonstrict;
> set hive.cli.print.header=true;
> SET hive.exec.schema.evolution=true;
> SET hive.vectorized.use.vectorized.input.format=true;
> SET hive.vectorized.use.vector.serde.deserialize=false;
> SET hive.vectorized.use.row.serde.deserialize=false;
> SET hive.vectorized.execution.enabled=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.metastore.disallow.incompatible.col.type.changes=true;
> set hive.default.fileformat=textfile;
> set hive.llap.io.enabled=false;
> CREATE TABLE part_add_int_permute_select(insert_num int, a INT, b STRING) 
> PARTITIONED BY(part INT);
> insert into table part_add_int_permute_select partition(part=1) VALUES (1, 
> , 'new');
> alter table part_add_int_permute_select add columns(c int);
> insert into table part_add_int_permute_select partition(part=1) VALUES (2, 
> , 'new', );
> select insert_num,part,a,b,c from part_add_int_permute_select;
> {code}
> results for the last select:
> {code}
> 1  1   new NULL
> 2  1   new NULL
> {code}
> I think the following result should be expected:
> {code}
> 1  1   new NULL
> 2  1   new 
> {code}





[jira] [Issue Comment Deleted] (HIVE-18080) Performance degradation on VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled

2017-11-23 Thread liyunzhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang updated HIVE-18080:
--
Comment: was deleted

(was: 
It seems that after the If expression is vectorized with AVX2 instructions, it
accounts for only 2.8% of all instructions and consumes very little CPU time
(23 ms) when the warm-up iteration count is 5000 and the iteration count is 5.
So the variation of this test is too large. Later I will try to increase the
iterations of the JMH test to see whether the degradation exists.
{code}
 To specify different parameters, use:
 * - This command will use 10 warm-up iterations, 5 test iterations, and 2 
forks. And it will
 * display the Average Time (avgt) in Microseconds (us)
 * - Benchmark mode. Available modes are:
 * [Throughput/thrpt, AverageTime/avgt, SampleTime/sample, SingleShotTime/ss, 
All/all]
 * - Output time unit. Available time units are: [m, s, ms, us, ns].
 * 
 * $ java -jar target/benchmarks.jar 
org.apache.hive.benchmark.vectorization.VectorizedLogicBench
 * -wi 10 -i 5 -f 2 -bm avgt -tu us
 */
{code})

> Performance degradation on 
> VectorizedLogicBench#IfExprLongColumnLongColumnBench when AVX512 is enabled
> --
>
> Key: HIVE-18080
> URL: https://issues.apache.org/jira/browse/HIVE-18080
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
> Attachments: IFExpression_AVX2_Instruction.png, 
> log.logic.avx1.single.0, log_logic.avx1.part
>
>
> I used a Xeon(R) Platinum 8180 CPU to test the performance of
> [AVX512|https://en.wikipedia.org/wiki/AVX-512].
> {code}
> #cat /proc/cpuinfo |grep "model name"|head -n 1
> model name: Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
> {code}
> Before that I compiled Hive with JDK9, since JDK9 enables AVX512.
> I used the Hive microbenchmark (HIVE-10189) to evaluate the performance improvement.
> Performance seems to improve (20%+) for the cases in
> {{VectorizedArithmeticBench}}, {{VectorizedComparisonBench}}, {{VectorizedLikeBench}} and {{VectorizedLogicBench}},
> except for
> {{VectorizedLogicBench#IfExprLongColumnLongColumnBench}}, {{VectorizedLogicBench#IfExprRepeatingLongColumnLongColumnBench}}
> and
> {{VectorizedLogicBench#IfExprLongColumnRepeatingLongColumnBench}}.
> When I used a Skylake CPU to evaluate the performance improvement of AVX512,
> I found the following results for VectorizedLogicBench:
> || ||AVX2 us/op||AVX512 us/op ||  (AVX2-AVX512)/AVX2||
> |ColAndColBench|122510| 87014| 28.9%|
> |IfExprLongColumnLongColumnBench | 1325759| 1436073| -8.3% |
> |IfExprLongColumnRepeatingLongColumnBench|1397447|1480450|  -5.9%|
> |IfExprRepeatingLongColumnLongColumnBench|1401164|1483062|  -5.9% |
> |NotColBench|77042.83|51513.28|  33%|
> There is a degradation in
> IfExprLongColumnLongColumnBench, IfExprLongColumnRepeatingLongColumnBench and
> IfExprRepeatingLongColumnLongColumnBench; it is unclear why these cases
> degrade.
> Here we use {{taskset -cp 1 $pid}} to run the benchmark on a single core to
> avoid the impact of dynamic CPU frequency scaling.
> My script:
> {code}
> export JAVA_HOME=/home/zly/jdk-9.0.1/
> export PATH=$JAVA_HOME/bin:$PATH
> export LD_LIBRARY_PATH=/home/zly/jdk-9.0.1/mylib
> for i in 0 1 2; do
> java -server -XX:UseAVX=3 -jar benchmarks.jar 
> org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 
> -f 1 -bm avgt -tu us >log.logic.avx3.single.$i & export pid=$!
> taskset -cp 1 $pid
> wait $pid
> done
> for i in 0 1 2; do
> java -server -XX:UseAVX=2 -jar benchmarks.jar 
> org.apache.hive.benchmark.vectorization.VectorizedLogicBench * -wi 10 -i 20 
> -f 1 -bm avgt -tu us >log.logic.avx2.single.$i & export pid=$!
> taskset -cp 1 $pid
> wait $pid
> done
> {code}
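
The last column of the table above is the relative change `(AVX2 - AVX512) / AVX2`. As a hedged sketch, the figures can be recomputed from the reported timings; `relative_improvement` is a hypothetical helper name, and only rows from the table are used as input.

```python
def relative_improvement(avx2_us, avx512_us):
    """Percentage change of AVX512 relative to AVX2.

    Positive means AVX512 is faster (fewer us/op); negative means a
    degradation.
    """
    return (avx2_us - avx512_us) / avx2_us * 100.0


# (AVX2 us/op, AVX512 us/op) pairs copied from the table.
results = {
    "ColAndColBench": (122510, 87014),
    "IfExprLongColumnLongColumnBench": (1325759, 1436073),
    "IfExprLongColumnRepeatingLongColumnBench": (1397447, 1480450),
    "IfExprRepeatingLongColumnLongColumnBench": (1401164, 1483062),
    "NotColBench": (77042.83, 51513.28),
}

for name, (avx2, avx512) in results.items():
    print(f"{name}: {relative_improvement(avx2, avx512):+.1f}%")
```

Recomputed values may differ from the table in the last digit, depending on whether the original figures were rounded or truncated.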





[jira] [Commented] (HIVE-18137) Schema evolution: newly inserted column value in pre-existing partition is masked to null

2017-11-23 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263975#comment-16263975
 ] 

Zoltan Haindrich commented on HIVE-18137:
-

[~ekoifman],[~ashutoshc]: not entirely sure why...but this has somehow changed
during one of my changes...I consider it an improvement...

I've tried to follow the history through various refactors, renames, etc. I've
found an ancient version of this test which did have similar results:
https://github.com/apache/hive/blob/2f0339b08b375a1b656a178627600fc26c0a974c/ql/src/test/results/clientpositive/schema_evol_orc_nonvec_mapwork_part.q.out#L136


> Schema evolution: newly inserted column value in pre-existing partition is 
> masked to null
> -
>
> Key: HIVE-18137
> URL: https://issues.apache.org/jira/browse/HIVE-18137
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>
> {code}
> set hive.explain.user=false;
> set hive.fetch.task.conversion=none;
> set hive.mapred.mode=nonstrict;
> set hive.cli.print.header=true;
> SET hive.exec.schema.evolution=true;
> SET hive.vectorized.use.vectorized.input.format=true;
> SET hive.vectorized.use.vector.serde.deserialize=false;
> SET hive.vectorized.use.row.serde.deserialize=false;
> SET hive.vectorized.execution.enabled=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.metastore.disallow.incompatible.col.type.changes=true;
> set hive.default.fileformat=textfile;
> set hive.llap.io.enabled=false;
> CREATE TABLE part_add_int_permute_select(insert_num int, a INT, b STRING) 
> PARTITIONED BY(part INT);
> insert into table part_add_int_permute_select partition(part=1) VALUES (1, 
> , 'new');
> alter table part_add_int_permute_select add columns(c int);
> insert into table part_add_int_permute_select partition(part=1) VALUES (2, 
> , 'new', );
> select insert_num,part,a,b,c from part_add_int_permute_select;
> {code}
> results for the last select:
> {code}
> 1  1   new NULL
> 2  1   new NULL
> {code}
> I think the following result should be expected:
> {code}
> 1  1   new NULL
> 2  1   new 
> {code}





[jira] [Updated] (HIVE-18043) Vectorization: Support List type in MapWork

2017-11-23 Thread Colin Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Ma updated HIVE-18043:

Attachment: HIVE-18043.004.patch

[~Ferd], the patch is updated according to your comments; thanks for the review.

> Vectorization: Support List type in MapWork
> ---
>
> Key: HIVE-18043
> URL: https://issues.apache.org/jira/browse/HIVE-18043
> Project: Hive
>  Issue Type: Improvement
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-18043.001.patch, HIVE-18043.002.patch, 
> HIVE-18043.003.patch, HIVE-18043.004.patch
>
>
> Support for complex types in vectorization was finished in HIVE-16589, but the
> List type is still not supported in MapWork. It should be supported to improve
> performance when vectorization is enabled.





[jira] [Commented] (HIVE-17994) Vectorization: Serialization bottlenecked on irrelevant hashmap lookup

2017-11-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263954#comment-16263954
 ] 

Hive QA commented on HIVE-17994:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
59s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} serde: The patch generated 1 new + 4 unchanged - 1 
fixed = 5 total (was 5) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / c5c2986 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7984/yetus/diff-checkstyle-serde.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7984/yetus/patch-asflicense-problems.txt
 |
| modules | C: serde ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-7984/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Serialization bottlenecked on irrelevant hashmap lookup
> --
>
> Key: HIVE-17994
> URL: https://issues.apache.org/jira/browse/HIVE-17994
> Project: Hive
> Issue Type: Bug
> Reporter: Gopal V
> Assignee: Matt McCline
> Priority: Minor
> Attachments: HIVE-17994.01.patch, HIVE-17994.02.patch, 
> vec-serialize-hashmap.png
>
>
> On machines with slower NUMA, the hashmap lookup for 
> TypeInfo::getPrimitiveCategory is the slowest part of the vectorized 
> serialization loops. The static object references run hot with the NUMA 
> access speeds penalizing half the threads.
> This lookup is done for every column, for every row - though vectorization 
> enforces that this type cannot change at all.
> !vec-serialize-hashmap.png!
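
The description above suggests that a per-column, per-row lookup could be resolved once, since the column types are fixed for the duration of vectorized execution. The following is only a minimal sketch of that general idea and is not taken from the attached patches: the class HoistedCategoryLookup, the Category enum, the map CATEGORY_BY_TYPE_NAME and the lookups counter are all hypothetical stand-ins, not Hive APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Hive.
public class HoistedCategoryLookup {
    enum Category { LONG, STRING }

    // Stand-in for a shared static registry that is consulted via a hashmap.
    static final Map<String, Category> CATEGORY_BY_TYPE_NAME = new HashMap<>();
    static {
        CATEGORY_BY_TYPE_NAME.put("bigint", Category.LONG);
        CATEGORY_BY_TYPE_NAME.put("string", Category.STRING);
    }

    // Counts how often the shared map is consulted, to make the cost visible.
    static int lookups = 0;

    static Category lookup(String typeName) {
        lookups++;
        return CATEGORY_BY_TYPE_NAME.get(typeName);
    }

    // Naive form: one hashmap lookup for every column of every row.
    static String serializeNaive(String[] typeNames, Object[][] rows) {
        StringBuilder out = new StringBuilder();
        for (Object[] row : rows) {
            for (int c = 0; c < typeNames.length; c++) {
                write(out, lookup(typeNames[c]), row[c]);
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Hoisted form: resolve each column's category once, then reuse the
    // local array inside the row loop. Valid only because the column types
    // are assumed not to change while the rows are processed.
    static String serializeHoisted(String[] typeNames, Object[][] rows) {
        Category[] categories = new Category[typeNames.length];
        for (int c = 0; c < typeNames.length; c++) {
            categories[c] = lookup(typeNames[c]);
        }
        StringBuilder out = new StringBuilder();
        for (Object[] row : rows) {
            for (int c = 0; c < categories.length; c++) {
                write(out, categories[c], row[c]);
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Appends one tab-terminated field according to its category.
    static void write(StringBuilder out, Category category, Object value) {
        switch (category) {
            case LONG:
                out.append(((Long) value).longValue());
                break;
            case STRING:
                out.append((String) value);
                break;
        }
        out.append('\t');
    }

    public static void main(String[] args) {
        String[] typeNames = {"bigint", "string"};
        Object[][] rows = {{1L, "a"}, {2L, "b"}, {3L, "c"}};

        lookups = 0;
        String naive = serializeNaive(typeNames, rows);
        int naiveLookups = lookups;

        lookups = 0;
        String hoisted = serializeHoisted(typeNames, rows);
        int hoistedLookups = lookups;

        System.out.println("same output: " + naive.equals(hoisted));
        System.out.println("naive lookups: " + naiveLookups);
        System.out.println("hoisted lookups: " + hoistedLookups);
    }
}
```

In this sketch the naive loop consults the shared map once per cell, while the hoisted loop consults it once per column and then reads a local array. Whether the actual fix takes this shape is not stated in this message.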


