[jira] [Commented] (HIVE-19532) merge master-txnstats branch

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553841#comment-16553841
 ] 

Hive QA commented on HIVE-19532:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932786/HIVE-19532.25.patch

{color:green}SUCCESS:{color} +1 due to 26 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14703 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_part2] (batchId=21)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12813/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12813/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12813/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932786 - PreCommit-HIVE-Build

> merge master-txnstats branch
> 
>
> Key: HIVE-19532
> URL: https://issues.apache.org/jira/browse/HIVE-19532
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, 
> HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, 
> HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.07.patch, 
> HIVE-19532.08.patch, HIVE-19532.09.patch, HIVE-19532.10.patch, 
> HIVE-19532.11.patch, HIVE-19532.12.patch, HIVE-19532.13.patch, 
> HIVE-19532.14.patch, HIVE-19532.15.patch, HIVE-19532.16.patch, 
> HIVE-19532.19.patch, HIVE-19532.23.patch, HIVE-19532.24.patch, 
> HIVE-19532.25.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20210) Simple Fetch optimizer should lead to MapReduce when filter on non-partition column and conversion is minimal

2018-07-23 Thread Jeffrey(Xilang) Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey(Xilang) Yan updated HIVE-20210:
---
Status: In Progress  (was: Patch Available)

> Simple Fetch optimizer should lead to MapReduce when filter on non-partition 
> column and conversion is minimal
> -
>
> Key: HIVE-20210
> URL: https://issues.apache.org/jira/browse/HIVE-20210
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.3.2, 2.3.1, 2.3.0
>Reporter: Jeffrey(Xilang) Yan
>Assignee: Jeffrey(Xilang) Yan
>Priority: Major
> Attachments: HIVE-20210.2.patch, HIVE-20210.patch
>
>
> When conversion is "minimal", simple fetch should be used only when the filter 
> is on a partition column or there is no filter at all. But the optimizer leads 
> to simple fetch even when the filter is on a non-partition column. The unit 
> test " select * from srcpart where key > 100 limit 10 " in nonmr_fetch.q 
> demonstrates this issue – the expected output is in fact wrong (it should be 
> MapReduce, but in the test it is Simple Fetch).
> This issue leads to a serious problem when the data size is huge. When 
> conversion is "more" and the filter is on both a partition column and a 
> non-partition column, it does not check hive.fetch.task.conversion.threshold, 
> which can cause the query to take hours to finish. This issue doesn't exist in 
> 1.2.1; how it works there is unclear.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20207) Vectorization: Fix NULL / Wrong Results issues in Filter / Compare

2018-07-23 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20207:

Status: Patch Available  (was: In Progress)

> Vectorization: Fix NULL / Wrong Results issues in Filter / Compare
> --
>
> Key: HIVE-20207
> URL: https://issues.apache.org/jira/browse/HIVE-20207
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20207.01.patch, HIVE-20207.02.patch, 
> HIVE-20207.03.patch, HIVE-20207.04.patch, HIVE-20207.05.patch, 
> HIVE-20207.06.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized filter and compare.
> BUGS:
> 1) The LongColLessLongColumn SIMD optimization does not work for very large 
> integers:
>  -7272907770454997143 < 8976171455044006767
>  outputVector[i] = (vector1[i] - vector2[i]) >>> 63;
>  Produces 0 instead of 1.
> Also, add DECIMAL_64 testing. Add missing DECIMAL/DECIMAL_64 Comparison and 
> IF vectorized expression classes.
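
The subtraction trick above fails exactly when the difference overflows a signed 
64-bit long. A minimal, self-contained plain-Java demonstration (not Hive's 
vectorized expression classes) using the two values from the description:

{code:java}
// Minimal demo of why (vector1[i] - vector2[i]) >>> 63 is not a safe "less than"
// for 64-bit longs: the subtraction can overflow and flip the sign bit.
public class LongCompareOverflowDemo {
  public static void main(String[] args) {
    long a = -7272907770454997143L;  // values from the bug description
    long b = 8976171455044006767L;

    long viaSubtraction = (a - b) >>> 63;    // a - b overflows; sign bit comes out 0
    long viaComparison = (a < b) ? 1L : 0L;  // overflow-safe comparison

    System.out.println(viaSubtraction);  // prints 0 (wrong)
    System.out.println(viaComparison);   // prints 1 (expected, since a < b)
  }
}
{code}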



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553805#comment-16553805
 ] 

Hive QA commented on HIVE-20082:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932789/HIVE-20082.4.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14685 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12812/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12812/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12812/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932789 - PreCommit-HIVE-Build

> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
> Key: HIVE-20082
> URL: https://issues.apache.org/jira/browse/HIVE-20082
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20082.1.patch, HIVE-20082.2.patch, 
> HIVE-20082.3.patch, HIVE-20082.4.patch
>
>
> Example: LPAD on a decimal(7,1) values of 0 returns "0" (plus padding) but it 
> should be "0.0" (plus padding)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553790#comment-16553790
 ] 

Hive QA commented on HIVE-20082:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} serde in master has 195 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
30s{color} | {color:blue} accumulo-handler in master has 21 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
1s{color} | {color:blue} ql in master has 2280 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
18s{color} | {color:red} accumulo-handler in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
17s{color} | {color:red} serde: The patch generated 4 new + 299 unchanged - 0 
fixed = 303 total (was 299) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} accumulo-handler: The patch generated 1 new + 53 
unchanged - 0 fixed = 54 total (was 53) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
49s{color} | {color:red} ql: The patch generated 14 new + 899 unchanged - 10 
fixed = 913 total (was 909) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12812/dev-support/hive-personality.sh
 |
| git revision | master / 87b9f64 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12812/yetus/patch-mvninstall-accumulo-handler.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12812/yetus/diff-checkstyle-serde.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12812/yetus/diff-checkstyle-accumulo-handler.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12812/yetus/diff-checkstyle-ql.txt
 |
| modules | C: serde accumulo-handler ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12812/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
>  

[jira] [Commented] (HIVE-19251) ObjectStore.getNextNotification with LIMIT should use less memory

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553770#comment-16553770
 ] 

Hive QA commented on HIVE-19251:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12919920/HIVE-19251.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12811/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12811/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12811/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-24 03:47:10.191
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12811/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-24 03:47:10.194
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   ed4fa73..87b9f64  master -> origin/master
+ git reset --hard HEAD
HEAD is now at ed4fa73 HIVE-19733: RemoteSparkJobStatus#getSparkStageProgress 
inefficient implementation (Bharathkrishna Guruvayoor Murali, reviewed by Sahil 
Takiar)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 87b9f64 HIVE-20164 : Murmur Hash : Make sure CTAS and IAS use 
correct bucketing version (Deepak Jaiswal, reviewed by Jason Dere)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-24 03:47:11.831
+ rm -rf ../yetus_PreCommit-HIVE-Build-12811
+ mkdir ../yetus_PreCommit-HIVE-Build-12811
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12811
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12811/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java:
 does not exist in index
error: 
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java:
 does not exist in index
error: src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java: does 
not exist in index
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-12811
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12919920 - PreCommit-HIVE-Build

> ObjectStore.getNextNotification with LIMIT should use less memory
> -
>
> Key: HIVE-19251
> URL: https://issues.apache.org/jira/browse/HIVE-19251
> Project: Hive
>  Issue Type: Bug
>  Components: repl, Standalone Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-19251.1.patch
>
>
> We experience OOM when the Hive metastore tries to retrieve a huge number of 
> notification logs even though there is a LIMIT clause. Hive should only 
> retrieve the necessary rows.
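
One way to honor the limit at the database level is to push it into the JDO query 
itself. The sketch below only illustrates that idea and is not the attached patch; 
the filter string and the pm, maxEvents, and lastEventId names are assumptions for 
the example:

{code:java}
// Hedged sketch: apply the caller's limit with javax.jdo.Query#setRange() so the
// datastore returns at most maxEvents notification rows, instead of materializing
// every notification log entry in memory and trimming afterwards.
Query query = pm.newQuery(MNotificationLog.class, "eventId > lastEventId");
query.declareParameters("java.lang.Long lastEventId");
query.setOrdering("eventId ascending");
if (maxEvents > 0) {
  query.setRange(0, maxEvents);  // applied as LIMIT by the backing RDBMS
}
{code}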



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20228) configure repl configuration directories based on user running hiveserver2

2018-07-23 Thread anishek (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-20228:
---
Status: Patch Available  (was: Open)

> configure repl configuration directories based on user running hiveserver2
> --
>
> Key: HIVE-20228
> URL: https://issues.apache.org/jira/browse/HIVE-20228
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20228.1.patch
>
>
> If a custom user is used to run HiveServer2, then the repl subsystem should 
> use directories within that user's home directory for its various 
> configurations rather than the default /user/hive/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20228) configure repl configuration directories based on user running hiveserver2

2018-07-23 Thread anishek (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-20228:
---
Attachment: HIVE-20228.1.patch

> configure repl configuration directories based on user running hiveserver2
> --
>
> Key: HIVE-20228
> URL: https://issues.apache.org/jira/browse/HIVE-20228
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20228.1.patch
>
>
> If a custom user is used to run HiveServer2, then the repl subsystem should 
> use directories within that user's home directory for its various 
> configurations rather than the default /user/hive/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20169) Print Final Rows Processed in MapOperator

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553769#comment-16553769
 ] 

Hive QA commented on HIVE-20169:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932778/HIVE-20169.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 14683 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timestamptz]
 (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_joins]
 (batchId=193)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_masking]
 (batchId=193)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12810/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12810/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12810/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932778 - PreCommit-HIVE-Build

> Print Final Rows Processed in MapOperator
> -
>
> Key: HIVE-20169
> URL: https://issues.apache.org/jira/browse/HIVE-20169
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20169.1.patch, HIVE-20169.2.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java#L573-L582
> This class emits a log message every time a certain number of records is 
> processed, but it does not print a final count.
> Override the {{MapOperator}} class's {{closeOp}} method to print a final log 
> message providing the total number of rows read by this mapper.
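
A minimal sketch of what such an override could look like (this is not the attached 
patch; the numRows counter and the log wording are assumptions mirroring the existing 
per-N-rows message):

{code:java}
// Hedged sketch: emit one final count when the operator closes, in addition to
// the periodic "records read" message logged during processing.
@Override
public void closeOp(boolean abort) throws HiveException {
  LOG.info("{}: total records read - {}, abort - {}", this, numRows, abort);
  super.closeOp(abort);
}
{code}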



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20228) configure repl configuration directories based on user running hiveserver2

2018-07-23 Thread anishek (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek reassigned HIVE-20228:
--


> configure repl configuration directories based on user running hiveserver2
> --
>
> Key: HIVE-20228
> URL: https://issues.apache.org/jira/browse/HIVE-20228
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Major
> Fix For: 4.0.0
>
>
> If a custom user is used to run HiveServer2, then the repl subsystem should 
> use directories within that user's home directory for its various 
> configurations rather than the default /user/hive/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version

2018-07-23 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20164:
--
Attachment: HIVE-20164.01-branch-3.patch

> Murmur Hash : Make sure CTAS and IAS use correct bucketing version
> --
>
> Key: HIVE-20164
> URL: https://issues.apache.org/jira/browse/HIVE-20164
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20164.01-branch-3.patch, HIVE-20164.1.patch, 
> HIVE-20164.2.patch, HIVE-20164.3.patch, HIVE-20164.4.patch, 
> HIVE-20164.5.patch, HIVE-20164.6.patch, HIVE-20164.7.patch, HIVE-20164.8.patch
>
>
> With the migration to Murmur hash, CTAS and IAS from the old table version to 
> the new table version do not work as intended and data is hashed using the 
> old hash logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-07-23 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-19985:
-

Assignee: Eugene Koifman  (was: Gopal V)

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
>
> For a base_n file there are no aborted transactions within the file, and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20207) Vectorization: Fix NULL / Wrong Results issues in Filter / Compare

2018-07-23 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20207:

Attachment: HIVE-20207.06.patch

> Vectorization: Fix NULL / Wrong Results issues in Filter / Compare
> --
>
> Key: HIVE-20207
> URL: https://issues.apache.org/jira/browse/HIVE-20207
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20207.01.patch, HIVE-20207.02.patch, 
> HIVE-20207.03.patch, HIVE-20207.04.patch, HIVE-20207.05.patch, 
> HIVE-20207.06.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized filter and compare.
> BUGS:
> 1) The LongColLessLongColumn SIMD optimization does not work for very large 
> integers:
>  -7272907770454997143 < 8976171455044006767
>  outputVector[i] = (vector1[i] - vector2[i]) >>> 63;
>  Produces 0 instead of 1.
> Also, add DECIMAL_64 testing. Add missing DECIMAL/DECIMAL_64 Comparison and 
> IF vectorized expression classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20207) Vectorization: Fix NULL / Wrong Results issues in Filter / Compare

2018-07-23 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20207:

Status: In Progress  (was: Patch Available)

> Vectorization: Fix NULL / Wrong Results issues in Filter / Compare
> --
>
> Key: HIVE-20207
> URL: https://issues.apache.org/jira/browse/HIVE-20207
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20207.01.patch, HIVE-20207.02.patch, 
> HIVE-20207.03.patch, HIVE-20207.04.patch, HIVE-20207.05.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized filter and compare.
> BUGS:
> 1) The LongColLessLongColumn SIMD optimization does not work for very large 
> integers:
>  -7272907770454997143 < 8976171455044006767
>  outputVector[i] = (vector1[i] - vector2[i]) >>> 63;
>  Produces 0 instead of 1.
> Also, add DECIMAL_64 testing. Add missing DECIMAL/DECIMAL_64 Comparison and 
> IF vectorized expression classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20169) Print Final Rows Processed in MapOperator

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553736#comment-16553736
 ] 

Hive QA commented on HIVE-20169:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
1s{color} | {color:blue} ql in master has 2280 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12810/dev-support/hive-personality.sh
 |
| git revision | master / ed4fa73 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12810/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Print Final Rows Processed in MapOperator
> -
>
> Key: HIVE-20169
> URL: https://issues.apache.org/jira/browse/HIVE-20169
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20169.1.patch, HIVE-20169.2.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java#L573-L582
> This class emits a log message every time a certain number of records is 
> processed, but it does not print a final count.
> Override the {{MapOperator}} class's {{closeOp}} method to print a final log 
> message providing the total number of rows read by this mapper.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553706#comment-16553706
 ] 

Hive QA commented on HIVE-20164:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932763/HIVE-20164.8.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12809/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12809/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12809/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12932763/HIVE-20164.8.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932763 - PreCommit-HIVE-Build

> Murmur Hash : Make sure CTAS and IAS use correct bucketing version
> --
>
> Key: HIVE-20164
> URL: https://issues.apache.org/jira/browse/HIVE-20164
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20164.1.patch, HIVE-20164.2.patch, 
> HIVE-20164.3.patch, HIVE-20164.4.patch, HIVE-20164.5.patch, 
> HIVE-20164.6.patch, HIVE-20164.7.patch, HIVE-20164.8.patch
>
>
> With the migration to Murmur hash, CTAS and IAS from the old table version to 
> the new table version do not work as intended and data is hashed using the 
> old hash logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20168) ReduceSinkOperator Logging Hidden

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553704#comment-16553704
 ] 

Hive QA commented on HIVE-20168:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932776/HIVE-20168.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12808/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12808/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12808/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12932776/HIVE-20168.2.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932776 - PreCommit-HIVE-Build

> ReduceSinkOperator Logging Hidden
> -
>
> Key: HIVE-20168
> URL: https://issues.apache.org/jira/browse/HIVE-20168
> Project: Hive
>  Issue Type: Bug
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20168.1.patch, HIVE-20168.2.patch
>
>
> [https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java]
>  
> {code:java}
> if (LOG.isTraceEnabled()) {
>   if (numRows == cntr) {
> cntr = logEveryNRows == 0 ? cntr * 10 : numRows + logEveryNRows;
> if (cntr < 0 || numRows < 0) {
>   cntr = 0;
>   numRows = 1;
> }
> LOG.info(toString() + ": records written - " + numRows);
>   }
> }
> ...
> if (LOG.isTraceEnabled()) {
>   LOG.info(toString() + ": records written - " + numRows);
> }
> {code}
> There are logging guards here checking for TRACE level logging, but the 
> messages are actually logged at INFO.  This is important logging for detecting 
> data skew.  Please change the guards to check for INFO, or preferably remove 
> the guards altogether, since it's very rare that a service runs with only 
> WARN level logging.
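
A minimal sketch of the first option, aligning the guard with the emitted level (the 
alternative is simply deleting the isTraceEnabled() checks around these INFO calls):

{code:java}
// Hedged sketch of the guard fix, not the committed patch: the guard level now
// matches the level that is actually logged.
if (LOG.isInfoEnabled()) {
  LOG.info(toString() + ": records written - " + numRows);
}
{code}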



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20168) ReduceSinkOperator Logging Hidden

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553703#comment-16553703
 ] 

Hive QA commented on HIVE-20168:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932776/HIVE-20168.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14683 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12807/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12807/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12807/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932776 - PreCommit-HIVE-Build

> ReduceSinkOperator Logging Hidden
> -
>
> Key: HIVE-20168
> URL: https://issues.apache.org/jira/browse/HIVE-20168
> Project: Hive
>  Issue Type: Bug
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20168.1.patch, HIVE-20168.2.patch
>
>
> [https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java]
>  
> {code:java}
> if (LOG.isTraceEnabled()) {
>   if (numRows == cntr) {
> cntr = logEveryNRows == 0 ? cntr * 10 : numRows + logEveryNRows;
> if (cntr < 0 || numRows < 0) {
>   cntr = 0;
>   numRows = 1;
> }
> LOG.info(toString() + ": records written - " + numRows);
>   }
> }
> ...
> if (LOG.isTraceEnabled()) {
>   LOG.info(toString() + ": records written - " + numRows);
> }
> {code}
> There are logging guards here checking for TRACE level logging, but the 
> messages are actually logged at INFO.  This is important logging for detecting 
> data skew.  Please change the guards to check for INFO, or preferably remove 
> the guards altogether, since it's very rare that a service runs with only 
> WARN level logging.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19424) NPE In MetaDataFormatters

2018-07-23 Thread Alice Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553702#comment-16553702
 ] 

Alice Fan commented on HIVE-19424:
--

Hi [~ychena],
Could you please help commit HIVE-19424.3.patch to Hive 3.2.0? The patch is 
identical to the one committed to the master branch.
Let me know if you have any questions.

Thanks,
Alice

> NPE In MetaDataFormatters
> -
>
> Key: HIVE-19424
> URL: https://issues.apache.org/jira/browse/HIVE-19424
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Standalone Metastore
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: HIVE-19424.1.patch, HIVE-19424.2.patch, 
> HIVE-19424.3.patch
>
>
> h2. Overview
> According to the Hive Schema definition, a table's {{INPUT_FORMAT}} class can 
> be set to NULL.  However, there are places in the code where we do not 
> account for this NULL value, in particular the {{MetaDataFormatters}} classes 
> {{TextMetaDataFormatter}} and {{JsonMetaDataFormatter}}.  In addition, there 
> is no debug level logging in the {{MetaDataFormatters}} classes to tell me 
> which table in particular is causing the problem.
> {code:sql|title=hive-schema-2.2.0.mysql.sql}
> CREATE TABLE IF NOT EXISTS `SDS` (
>   `SD_ID` bigint(20) NOT NULL,
>   `CD_ID` bigint(20) DEFAULT NULL,
>   `INPUT_FORMAT` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin 
> DEFAULT NULL,
>   `IS_COMPRESSED` bit(1) NOT NULL,
> ...
> {code}
> {code:java|title=TextMetaDataFormatter.java}
> // Not checking for a null return from getInputFormatClass
> inputFormattCls = par.getInputFormatClass().getName();
> outputFormattCls = par.getOutputFormatClass().getName();
> {code}
> h2. Reproduction
> {code:sql}
> -- MySQL Backend
> update SDS SET INPUT_FORMAT=NULL WHERE SD_ID=XXX;
> {code}
> {code}
> // Hive
> SHOW TABLE EXTENDED FROM default LIKE '*';
> // HS2 Logs
> [HiveServer2-Background-Pool: Thread-464]: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Exception while processing show table 
> status
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Exception while 
> processing show table status
>   at 
> org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:3025)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:405)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:99)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2052)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1748)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1501)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1285)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1280)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
>   ... 11 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.metadata.formatting.TextMetaDataFormatter.showTableStatus(TextMetaDataFormatter.java:202)
>   at 
> org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:3020)
>   ... 20 more
> {code}
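
A minimal null-safe sketch of the TextMetaDataFormatter lines quoted in the Overview 
above (the "not available" placeholder string is an assumption for illustration, not 
necessarily what the patch does):

{code:java}
// Hedged sketch: tolerate a NULL INPUT_FORMAT/OUTPUT_FORMAT in the metastore
// instead of dereferencing a null Class returned by the getters.
Class<?> inputFormatClass = par.getInputFormatClass();
Class<?> outputFormatClass = par.getOutputFormatClass();
inputFormattCls = inputFormatClass == null ? "not available" : inputFormatClass.getName();
outputFormattCls = outputFormatClass == null ? "not available" : outputFormatClass.getName();
{code}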



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19424) NPE In MetaDataFormatters

2018-07-23 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-19424:
-
Fix Version/s: (was: 4.0.0)
   3.2.0
 Release Note: HIVE-19424 : Fixing NPE In MetaDataFormatters at Branch-3
Affects Version/s: (was: 2.4.0)
   (was: 3.0.0)
   3.2.0
   Attachment: HIVE-19424.3.patch
   Status: Patch Available  (was: Reopened)

> NPE In MetaDataFormatters
> -
>
> Key: HIVE-19424
> URL: https://issues.apache.org/jira/browse/HIVE-19424
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Standalone Metastore
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: HIVE-19424.1.patch, HIVE-19424.2.patch, 
> HIVE-19424.3.patch
>
>
> h2. Overview
> According to the Hive Schema definition, a table's {{INPUT_FORMAT}} class can 
> be set to NULL.  However, there are places in the code where we do not 
> account for this NULL value, in particular the {{MetaDataFormatters}} classes 
> {{TextMetaDataFormatter}} and {{JsonMetaDataFormatter}}.  In addition, there 
> is no debug level logging in the {{MetaDataFormatters}} classes to tell me 
> which table in particular is causing the problem.
> {code:sql|title=hive-schema-2.2.0.mysql.sql}
> CREATE TABLE IF NOT EXISTS `SDS` (
>   `SD_ID` bigint(20) NOT NULL,
>   `CD_ID` bigint(20) DEFAULT NULL,
>   `INPUT_FORMAT` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin 
> DEFAULT NULL,
>   `IS_COMPRESSED` bit(1) NOT NULL,
> ...
> {code}
> {code:java|title=TextMetaDataFormatter.java}
> // Not checking for a null return from getInputFormatClass
> inputFormattCls = par.getInputFormatClass().getName();
> outputFormattCls = par.getOutputFormatClass().getName();
> {code}
> h2. Reproduction
> {code:sql}
> -- MySQL Backend
> update SDS SET INPUT_FORMAT=NULL WHERE SD_ID=XXX;
> {code}
> {code}
> // Hive
> SHOW TABLE EXTENDED FROM default LIKE '*';
> // HS2 Logs
> [HiveServer2-Background-Pool: Thread-464]: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Exception while processing show table 
> status
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Exception while 
> processing show table status
>   at 
> org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:3025)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:405)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:99)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2052)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1748)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1501)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1285)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1280)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
>   ... 11 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.metadata.formatting.TextMetaDataFormatter.showTableStatus(TextMetaDataFormatter.java:202)
>   at 
> org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:3020)
>   ... 20 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-20109) get rid of COLUMN_STATS_ACCURATE

2018-07-23 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553696#comment-16553696
 ] 

Sergey Shelukhin edited comment on HIVE-20109 at 7/24/18 2:28 AM:
--

I think the plan is to make this a breaking change for 4.0, should be ok for a 
major version - there will no longer be json or even stats storage as part of 
table parameters, and there won't be a runtime backward compat that would 
generate json values from normalized storage.
There will be an upgrade option to transfer the json object into the new 
fields; given that the consequence of not running the upgrade script is a 
one-time loss of accurate stats, this should be acceptable.
I'm looking at the code to see how easy it is to normalize table stats storage 
into a separate table, so that the TBLS and PARTITIONS are not even affected by 
stats changes (that is good for CachedStore).
Regardless, as is already the case with column stats, the basic stats state 
will be updated via a separate API from alter table to make it more explicit.



was (Author: sershe):
I think the plan is to make this a breaking change for 4.0, should be ok for a 
major version - there will no longer be json or even stats storage as part of 
table parameters.
There will be an upgrade option to transfer the json object into the new 
fields; given that the consequence of not running the upgrade script is a 
one-time loss of accurate stats, this should be acceptable.
I'm looking at the code to see how easy it is to normalize table stats storage 
into a separate table, so that the TBLS and PARTITIONS are not even affected by 
stats changes (that is good for CachedStore).
Regardless, as is already the case with column stats, the basic stats state 
will be updated via a separate API from alter table to make it more explicit.


> get rid of COLUMN_STATS_ACCURATE
> 
>
> Key: HIVE-20109
> URL: https://issues.apache.org/jira/browse/HIVE-20109
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> I don't know why anyone would come up with an idea of storing a set of 
> booleans in a database using JSON. This has caused various problems in the 
> past (text field limitations, perf issues when parsing a giant string; also 
> bugs because the way it is set is brittle).
> However, now that we are implementing transactional stats, it becomes 
> especially problematic and error prone because the code in Hive sets C_S_A in 
> random places with reckless abandon, whereas we want to change the state of 
> the stats in well defined places where txn semantics can be verified.
> Currently in HIVE-19416, we are handling random things that touch it (from 
> metastore itself to output committers, various stats tasks, commands like 
> truncate, etc.) via a pile of hacks, but the best solution would be to remove 
> it completely and replace with a DB table/columns in stats tables that would 
> need to be set explicitly, not via generic alter_table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20109) get rid of COLUMN_STATS_ACCURATE

2018-07-23 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553696#comment-16553696
 ] 

Sergey Shelukhin commented on HIVE-20109:
-

I think the plan is to make this a breaking change for 4.0, should be ok for a 
major version - there will no longer be json or even stats storage as part of 
table parameters.
There will be an upgrade option to transfer the json object into the new 
fields; given that the consequence of not running the upgrade script is a 
one-time loss of accurate stats, this should be acceptable.
I'm looking at the code to see how easy it is to normalize table stats storage 
into a separate table, so that the TBLS and PARTITIONS are not even affected by 
stats changes (that is good for CachedStore).
Regardless, as is already the case with column stats, the basic stats state 
will be updated via a separate API from alter table to make it more explicit.


> get rid of COLUMN_STATS_ACCURATE
> 
>
> Key: HIVE-20109
> URL: https://issues.apache.org/jira/browse/HIVE-20109
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> I don't know why anyone would come up with an idea of storing a set of 
> booleans in a database using JSON. This has caused various problems in the 
> past (text field limitations, perf issues when parsing a giant string; also 
> bugs because the way it is set is brittle).
> However, now that we are implementing transactional stats, it becomes 
> especially problematic and error prone because the code in Hive sets C_S_A in 
> random places with reckless abandon, whereas we want to change the state of 
> the stats in well defined places where txn semantics can be verified.
> Currently in HIVE-19416, we are handling random things that touch it (from 
> metastore itself to output committers, various stats tasks, commands like 
> truncate, etc.) via a pile of hacks, but the best solution would be to remove 
> it completely and replace with a DB table/columns in stats tables that would 
> need to be set explicitly, not via generic alter_table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20177) Vectorization: Reduce KeyWrapper allocation in GroupBy Streaming mode

2018-07-23 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553690#comment-16553690
 ] 

Gopal V commented on HIVE-20177:


[~mmccline]: this is a perf fix for very large gbys - clean test-run.

> Vectorization: Reduce KeyWrapper allocation in GroupBy Streaming mode
> -
>
> Key: HIVE-20177
> URL: https://issues.apache.org/jira/browse/HIVE-20177
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
>  Labels: performance
> Attachments: HIVE-20177.01.patch, HIVE-20177.WIP.patch
>
>
> The streaming mode for VectorGroupBy allocates a large number of arrays due 
> to VectorHashKeyWrapper::duplicateTo().
> Since the vectors can't be mutated in-place while a single batch is being 
> processed, this allocation can be cut by 1000x by copying the streaming key 
> once at the end of the loop, instead of reallocating within the loop.
> {code}
>   for(int i = 0; i < batch.size; ++i) {
> if (!batchKeys[i].equals(streamingKey)) {
>   // We've encountered a new key, must save current one
>   // We can't forward yet, the aggregators have not been evaluated
>   rowsToFlush[flushMark] = currentStreamingAggregators;
>   if (keysToFlush[flushMark] == null) {
> keysToFlush[flushMark] = (VectorHashKeyWrapper) 
> streamingKey.copyKey();
>   } else {
> streamingKey.duplicateTo(keysToFlush[flushMark]);
>   }
>   currentStreamingAggregators = 
> streamAggregationBufferRowPool.getFromPool();
>   batchKeys[i].duplicateTo(streamingKey);
>   ++flushMark;
> }
> {code}
> The duplicateTo can be pushed out of the loop since the only key that truly 
> needs a copy is the last unique key in the VRB.
> The actual byte[] values within the keys are safely copied out by 
> VectorHashKeyWrapperBatch.assignRowColumn(), which calls setVal() and not 
> setRef().
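
A hedged sketch of that restructuring (not the attached patch): track the current key 
by reference inside the loop and perform a single duplicateTo() after the loop for the 
last unique key of the batch.

{code:java}
// Hedged sketch, not the actual HIVE-20177 change: within one batch the key
// wrappers are not mutated, so a plain reference is enough inside the loop.
VectorHashKeyWrapper currentKey = streamingKey;      // reference, no allocation
for (int i = 0; i < batch.size; ++i) {
  if (!batchKeys[i].equals(currentKey)) {
    // ... flush bookkeeping for the previous key (as in the original loop,
    //     but reading currentKey instead of streamingKey) ...
    currentKey = batchKeys[i];                       // no per-key duplicateTo()
  }
}
if (currentKey != streamingKey) {
  // One deep copy per batch: only the last unique key must outlive the batch.
  currentKey.duplicateTo(streamingKey);
}
{code}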



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HIVE-19424) NPE In MetaDataFormatters

2018-07-23 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan reopened HIVE-19424:
--

> NPE In MetaDataFormatters
> -
>
> Key: HIVE-19424
> URL: https://issues.apache.org/jira/browse/HIVE-19424
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Standalone Metastore
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-19424.1.patch, HIVE-19424.2.patch
>
>
> h2. Overview
> According to the Hive Schema definition, a table's {{INPUT_FORMAT}} class can 
> be set to NULL.  However, there are places in the code where we do not 
> account for this NULL value, in particular the {{MetaDataFormatters}} classes 
> {{TextMetaDataFormatter}} and {{JsonMetaDataFormatter}}.  In addition, there 
> is no debug level logging in the {{MetaDataFormatters}} classes to tell me 
> which table in particular is causing the problem.
> {code:sql|title=hive-schema-2.2.0.mysql.sql}
> CREATE TABLE IF NOT EXISTS `SDS` (
>   `SD_ID` bigint(20) NOT NULL,
>   `CD_ID` bigint(20) DEFAULT NULL,
>   `INPUT_FORMAT` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin 
> DEFAULT NULL,
>   `IS_COMPRESSED` bit(1) NOT NULL,
> ...
> {code}
> {code:java|title=TextMetaDataFormatter.java}
> // Not checking for a null return from getInputFormatClass
> inputFormattCls = par.getInputFormatClass().getName();
> outputFormattCls = par.getOutputFormatClass().getName();
> {code}
> h2. Reproduction
> {code:sql}
> -- MySQL Backend
> update SDS SET INPUT_FORMAT=NULL WHERE SD_ID=XXX;
> {code}
> {code}
> // Hive
> SHOW TABLE EXTENDED FROM default LIKE '*';
> // HS2 Logs
> [HiveServer2-Background-Pool: Thread-464]: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Exception while processing show table 
> status
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Exception while 
> processing show table status
>   at 
> org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:3025)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:405)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:99)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2052)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1748)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1501)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1285)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1280)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
>   ... 11 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.metadata.formatting.TextMetaDataFormatter.showTableStatus(TextMetaDataFormatter.java:202)
>   at 
> org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:3020)
>   ... 20 more
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16882) Improvements For Avro SerDe Package

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553667#comment-16553667
 ] 

Hive QA commented on HIVE-16882:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932769/HIVE-16882.9.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 14683 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_charvarchar] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_date] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_deserialize_map_null]
 (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_nullable_fields] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_timestamp] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_null] 
(batchId=89)
org.apache.hadoop.hive.hbase.TestHBaseSerDe.testHBaseSerDeWithAvroExternalSchema
 (batchId=195)
org.apache.hadoop.hive.hbase.TestHBaseSerDe.testHBaseSerDeWithAvroSerClass 
(batchId=195)
org.apache.hadoop.hive.hbase.TestHBaseSerDe.testHBaseSerDeWithHiveMapToHBaseAvroColumnFamily
 (batchId=195)
org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeMapWithNullablePrimitiveValues
 (batchId=318)
org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeMapsWithJavaLangStringKeys
 (batchId=318)
org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeMapsWithPrimitiveKeys
 (batchId=318)
org.apache.hadoop.hive.serde2.avro.TestAvroSerializer.canSerializeMapOfDecimals 
(batchId=318)
org.apache.hadoop.hive.serde2.avro.TestAvroSerializer.canSerializeMaps 
(batchId=318)
org.apache.hadoop.hive.serde2.avro.TestAvroSerializer.canSerializeMapsWithNullableComplexValues
 (batchId=318)
org.apache.hadoop.hive.serde2.avro.TestAvroSerializer.canSerializeMapsWithNullablePrimitiveValues
 (batchId=318)
org.apache.hadoop.hive.serde2.avro.TestAvroSerializer.canSerializeNullableMaps 
(batchId=318)
org.apache.hive.hcatalog.pig.TestAvroHCatLoader.testColumnarStorePushdown 
(batchId=205)
org.apache.hive.hcatalog.pig.TestAvroHCatLoader.testColumnarStorePushdown2 
(batchId=205)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData[5]
 (batchId=205)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[5]
 (batchId=205)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12806/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12806/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12806/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932769 - PreCommit-HIVE-Build

> Improvements For Avro SerDe Package
> ---
>
> Key: HIVE-16882
> URL: https://issues.apache.org/jira/browse/HIVE-16882
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16882.1.patch, HIVE-16882.2.patch, 
> HIVE-16882.3.patch, HIVE-16882.4.patch, HIVE-16882.5.patch, 
> HIVE-16882.6.patch, HIVE-16882.7.patch, HIVE-16882.8.patch, HIVE-16882.9.patch
>
>
> # Use SLF4J parameter DEBUG logging
> # Use re-usable libraries where appropriate
> # Use enhanced for loops where appropriate
> # Fix several minor check-style errors
> # Small performance enhancements in InstanceCache



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20207) Vectorization: Fix NULL / Wrong Results issues in Filter / Compare

2018-07-23 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553665#comment-16553665
 ] 

Teddy Choi commented on HIVE-20207:
---

LGTM +1. Thanks for fixing my mistakes. :D

> Vectorization: Fix NULL / Wrong Results issues in Filter / Compare
> --
>
> Key: HIVE-20207
> URL: https://issues.apache.org/jira/browse/HIVE-20207
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20207.01.patch, HIVE-20207.02.patch, 
> HIVE-20207.03.patch, HIVE-20207.04.patch, HIVE-20207.05.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized filter and compare.
> BUGS:
> 1) LongColLessLongColumn SIMD optimization does not work for very large 
> integers:
>  -7272907770454997143 < 8976171455044006767
>  outputVector[i] = (vector1[i] - vector2[i]) >>> 63;
>  Produces 0 instead of 1...
> Also, add DECIMAL_64 testing. Add missing DECIMAL/DECIMAL_64 Comparison and 
> IF vectorized expression classes.
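> The wrap-around is easy to reproduce outside the generated code; here is a 
> minimal, self-contained Java check (not part of the patch) using the values 
> quoted above:
> {code:java}
> public class LongCompareOverflow {
>   public static void main(String[] args) {
>     long a = -7272907770454997143L;
>     long b = 8976171455044006767L;
>     // a - b underflows past Long.MIN_VALUE and wraps to a positive value,
>     // so the sign-bit trick reports "not less than".
>     long viaSubtraction = (a - b) >>> 63;    // 0 (wrong)
>     long viaComparison = (a < b) ? 1L : 0L;  // 1 (expected)
>     System.out.println(viaSubtraction + " vs " + viaComparison);
>   }
> }
> {code}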



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20117) schema changes for txn stats

2018-07-23 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553649#comment-16553649
 ] 

Vineet Garg commented on HIVE-20117:


I have already created the RC for 3.1 since this wasn't a blocker. This is no 
longer required for 3.1 (branch-3.1), but it should probably go into 3.2 on 
branch-3.

> schema changes for txn stats
> 
>
> Key: HIVE-20117
> URL: https://issues.apache.org/jira/browse/HIVE-20117
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20117.01.patch, HIVE-20117.02.patch, 
> HIVE-20117.03.patch, HIVE-20117.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-20122) Deploy and test standalone metastore for Hive 3.1

2018-07-23 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar resolved HIVE-20122.

Resolution: Won't Fix

> Deploy and test standalone metastore for Hive 3.1
> -
>
> Key: HIVE-20122
> URL: https://issues.apache.org/jira/browse/HIVE-20122
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> Creating a blocker JIRA for 3.1 so that this does not slip under radar. This 
> jira tracks testing effort for standalone metastore for 3.1 release. I will 
> create sub-tasks if I find any issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20122) Deploy and test standalone metastore for Hive 3.1

2018-07-23 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553646#comment-16553646
 ] 

Vihang Karajgaonkar commented on HIVE-20122:


Deployed the standalone metastore using the instructions documented at 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+3.0+Administration#AdminManualMetastore3.0Administration-RunningtheMetastoreWithoutHive

Tested basic operations (create, drop, show partitions, insert, select with a 
join) using Impala to make sure non-Hive clients work as expected.

> Deploy and test standalone metastore for Hive 3.1
> -
>
> Key: HIVE-20122
> URL: https://issues.apache.org/jira/browse/HIVE-20122
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Blocker
>
> Creating a blocker JIRA for 3.1 so that this does not slip under radar. This 
> jira tracks testing effort for standalone metastore for 3.1 release. I will 
> create sub-tasks if I find any issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20117) schema changes for txn stats

2018-07-23 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553642#comment-16553642
 ] 

Sergey Shelukhin commented on HIVE-20117:
-

[~vgarg] ping?

> schema changes for txn stats
> 
>
> Key: HIVE-20117
> URL: https://issues.apache.org/jira/browse/HIVE-20117
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20117.01.patch, HIVE-20117.02.patch, 
> HIVE-20117.03.patch, HIVE-20117.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19935) Hive WM session killed: Failed to update LLAP tasks count

2018-07-23 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553640#comment-16553640
 ] 

Prasanth Jayachandran commented on HIVE-19935:
--

+1

> Hive WM session killed: Failed to update LLAP tasks count
> -
>
> Key: HIVE-19935
> URL: https://issues.apache.org/jira/browse/HIVE-19935
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
>Reporter: Thai Bui
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-19935.patch
>
>
> I'm getting this error with the WM feature quite frequently. It causes AM 
> containers to shut down and new ones to be created to replace them.
> {noformat}
> 018-06-18T19:06:49,969 INFO [Thread-250] 
> monitoring.RenderStrategy$LogToFileFunction: Map 1: 313(+270)/641
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] 
> metastore.HiveMetaStore: 4: get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] HiveMetaStore.audit: 
> ugi=hive ip=unknown-ip-addr cmd=get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:50,204 INFO [pool-29-thread-1] tez.TriggerValidatorRunnable: 
> Query: hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4. Trigger { 
> name: alluxio_medium, expression: ALLUXIO_BYTES_READ >
> 6442450944, action: MOVE TO medium } violated. Current value: 7184667126. 
> Applying action.
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] tez.WorkloadManager: Queued 
> move session: 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to 
> medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Processing current events
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Handling move session event: 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Subscribed to counters: [S3A_BYTES_READ, BYTES_READ, 
> ALLUXIO_BYTES_READ]
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] 
> tez.KillMoveTriggerActionHandler: Moved session 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c to pool medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.GuaranteedTasksAllocator: Updating 49be39e5-875c-4cfe-8601-7fe84dd57e0c 
> with 144 guaranteed tasks
> 2018-06-18T19:06:50,205 INFO [Workload management master] tez.WmEvent: Added 
> WMEvent: EventType: MOVE EventStartTimestamp: 1529348810205 elapsedTime: 0 
> wmTezSessionInfo:SessionId: 49be39e5-875c-4cfe-8601-7fe
> 84dd57e0c Pool: medium Cluster %: 30.0
> 2018-06-18T19:06:50,234 INFO [StateChangeNotificationHandler] 
> impl.ZkRegistryBase$InstanceStateChangeListener: CHILD_UPDATED for zknode 
> /user-hive/llap/workers/worker-001571
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionPool: AM for 49be39e5-875c-4cfe-8601-7fe84dd57e0c, v.1571 has 
> updated; updating [sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, qu
> eueName=llap, user=hive, doAs=false, isOpen=true, isDefault=true, expires in 
> 586277120ms, WM state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, killR
> eason=null] with an endpoint at 32769
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionState: Ignoring an outdated info update 1571: TezAmInstance 
> [49be39e5-875c-4cfe-8601-7fe84dd57e0c, host=ip-10-8-121-231.data.bazaar
> voice.com, rpcPort=33365, pluginPort=32769, token=null]
> 2018-06-18T19:06:50,323 ERROR [TaskCommunicator # 4] 
> tez.GuaranteedTasksAllocator: Failed to update guaranteed tasks count for the 
> session sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, queueName=llap, user=
> hive, doAs=false, isOpen=true, isDefault=true, expires in 586277032ms, WM 
> state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, 
> killReason=null
> com.google.protobuf.ServiceException: 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.checkAndSendGuaranteedStateUpdate(LlapTaskSchedulerService.java:596)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.updateGuaranteedCount(LlapTaskSchedulerService.java:581)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.updateQuery(LlapTaskSchedulerService.java:3041)
> at 
> org.apache.hadoop.hive.llap.tezplugins.endpoint.LlapPluginServerImpl.updateQuery(LlapPluginServerImpl.java:57)
> at 
> 

[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-23 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: Patch Available  (was: In Progress)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.
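> For reference, a minimal standalone sketch of the estimate described above 
> (the class name and the threshold handling are illustrative, not Hive's 
> actual handler):
> {code:java}
> import java.lang.management.ManagementFactory;
> import java.lang.management.MemoryUsage;
> 
> public class HeapFractionSketch {
>   public static void main(String[] args) {
>     MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
>     // getUsed() also counts garbage that has not been collected yet, which is
>     // why the ratio can exceed the threshold even when a GC would free space.
>     double usedFraction = (double) heap.getUsed() / heap.getMax();
>     double maxAllowed = 0.90; // hive.mapjoin.localtask.max.memory.usage default
>     if (usedFraction > maxAllowed) {
>       throw new RuntimeException("Mapjoin memory threshold exceeded: " + usedFraction);
>     }
>   }
> }
> {code}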



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-23 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: In Progress  (was: Patch Available)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-23 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: HIVE-17684.03.patch

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19935) Hive WM session killed: Failed to update LLAP tasks count

2018-07-23 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553638#comment-16553638
 ] 

Sergey Shelukhin edited comment on HIVE-19935 at 7/24/18 12:51 AM:
---

[~prasanth_j] can you take a look? a small race condition if the task is 
deallocated just before we  send an update.


was (Author: sershe):
[~prasanth_j] can you take a look? a small race condition if the task is 
deallocated just before we are going to send an update.

> Hive WM session killed: Failed to update LLAP tasks count
> -
>
> Key: HIVE-19935
> URL: https://issues.apache.org/jira/browse/HIVE-19935
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
>Reporter: Thai Bui
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-19935.patch
>
>
> I'm getting this error with the WM feature quite frequently. It causes AM 
> containers to shut down and new ones to be created to replace them.
> {noformat}
> 018-06-18T19:06:49,969 INFO [Thread-250] 
> monitoring.RenderStrategy$LogToFileFunction: Map 1: 313(+270)/641
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] 
> metastore.HiveMetaStore: 4: get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] HiveMetaStore.audit: 
> ugi=hive ip=unknown-ip-addr cmd=get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:50,204 INFO [pool-29-thread-1] tez.TriggerValidatorRunnable: 
> Query: hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4. Trigger { 
> name: alluxio_medium, expression: ALLUXIO_BYTES_READ >
> 6442450944, action: MOVE TO medium } violated. Current value: 7184667126. 
> Applying action.
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] tez.WorkloadManager: Queued 
> move session: 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to 
> medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Processing current events
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Handling move session event: 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Subscribed to counters: [S3A_BYTES_READ, BYTES_READ, 
> ALLUXIO_BYTES_READ]
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] 
> tez.KillMoveTriggerActionHandler: Moved session 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c to pool medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.GuaranteedTasksAllocator: Updating 49be39e5-875c-4cfe-8601-7fe84dd57e0c 
> with 144 guaranteed tasks
> 2018-06-18T19:06:50,205 INFO [Workload management master] tez.WmEvent: Added 
> WMEvent: EventType: MOVE EventStartTimestamp: 1529348810205 elapsedTime: 0 
> wmTezSessionInfo:SessionId: 49be39e5-875c-4cfe-8601-7fe
> 84dd57e0c Pool: medium Cluster %: 30.0
> 2018-06-18T19:06:50,234 INFO [StateChangeNotificationHandler] 
> impl.ZkRegistryBase$InstanceStateChangeListener: CHILD_UPDATED for zknode 
> /user-hive/llap/workers/worker-001571
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionPool: AM for 49be39e5-875c-4cfe-8601-7fe84dd57e0c, v.1571 has 
> updated; updating [sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, qu
> eueName=llap, user=hive, doAs=false, isOpen=true, isDefault=true, expires in 
> 586277120ms, WM state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, killR
> eason=null] with an endpoint at 32769
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionState: Ignoring an outdated info update 1571: TezAmInstance 
> [49be39e5-875c-4cfe-8601-7fe84dd57e0c, host=ip-10-8-121-231.data.bazaar
> voice.com, rpcPort=33365, pluginPort=32769, token=null]
> 2018-06-18T19:06:50,323 ERROR [TaskCommunicator # 4] 
> tez.GuaranteedTasksAllocator: Failed to update guaranteed tasks count for the 
> session sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, queueName=llap, user=
> hive, doAs=false, isOpen=true, isDefault=true, expires in 586277032ms, WM 
> state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, 
> killReason=null
> com.google.protobuf.ServiceException: 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.checkAndSendGuaranteedStateUpdate(LlapTaskSchedulerService.java:596)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.updateGuaranteedCount(LlapTaskSchedulerService.java:581)
> at 
> 

[jira] [Updated] (HIVE-19935) Hive WM session killed: Failed to update LLAP tasks count

2018-07-23 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19935:

Status: Patch Available  (was: Open)

[~prasanth_j] can you take a look? a small race condition if the task is 
deallocated just before we are going to send an update.

> Hive WM session killed: Failed to update LLAP tasks count
> -
>
> Key: HIVE-19935
> URL: https://issues.apache.org/jira/browse/HIVE-19935
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
>Reporter: Thai Bui
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-19935.patch
>
>
> I'm getting this error with the WM feature quite frequently. It causes AM 
> containers to shut down and new ones to be created to replace them.
> {noformat}
> 018-06-18T19:06:49,969 INFO [Thread-250] 
> monitoring.RenderStrategy$LogToFileFunction: Map 1: 313(+270)/641
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] 
> metastore.HiveMetaStore: 4: get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] HiveMetaStore.audit: 
> ugi=hive ip=unknown-ip-addr cmd=get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:50,204 INFO [pool-29-thread-1] tez.TriggerValidatorRunnable: 
> Query: hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4. Trigger { 
> name: alluxio_medium, expression: ALLUXIO_BYTES_READ >
> 6442450944, action: MOVE TO medium } violated. Current value: 7184667126. 
> Applying action.
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] tez.WorkloadManager: Queued 
> move session: 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to 
> medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Processing current events
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Handling move session event: 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Subscribed to counters: [S3A_BYTES_READ, BYTES_READ, 
> ALLUXIO_BYTES_READ]
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] 
> tez.KillMoveTriggerActionHandler: Moved session 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c to pool medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.GuaranteedTasksAllocator: Updating 49be39e5-875c-4cfe-8601-7fe84dd57e0c 
> with 144 guaranteed tasks
> 2018-06-18T19:06:50,205 INFO [Workload management master] tez.WmEvent: Added 
> WMEvent: EventType: MOVE EventStartTimestamp: 1529348810205 elapsedTime: 0 
> wmTezSessionInfo:SessionId: 49be39e5-875c-4cfe-8601-7fe
> 84dd57e0c Pool: medium Cluster %: 30.0
> 2018-06-18T19:06:50,234 INFO [StateChangeNotificationHandler] 
> impl.ZkRegistryBase$InstanceStateChangeListener: CHILD_UPDATED for zknode 
> /user-hive/llap/workers/worker-001571
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionPool: AM for 49be39e5-875c-4cfe-8601-7fe84dd57e0c, v.1571 has 
> updated; updating [sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, qu
> eueName=llap, user=hive, doAs=false, isOpen=true, isDefault=true, expires in 
> 586277120ms, WM state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, killR
> eason=null] with an endpoint at 32769
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionState: Ignoring an outdated info update 1571: TezAmInstance 
> [49be39e5-875c-4cfe-8601-7fe84dd57e0c, host=ip-10-8-121-231.data.bazaar
> voice.com, rpcPort=33365, pluginPort=32769, token=null]
> 2018-06-18T19:06:50,323 ERROR [TaskCommunicator # 4] 
> tez.GuaranteedTasksAllocator: Failed to update guaranteed tasks count for the 
> session sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, queueName=llap, user=
> hive, doAs=false, isOpen=true, isDefault=true, expires in 586277032ms, WM 
> state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, 
> killReason=null
> com.google.protobuf.ServiceException: 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.checkAndSendGuaranteedStateUpdate(LlapTaskSchedulerService.java:596)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.updateGuaranteedCount(LlapTaskSchedulerService.java:581)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.updateQuery(LlapTaskSchedulerService.java:3041)
> at 
> 

[jira] [Updated] (HIVE-19935) Hive WM session killed: Failed to update LLAP tasks count

2018-07-23 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19935:

Attachment: HIVE-19935.patch

> Hive WM session killed: Failed to update LLAP tasks count
> -
>
> Key: HIVE-19935
> URL: https://issues.apache.org/jira/browse/HIVE-19935
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
>Reporter: Thai Bui
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-19935.patch
>
>
> I'm getting this error with the WM feature quite frequently. It causes AM 
> containers to shut down and new ones to be created to replace them.
> {noformat}
> 018-06-18T19:06:49,969 INFO [Thread-250] 
> monitoring.RenderStrategy$LogToFileFunction: Map 1: 313(+270)/641
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] 
> metastore.HiveMetaStore: 4: get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] HiveMetaStore.audit: 
> ugi=hive ip=unknown-ip-addr cmd=get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:50,204 INFO [pool-29-thread-1] tez.TriggerValidatorRunnable: 
> Query: hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4. Trigger { 
> name: alluxio_medium, expression: ALLUXIO_BYTES_READ >
> 6442450944, action: MOVE TO medium } violated. Current value: 7184667126. 
> Applying action.
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] tez.WorkloadManager: Queued 
> move session: 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to 
> medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Processing current events
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Handling move session event: 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Subscribed to counters: [S3A_BYTES_READ, BYTES_READ, 
> ALLUXIO_BYTES_READ]
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] 
> tez.KillMoveTriggerActionHandler: Moved session 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c to pool medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.GuaranteedTasksAllocator: Updating 49be39e5-875c-4cfe-8601-7fe84dd57e0c 
> with 144 guaranteed tasks
> 2018-06-18T19:06:50,205 INFO [Workload management master] tez.WmEvent: Added 
> WMEvent: EventType: MOVE EventStartTimestamp: 1529348810205 elapsedTime: 0 
> wmTezSessionInfo:SessionId: 49be39e5-875c-4cfe-8601-7fe
> 84dd57e0c Pool: medium Cluster %: 30.0
> 2018-06-18T19:06:50,234 INFO [StateChangeNotificationHandler] 
> impl.ZkRegistryBase$InstanceStateChangeListener: CHILD_UPDATED for zknode 
> /user-hive/llap/workers/worker-001571
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionPool: AM for 49be39e5-875c-4cfe-8601-7fe84dd57e0c, v.1571 has 
> updated; updating [sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, qu
> eueName=llap, user=hive, doAs=false, isOpen=true, isDefault=true, expires in 
> 586277120ms, WM state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, killR
> eason=null] with an endpoint at 32769
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionState: Ignoring an outdated info update 1571: TezAmInstance 
> [49be39e5-875c-4cfe-8601-7fe84dd57e0c, host=ip-10-8-121-231.data.bazaar
> voice.com, rpcPort=33365, pluginPort=32769, token=null]
> 2018-06-18T19:06:50,323 ERROR [TaskCommunicator # 4] 
> tez.GuaranteedTasksAllocator: Failed to update guaranteed tasks count for the 
> session sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, queueName=llap, user=
> hive, doAs=false, isOpen=true, isDefault=true, expires in 586277032ms, WM 
> state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, 
> killReason=null
> com.google.protobuf.ServiceException: 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.checkAndSendGuaranteedStateUpdate(LlapTaskSchedulerService.java:596)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.updateGuaranteedCount(LlapTaskSchedulerService.java:581)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.updateQuery(LlapTaskSchedulerService.java:3041)
> at 
> org.apache.hadoop.hive.llap.tezplugins.endpoint.LlapPluginServerImpl.updateQuery(LlapPluginServerImpl.java:57)
> at 
> 

[jira] [Commented] (HIVE-16882) Improvements For Avro SerDe Package

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553632#comment-16553632
 ] 

Hive QA commented on HIVE-16882:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} serde in master has 195 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} serde: The patch generated 1 new + 37 unchanged - 5 
fixed = 38 total (was 42) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} serde generated 0 new + 194 unchanged - 1 fixed = 
194 total (was 195) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12806/dev-support/hive-personality.sh
 |
| git revision | master / ed4fa73 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12806/yetus/diff-checkstyle-serde.txt
 |
| modules | C: serde U: serde |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12806/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Improvements For Avro SerDe Package
> ---
>
> Key: HIVE-16882
> URL: https://issues.apache.org/jira/browse/HIVE-16882
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16882.1.patch, HIVE-16882.2.patch, 
> HIVE-16882.3.patch, HIVE-16882.4.patch, HIVE-16882.5.patch, 
> HIVE-16882.6.patch, HIVE-16882.7.patch, HIVE-16882.8.patch, HIVE-16882.9.patch
>
>
> # Use SLF4J parameter DEBUG logging
> # Use re-usable libraries where appropriate
> # Use enhanced for loops where appropriate
> # Fix several minor check-style errors
> # Small performance enhancements in InstanceCache



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20227) Exclude glassfish javax.el dependency

2018-07-23 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553621#comment-16553621
 ] 

Vihang Karajgaonkar commented on HIVE-20227:


Okay. Sounds good.

> Exclude glassfish javax.el dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20207) Vectorization: Fix NULL / Wrong Results issues in Filter / Compare

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553619#comment-16553619
 ] 

Hive QA commented on HIVE-20207:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932764/HIVE-20207.05.patch

{color:green}SUCCESS:{color} +1 due to 12 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 14788 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_dynamic_partition]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_expressions]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test1]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_alter]
 (batchId=192)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_insert]
 (batchId=192)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12805/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12805/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12805/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932764 - PreCommit-HIVE-Build

> Vectorization: Fix NULL / Wrong Results issues in Filter / Compare
> --
>
> Key: HIVE-20207
> URL: https://issues.apache.org/jira/browse/HIVE-20207
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20207.01.patch, HIVE-20207.02.patch, 
> HIVE-20207.03.patch, HIVE-20207.04.patch, HIVE-20207.05.patch
>
>
> Write new UT tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized filter and compare.
> BUGS:
> 1) LongColLessLongColumn SIMD optimization does not work for very large 
> integers:
>  -7272907770454997143 < 8976171455044006767
>  outputVector[i] = (vector1[i] - vector2[i]) >>> 63;
>  Produces 0 instead of 1...
> Also, add DECIMAL_64 testing. Add missing DECIMAL/DECIMAL_64 Comparison and 
> IF vectorized expression classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20156) Printing Stacktrace to STDERR

2018-07-23 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-20156:
--
Attachment: HIVE-20156.1.patch
Status: Patch Available  (was: Open)

> Printing Stacktrace to STDERR
> -
>
> Key: HIVE-20156
> URL: https://issues.apache.org/jira/browse/HIVE-20156
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Andrew Sherman
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20156.1.patch
>
>
> Class {{org.apache.hadoop.hive.ql.exec.JoinOperator}} has the following code:
> {code}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(e);
> }
> {code}
> Do not print the stack trace to STDERR with a call to {{printStackTrace()}}.  
> Please remove that line and let the code catching the {{HiveException}} worry 
> about printing any messages through a logger.
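> A cleaned-up form could be as simple as the following (illustration only, not 
> the attached patch):
> {code:java}
> } catch (Exception e) {
>   // Let whoever catches the HiveException decide how to log it.
>   throw new HiveException(e);
> }
> {code}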



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20156) Printing Stacktrace to STDERR

2018-07-23 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-20156:
-

Assignee: Andrew Sherman

> Printing Stacktrace to STDERR
> -
>
> Key: HIVE-20156
> URL: https://issues.apache.org/jira/browse/HIVE-20156
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Andrew Sherman
>Priority: Minor
>  Labels: newbie, noob
>
> Class {{org.apache.hadoop.hive.ql.exec.JoinOperator}} has the following code:
> {code}
> } catch (Exception e) {
>   e.printStackTrace();
>   throw new HiveException(e);
> }
> {code}
> Do not print the stack trace to STDERR with a call to {{printStackTrace()}}.  
> Please remove that line and let the code catching the {{HiveException}} worry 
> about printing any messages through a logger.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20158) Do Not Print StackTraces to STDERR in Base64TextOutputFormat

2018-07-23 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-20158:
--
Attachment: HIVE-20158.1.patch
Status: Patch Available  (was: Open)

> Do Not Print StackTraces to STDERR in Base64TextOutputFormat
> 
>
> Key: HIVE-20158
> URL: https://issues.apache.org/jira/browse/HIVE-20158
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Andrew Sherman
>Priority: Trivial
>  Labels: newbie, noob
> Attachments: HIVE-20158.1.patch
>
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/base64/Base64TextOutputFormat.java
> {code}
>   try {
> String signatureString = 
> job.get("base64.text.output.format.signature");
> if (signatureString != null) {
>   signature = signatureString.getBytes("UTF-8");
> } else {
>   signature = new byte[0];
> }
>   } catch (UnsupportedEncodingException e) {
> e.printStackTrace();
>   }
> {code}
> The {{UnsupportedEncodingException}} is coming from the {{getBytes}} method 
> call.  Instead, use the {{Charset}} version of the method, which doesn't 
> throw this checked exception, so the 'try' block can simply be removed.  
> Every JVM supports UTF-8.
> https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#getBytes(java.nio.charset.Charset)
> https://docs.oracle.com/javase/7/docs/api/java/nio/charset/StandardCharsets.html#UTF_8
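> With the {{Charset}} overload the whole try/catch disappears; a possible shape 
> (illustration only, reusing the {{job}} and {{signature}} names from the 
> snippet above):
> {code:java}
> import java.nio.charset.StandardCharsets;
> 
> // getBytes(Charset) cannot throw UnsupportedEncodingException, and UTF-8 is
> // guaranteed to be present on every JVM.
> String signatureString = job.get("base64.text.output.format.signature");
> signature = signatureString != null
>     ? signatureString.getBytes(StandardCharsets.UTF_8)
>     : new byte[0];
> {code}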



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20158) Do Not Print StackTraces to STDERR in Base64TextOutputFormat

2018-07-23 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-20158:
-

Assignee: Andrew Sherman

> Do Not Print StackTraces to STDERR in Base64TextOutputFormat
> 
>
> Key: HIVE-20158
> URL: https://issues.apache.org/jira/browse/HIVE-20158
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Andrew Sherman
>Priority: Trivial
>  Labels: newbie, noob
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/base64/Base64TextOutputFormat.java
> {code}
>   try {
> String signatureString = 
> job.get("base64.text.output.format.signature");
> if (signatureString != null) {
>   signature = signatureString.getBytes("UTF-8");
> } else {
>   signature = new byte[0];
> }
>   } catch (UnsupportedEncodingException e) {
> e.printStackTrace();
>   }
> {code}
> The {{UnsupportedEncodingException}} is coming from the {{getBytes}} method 
> call.  Instead, use the {{Charset}} version of the method, which doesn't 
> throw this checked exception, so the 'try' block can simply be removed.  
> Every JVM supports UTF-8.
> https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#getBytes(java.nio.charset.Charset)
> https://docs.oracle.com/javase/7/docs/api/java/nio/charset/StandardCharsets.html#UTF_8



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20227) Exclude glassfish javax.el dependency

2018-07-23 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553604#comment-16553604
 ] 

Vineet Garg commented on HIVE-20227:


[~vihangk1] Thanks! We don't really need to run full ptests on this patch (I 
have already run the TestHBaseCliDriver tests locally). I intend to upload another 
patch for master where we can run the full tests.

> Exclude glassfish javax.el dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20227) Exclude glassfish javax.el dependency

2018-07-23 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553602#comment-16553602
 ] 

Vihang Karajgaonkar commented on HIVE-20227:


Created the profile for branch-3.1.

> Exclude glassfish javax.el dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20207) Vectorization: Fix NULL / Wrong Results issues in Filter / Compare

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553599#comment-16553599
 ] 

Hive QA commented on HIVE-20207:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
37s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
22s{color} | {color:blue} storage-api in master has 48 extant Findbugs 
warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 14m 
12s{color} | {color:red} branch/itests/hive-jmh cannot run convertXmlToText 
from findbugs {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
56s{color} | {color:blue} ql in master has 2280 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
19s{color} | {color:red} hive-jmh in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
27s{color} | {color:red} ql in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
19s{color} | {color:red} hive-jmh in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 19s{color} 
| {color:red} hive-jmh in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
10s{color} | {color:red} itests/hive-jmh: The patch generated 1 new + 20 
unchanged - 0 fixed = 21 total (was 20) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
46s{color} | {color:red} ql: The patch generated 86 new + 1036 unchanged - 85 
fixed = 1122 total (was 1121) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
11s{color} | {color:red} vector-code-gen: The patch generated 3 new + 319 
unchanged - 0 fixed = 322 total (was 319) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
16s{color} | {color:red} hive-jmh in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
15s{color} | {color:red} ql generated 17 new + 2280 unchanged - 0 fixed = 2297 
total (was 2280) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Redundant nullcheck of undecoratedTypeName, which is known to be non-null 
in 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getDecimal64VectorExpressionForUdf(GenericUDF,
 Class, List, int, VectorExpressionDescriptor$Mode, TypeInfo)  Redundant null 
check at VectorizationContext.java:is known to be non-null in 
org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getDecimal64VectorExpressionForUdf(GenericUDF,
 Class, List, int, VectorExpressionDescriptor$Mode, TypeInfo)  Redundant null 
check at VectorizationContext.java:[line 1595] |
|  |  Class 
org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColEqualDecimalScalar
 defines non-transient non-serializable instance field value  In 
DecimalColEqualDecimalScalar.java:instance field value  In 
DecimalColEqualDecimalScalar.java |
|  |  Class 
org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColGreaterDecimalScalar
 

[jira] [Commented] (HIVE-20227) Exclude glassfish javax.el dependency

2018-07-23 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553595#comment-16553595
 ] 

Vihang Karajgaonkar commented on HIVE-20227:


The patch name is wrong, so precommit will fail. Also, we don't have a profile 
for branch-3.1 yet, as far as I know. Let me create it.

> Exclude glassfish javax.el dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19766) Show the number of rows inserted when execution engine is Spark

2018-07-23 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553593#comment-16553593
 ] 

Sahil Takiar commented on HIVE-19766:
-

The general approach here LGTM

> Show the number of rows inserted when execution engine is Spark
> ---
>
> Key: HIVE-19766
> URL: https://issues.apache.org/jira/browse/HIVE-19766
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-19766.1.patch
>
>
> Currently, when an insert query is run, the beeline output shows "No rows affected".
> The logic to show the number of rows inserted is currently present only when the 
> execution engine is MR.
> This Jira is to make this work with Spark.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17683) Add explain locks command

2018-07-23 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17683:
--
Release Note: 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain#LanguageManualExplain-TneLOCKSClause

> Add explain locks  command
> ---
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch, 
> HIVE-17683.03.patch, HIVE-17683.04.patch, HIVE-17683.05.patch, 
> HIVE-17683.06.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19935) Hive WM session killed: Failed to update LLAP tasks count

2018-07-23 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19935:
---

Assignee: Sergey Shelukhin  (was: Thai Bui)

> Hive WM session killed: Failed to update LLAP tasks count
> -
>
> Key: HIVE-19935
> URL: https://issues.apache.org/jira/browse/HIVE-19935
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
>Reporter: Thai Bui
>Assignee: Sergey Shelukhin
>Priority: Minor
>
> I'm getting this error with the WM feature quite frequently. It causes AM 
> containers to shut down and new ones to be created to replace them.
> {noformat}
> 018-06-18T19:06:49,969 INFO [Thread-250] 
> monitoring.RenderStrategy$LogToFileFunction: Map 1: 313(+270)/641
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] 
> metastore.HiveMetaStore: 4: get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:49,988 INFO [NotificationEventPoll 0] HiveMetaStore.audit: 
> ugi=hive ip=unknown-ip-addr cmd=get_config_value: 
> name=metastore.batch.retrieve.max defaultValue=50
> 2018-06-18T19:06:50,204 INFO [pool-29-thread-1] tez.TriggerValidatorRunnable: 
> Query: hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4. Trigger { 
> name: alluxio_medium, expression: ALLUXIO_BYTES_READ >
> 6442450944, action: MOVE TO medium } violated. Current value: 7184667126. 
> Applying action.
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] tez.WorkloadManager: Queued 
> move session: 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to 
> medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Processing current events
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Handling move session event: 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c moving from default to medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.WorkloadManager: Subscribed to counters: [S3A_BYTES_READ, BYTES_READ, 
> ALLUXIO_BYTES_READ]
> 2018-06-18T19:06:50,205 INFO [pool-29-thread-1] 
> tez.KillMoveTriggerActionHandler: Moved session 
> 49be39e5-875c-4cfe-8601-7fe84dd57e0c to pool medium
> 2018-06-18T19:06:50,205 INFO [Workload management master] 
> tez.GuaranteedTasksAllocator: Updating 49be39e5-875c-4cfe-8601-7fe84dd57e0c 
> with 144 guaranteed tasks
> 2018-06-18T19:06:50,205 INFO [Workload management master] tez.WmEvent: Added 
> WMEvent: EventType: MOVE EventStartTimestamp: 1529348810205 elapsedTime: 0 
> wmTezSessionInfo:SessionId: 49be39e5-875c-4cfe-8601-7fe
> 84dd57e0c Pool: medium Cluster %: 30.0
> 2018-06-18T19:06:50,234 INFO [StateChangeNotificationHandler] 
> impl.ZkRegistryBase$InstanceStateChangeListener: CHILD_UPDATED for zknode 
> /user-hive/llap/workers/worker-001571
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionPool: AM for 49be39e5-875c-4cfe-8601-7fe84dd57e0c, v.1571 has 
> updated; updating [sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, qu
> eueName=llap, user=hive, doAs=false, isOpen=true, isDefault=true, expires in 
> 586277120ms, WM state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, killR
> eason=null] with an endpoint at 32769
> 2018-06-18T19:06:50,235 INFO [StateChangeNotificationHandler] 
> tez.TezSessionState: Ignoring an outdated info update 1571: TezAmInstance 
> [49be39e5-875c-4cfe-8601-7fe84dd57e0c, host=ip-10-8-121-231.data.bazaar
> voice.com, rpcPort=33365, pluginPort=32769, token=null]
> 2018-06-18T19:06:50,323 ERROR [TaskCommunicator # 4] 
> tez.GuaranteedTasksAllocator: Failed to update guaranteed tasks count for the 
> session sessionId=49be39e5-875c-4cfe-8601-7fe84dd57e0c, queueName=llap, user=
> hive, doAs=false, isOpen=true, isDefault=true, expires in 586277032ms, WM 
> state poolName=medium, clusterFraction=0.3, 
> queryId=hive_20180618190637_e65869b8-10be-4880-a8d3-84989bd055b4, 
> killReason=null
> com.google.protobuf.ServiceException: 
> org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.checkAndSendGuaranteedStateUpdate(LlapTaskSchedulerService.java:596)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.updateGuaranteedCount(LlapTaskSchedulerService.java:581)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskSchedulerService.updateQuery(LlapTaskSchedulerService.java:3041)
> at 
> org.apache.hadoop.hive.llap.tezplugins.endpoint.LlapPluginServerImpl.updateQuery(LlapPluginServerImpl.java:57)
> at 
> org.apache.hadoop.hive.llap.plugin.rpc.LlapPluginProtocolProtos$LlapPluginProtocol$2.callBlockingMethod(LlapPluginProtocolProtos.java:835)
> at 
> 

[jira] [Commented] (HIVE-20167) apostrophe in midline comment fails with ParseException

2018-07-23 Thread Andrew Sherman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553569#comment-16553569
 ] 

Andrew Sherman commented on HIVE-20167:
---

I tried to reproduce with

{{create table andrew (}}
{{a int -- here's a comment}}
{{);}}

but that seems to work OK. 

> apostrophe in midline comment fails with ParseException
> ---
>
> Key: HIVE-20167
> URL: https://issues.apache.org/jira/browse/HIVE-20167
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.3.2
> Environment: Observed on an AWS EMR cluster. 
> Hive cli, executing script from bash with "hive -f ..." (not interactive).
>  
>Reporter: Trey Fore
>Priority: Minor
>
> This line causes a ParseException:
> {{    , member_id string                  --  standardizing from client's 
> memberID}}
> When the apostrophe is removed, leaving:
> {{    , member_id string                  --  standardizing from clients 
> memberID}}
> the line is parsed correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-23 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553563#comment-16553563
 ] 

Misha Dmitriev commented on HIVE-17684:
---

Will get to this later today or tomorrow.





> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.
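
A minimal, self-contained sketch of the heap-usage ratio described above; the class 
name is hypothetical and the 0.90 threshold merely stands in for 
{{hive.mapjoin.localtask.max.memory.usage}}, so this is an illustration, not Hive's 
actual handler code:

{noformat}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Illustrative sketch of the ratio used by the exhaustion check.
public class HeapUsageRatioSketch {
  public static void main(String[] args) {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
    MemoryUsage heap = bean.getHeapMemoryUsage();

    // getUsed() counts live objects *and* garbage that has not been
    // collected yet, so this ratio can cross the threshold even when a
    // GC would free enough space for the hash table to keep growing.
    double ratio = (double) heap.getUsed() / heap.getMax();
    double threshold = 0.90; // stand-in for the localtask config default

    if (ratio > threshold) {
      System.err.println("Would abort map join load at ratio " + ratio);
    } else {
      System.out.printf("used/max heap ratio = %.2f%n", ratio);
    }
  }
}
{noformat}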



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19733) RemoteSparkJobStatus#getSparkStageProgress inefficient implementation

2018-07-23 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-19733:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> RemoteSparkJobStatus#getSparkStageProgress inefficient implementation
> -
>
> Key: HIVE-19733
> URL: https://issues.apache.org/jira/browse/HIVE-19733
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19733.1.patch, HIVE-19733.2.patch, 
> HIVE-19733.3.patch
>
>
> The implementation of {{RemoteSparkJobStatus#getSparkStageProgress}} is a bit 
> inefficient. There is one RPC call to get the {{SparkJobInfo}} and then for 
> every stage there is another RPC call to get each {{SparkStageInfo}}. This 
> could all be done in a single RPC call.
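
A hedged sketch of the round-trip pattern being discussed; the RPC interface and the 
DTO classes below are illustrative stand-ins, not the actual remote Spark client API:

{noformat}
import java.util.ArrayList;
import java.util.List;

public class StageProgressSketch {

  // Illustrative DTOs and RPC interface, not Hive's real classes.
  static class JobInfo { int[] stageIds; }
  static class StageInfo { int completedTasks; int totalTasks; }

  interface RemoteRpc {
    JobInfo getJobInfo(int jobId);              // 1 round trip
    StageInfo getStageInfo(int stageId);        // 1 round trip per stage
    List<StageInfo> getStageInfos(int jobId);   // batched: 1 round trip total
  }

  // Current pattern: 1 + N remote calls for a job with N stages.
  static List<StageInfo> perStage(RemoteRpc rpc, int jobId) {
    List<StageInfo> result = new ArrayList<>();
    for (int stageId : rpc.getJobInfo(jobId).stageIds) {
      result.add(rpc.getStageInfo(stageId));
    }
    return result;
  }

  // Suggested pattern: the server gathers all stage infos in one call.
  static List<StageInfo> batched(RemoteRpc rpc, int jobId) {
    return rpc.getStageInfos(jobId);
  }
}
{noformat}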



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19416) Create single version transactional table metastore statistics for aggregation queries

2018-07-23 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19416:

Description: 
This adds accurate stats support to Hive transactional (insert-only and full 
ACID) tables, so that some queries from these tables can be answered from stats 
and also so that the stats could be used for more optimization. This support 
can be enabled via a config flag, and is on by default.

This is achieved via the following changes, that basically start us on the path 
of treating ACID stats the same way we treat ACID data:
In addition to existing JSON blob, we store a write ID of the latest stats 
writer with each table and partition. Any writer updating the stats or altering 
the stats state for a txn table has to record his write ID - if the write ID 
and some other context is not provided by the caller of alter, the alter 
table/partition operation fails. It's the responsibility of the writer to not 
commit its transaction if the operation fails.
In future, we'd like to move the stats state into actual stats tables, but for 
now it's (logically) colocated with the existing json parameter.

In addition to its write ID, most callers (with the exception of the ones that 
cannot have races, e.g. create table) provide their own txn state (write ID 
list) for the table. The existing stats' write ID is verified against this 
state. If the write ID is not visible to the updater, we still update the 
stats, but set the stats state to invalid; that basically means that two 
parallel operations that cannot see each others' data output are updating the 
stats.

This way, txn stats stay valid as the result of a sequence of non-conflicting 
updates that can all see each other and account for each others' data for the 
stats. Any parallel updates invalidate the stats.
This is necessary because unlike data, stats are a single version summary of 
the table. To be able to support parallel operations with valid stats, each 
stats update would need to write a separate record, and would also need to 
write mergeable records that only reflect its own changes, instead of the final 
view of the table stats (two of those are hard or impossible to merge).

This approach resulted in a few changes to alter/etc APIs in metastore; it also 
requires that many alter operations, as well as analyze table, allocate a write 
ID (because they affect stats-that-are-treated-like-data, and so are 
essentially a write operation).

The reader, in turn, verifies that the stats are valid and written by a write 
ID that is itself valid given the reader's transactional state (i.e. not 
aborted, nor in progress). This is done on metastore side; if the stats are 
invalid for the reader, we transparently update the stats state returned to the 
caller to mark the stats as inaccurate.

We've considered (and actually implemented) an alternative approach of 
recording the full txn state of the stats writer to be compared with the state 
of the stats reader (to see if they are compatible and avoid the extra write 
IDs and strict write-time checks), however it results in problems with 
partitioned tables, where not all writes affect all partitions, and so the 
stats state of all the untouched partitions becomes invalid once a subset of 
partitions is updated (because we cannot tell whether the write ID, a table 
level operation, didn't touch the partition, or did touch it but didn't record 
the stats). Additionally, storing full txn state for every partition and table 
can be expensive, especially in extreme cases where the watermark doesn't 
advance for a while for some reason.




  was:
This adds accurate stats support to Hive transactional (insert-only and full 
ACID) tables, so that some queries from these tables can be answered from stats 
and also so that the stats could be used for more optimization. This support 
can be enabled via a config flag, and is on by default.

This is achieved via the following changes, that basically start us on the path 
of treating ACID stats the same way we treat ACID data:
In addition to existing JSON blob, we store a write ID of the latest stats 
writer with each table and partition. Any writer updating the stats or altering 
the stats state for a txn table has to record his write ID - if the write ID 
and some other context is not provided by the caller of alter, the alter 
table/partition operation fails. It's the responsibility of the writer to not 
commit its transaction if the operation fails.
In future, we'd like to move the stats state into actual stats tables, but for 
now it's (logically) colocated with the existing json parameter.

In addition to its write ID, most callers (with the exception of the ones that 
cannot have races, e.g. create table) provide their own txn state (write ID 
list) for the table. The existing stats' write ID is verified against this 
state. If the write ID is 

[jira] [Commented] (HIVE-19733) RemoteSparkJobStatus#getSparkStageProgress inefficient implementation

2018-07-23 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553552#comment-16553552
 ] 

Sahil Takiar commented on HIVE-19733:
-

+1 LGTM

> RemoteSparkJobStatus#getSparkStageProgress inefficient implementation
> -
>
> Key: HIVE-19733
> URL: https://issues.apache.org/jira/browse/HIVE-19733
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-19733.1.patch, HIVE-19733.2.patch, 
> HIVE-19733.3.patch
>
>
> The implementation of {{RemoteSparkJobStatus#getSparkStageProgress}} is a bit 
> inefficient. There is one RPC call to get the {{SparkJobInfo}} and then for 
> every stage there is another RPC call to get each {{SparkStageInfo}}. This 
> could all be done in a single RPC call.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19416) Create single version transactional table metastore statistics for aggregation queries

2018-07-23 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553544#comment-16553544
 ] 

Sergey Shelukhin commented on HIVE-19416:
-

Updated the description

> Create single version transactional table metastore statistics for 
> aggregation queries
> --
>
> Key: HIVE-19416
> URL: https://issues.apache.org/jira/browse/HIVE-19416
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
>
> This adds accurate stats support to Hive transactional (insert-only and full 
> ACID) tables, so that some queries from these tables can be answered from 
> stats and also so that the stats could be used for more optimization. This 
> support can be enabled via a config flag, and is on by default.
> This is achieved via the following changes, that basically start us on the 
> path of treating ACID stats the same way we treat ACID data:
> In addition to existing JSON blob, we store a write ID of the latest stats 
> writer with each table and partition. Any writer updating the stats or 
> altering the stats state for a txn table has to record his write ID - if the 
> write ID and some other context is not provided by the caller of alter, the 
> alter table/partition operation fails. It's the responsibility of the writer 
> to not commit its transaction if the operation fails.
> In future, we'd like to move the stats state into actual stats tables, but 
> for now it's (logically) colocated with the existing json parameter.
> In addition to its write ID, most callers (with the exception of the ones 
> that cannot have races, e.g. create table) provide their own txn state (write 
> ID list) for the table. The existing stats' write ID is verified against this 
> state. If the write ID is not visible to the updater, we still update the 
> stats, but set the stats state to invalid; that basically means that two 
> parallel operations that cannot see each others' data output are updating the 
> stats.
> This way, txn stats stay valid as the result of a sequence of non-conflicting 
> updates that can all see each other and account for each others' data for the 
> stats. Any parallel updates invalidate the stats.
> This is necessary because unlike data, stats are a single version summary of 
> the table. To be able to support parallel operations with valid stats, each 
> stats update would need to write a separate record, and would also need to 
> write mergeable records that only reflect its own changes, instead of the 
> final view of the table stats (two of those are hard or impossible to merge).
> This approach resulted in a few changes to alter/etc APIs in metastore; it 
> also requires that many alter operations, as well as analyze table, allocate 
> a write ID (because they affect stats-that-are-treated-like-data, and so are 
> essentially a write operation).
> The reader, in turn, verifies that the stats are valid and written by a write 
> ID that is itself valid given the reader's transactional state (i.e. not 
> aborted, nor in progress).
> We've considered (and actually implemented) an alternative approach of 
> recording the full txn state of the stats writer to be compared with the 
> state of the stats reader (to see if they are compatible and avoid the extra 
> write IDs and strict write-time checks), however it results in problems with 
> partitioned tables, where not all writes affect all partitions, and so the 
> stats state of all the untouched partitions becomes invalid once a subset of 
> partitions is updated (because we cannot tell whether the write ID, a table 
> level operation, didn't touch the partition, or did touch it but didn't 
> record the stats). Additionally, storing full txn state for every partition 
> and table can be expensive, especially in extreme cases where the watermark 
> doesn't advance for a while for some reason.
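
A minimal sketch, under assumed names, of the reader-side check described above; 
{{ReaderTxnState}} and {{statsUsable}} are stand-ins, not the actual metastore API, 
and "valid" here simply means committed and visible to the reader:

{noformat}
// Illustrative only: not Hive's real types.
public class TxnStatsValiditySketch {

  interface ReaderTxnState {
    // true iff the given write ID is committed and visible to the reader
    // (not aborted, not still in progress).
    boolean isWriteIdValid(long writeId);
  }

  // Stats stored with writer write ID statsWriterId are usable by this
  // reader only if that write ID is valid in the reader's snapshot;
  // otherwise the stats are reported back as inaccurate.
  static boolean statsUsable(long statsWriterId, boolean statsMarkedAccurate,
                             ReaderTxnState reader) {
    return statsMarkedAccurate && reader.isWriteIdValid(statsWriterId);
  }
}
{noformat}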



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19416) Create single version transactional table metastore statistics for aggregation queries

2018-07-23 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19416:

Description: 
This adds accurate stats support to Hive transactional (insert-only and full 
ACID) tables, so that some queries from these tables can be answered from stats 
and also so that the stats could be used for more optimization. This support 
can be enabled via a config flag, and is on by default.

This is achieved via the following changes, that basically start us on the path 
of treating ACID stats the same way we treat ACID data:
In addition to existing JSON blob, we store a write ID of the latest stats 
writer with each table and partition. Any writer updating the stats or altering 
the stats state for a txn table has to record his write ID - if the write ID 
and some other context is not provided by the caller of alter, the alter 
table/partition operation fails. It's the responsibility of the writer to not 
commit its transaction if the operation fails.
In future, we'd like to move the stats state into actual stats tables, but for 
now it's (logically) colocated with the existing json parameter.

In addition to its write ID, most callers (with the exception of the ones that 
cannot have races, e.g. create table) provide their own txn state (write ID 
list) for the table. The existing stats' write ID is verified against this 
state. If the write ID is not visible to the updater, we still update the 
stats, but set the stats state to invalid; that basically means that two 
parallel operations that cannot see each others' data output are updating the 
stats.

This way, txn stats stay valid as the result of a sequence of non-conflicting 
updates that can all see each other and account for each others' data for the 
stats. Any parallel updates invalidate the stats.
This is necessary because unlike data, stats are a single version summary of 
the table. To be able to support parallel operations with valid stats, each 
stats update would need to write a separate record, and would also need to 
write mergeable records that only reflect its own changes, instead of the final 
view of the table stats (two of those are hard or impossible to merge).

This approach resulted in a few changes to alter/etc APIs in metastore; it also 
requires that many alter operations, as well as analyze table, allocate a write 
ID (because they affect stats-that-are-treated-like-data, and so are 
essentially a write operation).

The reader, in turn, verifies that the stats are valid and written by a write 
ID that is itself valid given the reader's transactional state (i.e. not 
aborted, nor in progress).

We've considered (and actually implemented) an alternative approach of 
recording the full txn state of the stats writer to be compared with the state 
of the stats reader (to see if they are compatible and avoid the extra write 
IDs and strict write-time checks), however it results in problems with 
partitioned tables, where not all writes affect all partitions, and so the 
stats state of all the untouched partitions becomes invalid once a subset of 
partitions is updated (because we cannot tell whether the write ID, a table 
level operation, didn't touch the partition, or did touch it but didn't record 
the stats). Additionally, storing full txn state for every partition and table 
can be expensive, especially in extreme cases where the watermark doesn't 
advance for a while for some reason.




  was:The system should use only statistics for aggregation queries like count 
on transactional tables.


> Create single version transactional table metastore statistics for 
> aggregation queries
> --
>
> Key: HIVE-19416
> URL: https://issues.apache.org/jira/browse/HIVE-19416
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
>
> This adds accurate stats support to Hive transactional (insert-only and full 
> ACID) tables, so that some queries from these tables can be answered from 
> stats and also so that the stats could be used for more optimization. This 
> support can be enabled via a config flag, and is on by default.
> This is achieved via the following changes, that basically start us on the 
> path of treating ACID stats the same way we treat ACID data:
> In addition to existing JSON blob, we store a write ID of the latest stats 
> writer with each table and partition. Any writer updating the stats or 
> altering the stats state for a txn table has to record his write ID - if the 
> write ID and some other context is not provided by the caller of alter, the 
> alter table/partition operation fails. It's the responsibility of the writer 
> to not commit its 

[jira] [Commented] (HIVE-19937) Intern fields in MapWork on deserialization

2018-07-23 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553538#comment-16553538
 ] 

Sahil Takiar commented on HIVE-19937:
-

Thanks Misha, yes I will keep this in mind. I don't think this part of the code 
is CPU bound, so it shouldn't be an issue.

> Intern fields in MapWork on deserialization
> ---
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch, HIVE-19937.2.patch, 
> HIVE-19937.3.patch, HIVE-19937.4.patch, HIVE-19937.5.patch, 
> post-patch-report.html, report.html
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, this cloning comes at the cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory; we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.
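
A minimal sketch of the interning idea; the pool class and method names below are 
illustrative, not Hive's actual implementation:

{noformat}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative intern pool: identical property values from deserialized
// configurations collapse to a single shared String instance per JVM.
public final class InternPoolSketch {
  private static final Map<String, String> POOL = new ConcurrentHashMap<>();

  public static String intern(String value) {
    if (value == null) {
      return null;
    }
    String existing = POOL.putIfAbsent(value, value);
    return existing != null ? existing : value;
  }

  public static void internAll(Map<String, String> properties) {
    // Re-point every value at the pooled instance; the duplicate copies
    // produced by cloning a JobConf per task become eligible for GC.
    properties.replaceAll((k, v) -> intern(v));
  }
}
{noformat}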



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553536#comment-16553536
 ] 

Hive QA commented on HIVE-20164:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932763/HIVE-20164.8.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14683 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12804/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12804/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12804/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932763 - PreCommit-HIVE-Build

> Murmur Hash : Make sure CTAS and IAS use correct bucketing version
> --
>
> Key: HIVE-20164
> URL: https://issues.apache.org/jira/browse/HIVE-20164
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20164.1.patch, HIVE-20164.2.patch, 
> HIVE-20164.3.patch, HIVE-20164.4.patch, HIVE-20164.5.patch, 
> HIVE-20164.6.patch, HIVE-20164.7.patch, HIVE-20164.8.patch
>
>
> With the migration to Murmur hash, CTAS and IAS from old table version to new 
> table version do not work as intended, and data is hashed using the old hash 
> logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19986) Add logging of runtime statistics indicating when Hdfs Erasure Coding is used by MR

2018-07-23 Thread Andrew Sherman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-19986:
--
Attachment: HIVE-19986.1.patch
Status: Patch Available  (was: Open)

> Add logging of runtime statistics indicating when Hdfs Erasure Coding is used 
> by MR
> ---
>
> Key: HIVE-19986
> URL: https://issues.apache.org/jira/browse/HIVE-19986
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-19986.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19416) Create single version transactional table metastore statistics for aggregation queries

2018-07-23 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553522#comment-16553522
 ] 

Sergey Shelukhin commented on HIVE-19416:
-

[~vgarg] design is code ;)
Let me try to add release notes.

> Create single version transactional table metastore statistics for 
> aggregation queries
> --
>
> Key: HIVE-19416
> URL: https://issues.apache.org/jira/browse/HIVE-19416
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
>
> The system should use only statistics for aggregation queries like count on 
> transactional tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19416) Create single version transactional table metastore statistics for aggregation queries

2018-07-23 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553522#comment-16553522
 ] 

Sergey Shelukhin edited comment on HIVE-19416 at 7/23/18 11:10 PM:
---

[~vgarg] design is code ;)
Let me try to add release notes. We can write something up that is larger than 
this, not sure if it's worth it.


was (Author: sershe):
[~vgarg] design is code ;)
Let me try to add release notes.

> Create single version transactional table metastore statistics for 
> aggregation queries
> --
>
> Key: HIVE-19416
> URL: https://issues.apache.org/jira/browse/HIVE-19416
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
>
> The system should use only statistics for aggregation queries like count on 
> transactional tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19891) inserting into external tables with custom partition directories may cause data loss

2018-07-23 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19891:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!


> inserting into external tables with custom partition directories may cause 
> data loss
> 
>
> Key: HIVE-19891
> URL: https://issues.apache.org/jira/browse/HIVE-19891
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19891.01.patch, HIVE-19891.02.patch, 
> HIVE-19891.03.patch, HIVE-19891.04.patch, HIVE-19891.05.patch, 
> HIVE-19891.06.patch, HIVE-19891.07.patch, HIVE-19891.patch
>
>
> tbl1 is just used as a prop to create data; it could be an existing directory 
> for an external table.
> Due to weird behavior of LoadTableDesc (some ancient code for overriding old 
> partition path), custom partition path is overwritten after the query and the 
> data in it ceases being a part of the table (can be seen in desc formatted 
> output with masking commented out in QTestUtil)
> This affects branch-1 too, so it's pretty old.
> {noformat}drop table tbl1;
> CREATE TABLE tbl1 (index int, value int ) PARTITIONED BY ( created_date 
> string );
> insert into tbl1 partition(created_date='2018-02-01') VALUES (2, 2);
> CREATE external TABLE tbl2 (index int, value int ) PARTITIONED BY ( 
> created_date string );
> ALTER TABLE tbl2 ADD PARTITION(created_date='2018-02-01');
> ALTER TABLE tbl2 PARTITION(created_date='2018-02-01') SET LOCATION 
> 'file:/Users/sergey/git/hivegit/itests/qtest/target/warehouse/tbl1/created_date=2018-02-01';
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> insert into tbl2 partition(created_date='2018-02-01') VALUES (1, 1);
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20227) Exclude glassfish javax.el dependecyn

2018-07-23 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20227:
---
Summary: Exclude glassfish javax.el dependecyn  (was: Exclude glassfish 
javax.el dependecy)

> Exclude glassfish javax.el dependecyn
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20227) Exclude glassfish javax.el dependecy

2018-07-23 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20227:
---
Summary: Exclude glassfish javax.el dependecy  (was: Hive 3.1 rc0 failing 
with snapshot dependency)

> Exclude glassfish javax.el dependecy
> 
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20227) Exclude glassfish javax.el dependency

2018-07-23 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20227:
---
Summary: Exclude glassfish javax.el dependency  (was: Exclude glassfish 
javax.el dependecyn)

> Exclude glassfish javax.el dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20227) Hive 3.1 rc0 failing with snapshot dependency

2018-07-23 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553509#comment-16553509
 ] 

Ashutosh Chauhan commented on HIVE-20227:
-

+1

> Hive 3.1 rc0 failing with snapshot dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553506#comment-16553506
 ] 

Ashutosh Chauhan commented on HIVE-20082:
-

+1

> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
> Key: HIVE-20082
> URL: https://issues.apache.org/jira/browse/HIVE-20082
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20082.1.patch, HIVE-20082.2.patch, 
> HIVE-20082.3.patch, HIVE-20082.4.patch
>
>
> Example: LPAD on a decimal(7,1) values of 0 returns "0" (plus padding) but it 
> should be "0.0" (plus padding)
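
A small illustration of the expected behavior using plain {{java.math.BigDecimal}} 
(not Hive's {{HiveDecimal}} code): a decimal(7,1) zero should render with its 
declared scale, which LPAD would then pad:

{noformat}
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalScaleSketch {
  public static void main(String[] args) {
    // A decimal(7,1) value of 0 carries scale 1, so its string form
    // should be "0.0", not the bare "0".
    BigDecimal zero = BigDecimal.ZERO.setScale(1, RoundingMode.UNNECESSARY);
    System.out.println(zero.toPlainString()); // prints "0.0"
  }
}
{noformat}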



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553505#comment-16553505
 ] 

Hive QA commented on HIVE-20164:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  5m 
50s{color} | {color:blue} ql in master has 2280 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
56s{color} | {color:red} ql: The patch generated 2 new + 39 unchanged - 0 fixed 
= 41 total (was 39) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12804/dev-support/hive-personality.sh
 |
| git revision | master / 5e7aa09 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12804/yetus/diff-checkstyle-ql.txt
 |
| modules | C: itests ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12804/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Murmur Hash : Make sure CTAS and IAS use correct bucketing version
> --
>
> Key: HIVE-20164
> URL: https://issues.apache.org/jira/browse/HIVE-20164
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20164.1.patch, HIVE-20164.2.patch, 
> HIVE-20164.3.patch, HIVE-20164.4.patch, HIVE-20164.5.patch, 
> HIVE-20164.6.patch, HIVE-20164.7.patch, HIVE-20164.8.patch
>
>
> With the migration to Murmur hash, CTAS and IAS from old table version to new 
> table version do not work as intended, and data is hashed using the old hash 
> logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19532) merge master-txnstats branch

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553503#comment-16553503
 ] 

Hive QA commented on HIVE-19532:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
55s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
26s{color} | {color:blue} storage-api in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} common in master has 64 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
29s{color} | {color:blue} standalone-metastore/metastore-common in master has 9 
extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
23s{color} | {color:blue} ql in master has 2280 extant Findbugs warnings. 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
14s{color} | {color:red} metastore-server in master failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 10m 
36s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
44s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} storage-api: The patch generated 1 new + 3 unchanged - 
0 fixed = 4 total (was 3) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
58s{color} | {color:red} root: The patch generated 1 new + 426 unchanged - 0 
fixed = 427 total (was 426) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
23s{color} | {color:red} itests/hcatalog-unit: The patch generated 2 new + 27 
unchanged - 1 fixed = 29 total (was 28) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 
39s{color} | {color:red} ql: The patch generated 45 new + 2684 unchanged - 18 
fixed = 2729 total (was 2702) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
2s{color} | {color:red} The patch has 419 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
36s{color} | {color:red} patch/storage-api cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
42s{color} | {color:red} patch/common cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
52s{color} | {color:red} patch/standalone-metastore/metastore-common cannot run 
setBugDatabaseInfo from findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
48s{color} | {color:red} patch/itests/hive-unit cannot run setBugDatabaseInfo 
from findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  6m 
17s{color} | {color:red} patch/ql cannot run setBugDatabaseInfo from findbugs 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
26s{color} | {color:red} metastore-server 

[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-23 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553504#comment-16553504
 ] 

Sahil Takiar commented on HIVE-17684:
-

[~mi...@cloudera.com] I would like to re-visit this patch now that Hive has 
upgraded to Hadoop 3.1.0. This is still an issue for HoS. If you are still 
interested in moving this patch forward, can you re-base it and re-attach an 
updated patch so we get a new run of Hive QA? If you are too busy, let me know 
and I can assign it to myself.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20227) Hive 3.1 rc0 failing with snapshot dependency

2018-07-23 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553502#comment-16553502
 ] 

Vineet Garg commented on HIVE-20227:


Ran TestHBaseCliDriver locally to confirm that there are no failures.

> Hive 3.1 rc0 failing with snapshot dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553489#comment-16553489
 ] 

Jason Dere commented on HIVE-20082:
---

Ok, I think I've updated the patch correctly

> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
> Key: HIVE-20082
> URL: https://issues.apache.org/jira/browse/HIVE-20082
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20082.1.patch, HIVE-20082.2.patch, 
> HIVE-20082.3.patch, HIVE-20082.4.patch
>
>
> Example: LPAD on a decimal(7,1) values of 0 returns "0" (plus padding) but it 
> should be "0.0" (plus padding)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Jason Dere (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-20082:
--
Attachment: HIVE-20082.4.patch

> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
> Key: HIVE-20082
> URL: https://issues.apache.org/jira/browse/HIVE-20082
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20082.1.patch, HIVE-20082.2.patch, 
> HIVE-20082.3.patch, HIVE-20082.4.patch
>
>
> Example: LPAD on a decimal(7,1) value of 0 returns "0" (plus padding) but it 
> should be "0.0" (plus padding)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Jason Dere (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-20082:
--
Attachment: (was: HIVE-20082.4.patch)

> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
> Key: HIVE-20082
> URL: https://issues.apache.org/jira/browse/HIVE-20082
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20082.1.patch, HIVE-20082.2.patch, 
> HIVE-20082.3.patch, HIVE-20082.4.patch
>
>
> Example: LPAD on a decimal(7,1) value of 0 returns "0" (plus padding) but it 
> should be "0.0" (plus padding)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20226) query.setRange(0, maxEvents) in ObjectStore will throw exception when maxEvents exceed 50M

2018-07-23 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan reassigned HIVE-20226:


Assignee: Alice Fan

> query.setRange(0, maxEvents) in ObjectStore will throw exception when 
> maxEvents exceed 50M
> --
>
> Key: HIVE-20226
> URL: https://issues.apache.org/jira/browse/HIVE-20226
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Alice Fan
>Assignee: Alice Fan
>Priority: Major
>
> query.setRange(0, maxEvents) in ObjectStore will throw an exception when 
> maxEvents exceeds 50M:
> java.sql.SQLException: setMaxRows() out of range. 2147483647 > 50000000.
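
A minimal sketch of one possible guard, assuming a hypothetical cap constant and candidate class name (this is not the actual ObjectStore fix):

{code:java}
import javax.jdo.PersistenceManager;
import javax.jdo.Query;

public class EventQuerySketch {
  // Hypothetical cap kept below the JDBC driver's setMaxRows() limit.
  private static final long MAX_EVENT_RANGE = 50_000_000L;

  static Query buildEventQuery(PersistenceManager pm, long requestedMaxEvents) {
    // Hypothetical candidate class name; the real metastore queries its own model classes.
    Query query = pm.newQuery("select from org.example.NotificationEvent");
    // Clamp the requested range so the underlying setMaxRows() call stays in bounds.
    long maxEvents = (requestedMaxEvents <= 0 || requestedMaxEvents > MAX_EVENT_RANGE)
        ? MAX_EVENT_RANGE : requestedMaxEvents;
    query.setRange(0, maxEvents);
    return query;
  }
}
{code}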



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20032) Don't serialize hashCode for repartitionAndSortWithinPartitions

2018-07-23 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553485#comment-16553485
 ] 

Sahil Takiar commented on HIVE-20032:
-

[~lirui] HoS tests are passing now, and I've managed to preserve the Kryo 
shading. Created an RB: https://reviews.apache.org/r/68026/ - could you take a 
look?

> Don't serialize hashCode for repartitionAndSortWithinPartitions
> ---
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch, HIVE-20032.5.patch, 
> HIVE-20032.6.patch, HIVE-20032.7.patch, HIVE-20032.8.patch
>
>
> Follow up on HIVE-15104: if we don't enable RDD caching or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553486#comment-16553486
 ] 

Jason Dere commented on HIVE-20082:
---

Oh, if you mean I accidentally included the changes from HIVE-20204 as well, 
that might be the case, let me check.

> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
> Key: HIVE-20082
> URL: https://issues.apache.org/jira/browse/HIVE-20082
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20082.1.patch, HIVE-20082.2.patch, 
> HIVE-20082.3.patch, HIVE-20082.4.patch
>
>
> Example: LPAD on a decimal(7,1) value of 0 returns "0" (plus padding) but it 
> should be "0.0" (plus padding)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19532) merge master-txnstats branch

2018-07-23 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19532:

Attachment: HIVE-19532.25.patch

> merge master-txnstats branch
> 
>
> Key: HIVE-19532
> URL: https://issues.apache.org/jira/browse/HIVE-19532
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, 
> HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, 
> HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.07.patch, 
> HIVE-19532.08.patch, HIVE-19532.09.patch, HIVE-19532.10.patch, 
> HIVE-19532.11.patch, HIVE-19532.12.patch, HIVE-19532.13.patch, 
> HIVE-19532.14.patch, HIVE-19532.15.patch, HIVE-19532.16.patch, 
> HIVE-19532.19.patch, HIVE-19532.23.patch, HIVE-19532.24.patch, 
> HIVE-19532.25.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553484#comment-16553484
 ] 

Jason Dere commented on HIVE-20082:
---

Which differences are you referring to? I think any changes in this patch have 
to do with the change in decimal-to-string conversion, so either extra 
0-padding or a change in the string cast function name.

> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
> Key: HIVE-20082
> URL: https://issues.apache.org/jira/browse/HIVE-20082
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20082.1.patch, HIVE-20082.2.patch, 
> HIVE-20082.3.patch, HIVE-20082.4.patch
>
>
> Example: LPAD on a decimal(7,1) value of 0 returns "0" (plus padding) but it 
> should be "0.0" (plus padding)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19532) merge master-txnstats branch

2018-07-23 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19532:

Attachment: (was: HIVE-19532.22.patch)

> merge master-txnstats branch
> 
>
> Key: HIVE-19532
> URL: https://issues.apache.org/jira/browse/HIVE-19532
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, 
> HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, 
> HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.07.patch, 
> HIVE-19532.08.patch, HIVE-19532.09.patch, HIVE-19532.10.patch, 
> HIVE-19532.11.patch, HIVE-19532.12.patch, HIVE-19532.13.patch, 
> HIVE-19532.14.patch, HIVE-19532.15.patch, HIVE-19532.16.patch, 
> HIVE-19532.19.patch, HIVE-19532.23.patch, HIVE-19532.24.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20032) Don't serialize hashCode for repartitionAndSortWithinPartitions

2018-07-23 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-20032:

Summary: Don't serialize hashCode for repartitionAndSortWithinPartitions  
(was: Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled)

> Don't serialize hashCode for repartitionAndSortWithinPartitions
> ---
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch, HIVE-20032.5.patch, 
> HIVE-20032.6.patch, HIVE-20032.7.patch, HIVE-20032.8.patch
>
>
> Follow up on HIVE-15104: if we don't enable RDD caching or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20227) Hive 3.1 rc0 failing with snapshot dependency

2018-07-23 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553480#comment-16553480
 ] 

Vineet Garg commented on HIVE-20227:


[~ashutoshc] Can you take a look?

> Hive 3.1 rc0 failing with snapshot dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20227) Hive 3.1 rc0 failing with snapshot dependency

2018-07-23 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553479#comment-16553479
 ] 

Vineet Garg commented on HIVE-20227:


Since Hive doesn't seem to require glassfish:javax.el, it could be excluded.

> Hive 3.1 rc0 failing with snapshot dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553477#comment-16553477
 ] 

Ashutosh Chauhan commented on HIVE-20082:
-

Looks like the latest patch contains diffs from HIVE-20204. Incorrectly generated 
patch?

> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
> Key: HIVE-20082
> URL: https://issues.apache.org/jira/browse/HIVE-20082
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20082.1.patch, HIVE-20082.2.patch, 
> HIVE-20082.3.patch, HIVE-20082.4.patch
>
>
> Example: LPAD on a decimal(7,1) value of 0 returns "0" (plus padding) but it 
> should be "0.0" (plus padding)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20227) Hive 3.1 rc0 failing with snapshot dependency

2018-07-23 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20227:
---
Attachment: HIVE-20227.branch-3.1-1.patch

> Hive 3.1 rc0 failing with snapshot dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20227) Hive 3.1 rc0 failing with snapshot dependency

2018-07-23 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20227:
---
Description: 
INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
hive-llap-server ---

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
failed with message:

Release builds are not allowed to have SNAPSHOT depenendencies

Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT


> Hive 3.1 rc0 failing with snapshot dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20227.branch-3.1-1.patch
>
>
> INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @
> hive-llap-server ---
> [WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireReleaseDeps
> failed with message:
> Release builds are not allowed to have SNAPSHOT depenendencies
> Found Banned Dependency: org.glassfish:javax.el:jar:3.0.1-b11-SNAPSHOT



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20227) Hive 3.1 rc0 failing with snapshot dependency

2018-07-23 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-20227:
--


> Hive 3.1 rc0 failing with snapshot dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20227) Hive 3.1 rc0 failing with snapshot dependency

2018-07-23 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20227:
---
Component/s: Hive

> Hive 3.1 rc0 failing with snapshot dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20227) Hive 3.1 rc0 failing with snapshot dependency

2018-07-23 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20227:
---
Affects Version/s: 3.1.0

> Hive 3.1 rc0 failing with snapshot dependency
> -
>
> Key: HIVE-20227
> URL: https://issues.apache.org/jira/browse/HIVE-20227
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19532) merge master-txnstats branch

2018-07-23 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553464#comment-16553464
 ] 

Sergey Shelukhin commented on HIVE-19532:
-

The changes to this test are cosmetic (the incorrect-result diffs are gone). I 
will update it.
[~hagleitn] [~ekoifman] can I get two committer reviews for the branch merge? 
I'm assuming my +1 counts as the third, given that the code was a collective effort.

> merge master-txnstats branch
> 
>
> Key: HIVE-19532
> URL: https://issues.apache.org/jira/browse/HIVE-19532
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, 
> HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, 
> HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.07.patch, 
> HIVE-19532.08.patch, HIVE-19532.09.patch, HIVE-19532.10.patch, 
> HIVE-19532.11.patch, HIVE-19532.12.patch, HIVE-19532.13.patch, 
> HIVE-19532.14.patch, HIVE-19532.15.patch, HIVE-19532.16.patch, 
> HIVE-19532.19.patch, HIVE-19532.22.patch, HIVE-19532.23.patch, 
> HIVE-19532.24.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19532) merge master-txnstats branch

2018-07-23 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553451#comment-16553451
 ] 

Hive QA commented on HIVE-19532:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12932757/HIVE-19532.24.patch

{color:green}SUCCESS:{color} +1 due to 26 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14701 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets]
 (batchId=173)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12803/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12803/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12803/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12932757 - PreCommit-HIVE-Build

> merge master-txnstats branch
> 
>
> Key: HIVE-19532
> URL: https://issues.apache.org/jira/browse/HIVE-19532
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19532.01.patch, HIVE-19532.01.prepatch, 
> HIVE-19532.02.patch, HIVE-19532.02.prepatch, HIVE-19532.03.patch, 
> HIVE-19532.04.patch, HIVE-19532.05.patch, HIVE-19532.07.patch, 
> HIVE-19532.08.patch, HIVE-19532.09.patch, HIVE-19532.10.patch, 
> HIVE-19532.11.patch, HIVE-19532.12.patch, HIVE-19532.13.patch, 
> HIVE-19532.14.patch, HIVE-19532.15.patch, HIVE-19532.16.patch, 
> HIVE-19532.19.patch, HIVE-19532.22.patch, HIVE-19532.23.patch, 
> HIVE-19532.24.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20169) Print Final Rows Processed in MapOperator

2018-07-23 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553447#comment-16553447
 ] 

BELUGA BEHR commented on HIVE-20169:


[~bharos92] Can you please update the logging to print the value of the 'abort' 
flag as well? Sorry.  Thanks.

> Print Final Rows Processed in MapOperator
> -
>
> Key: HIVE-20169
> URL: https://issues.apache.org/jira/browse/HIVE-20169
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20169.1.patch, HIVE-20169.2.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java#L573-L582
> This class emits a log message every time a certain number of records has been 
> processed, but it does not print a final count.
> Override the {{MapOperator}} class's {{closeOp}} method to print a final log 
> message providing the total number of rows read by this mapper.
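
A rough sketch of what the requested change could look like (the class, field, and method names below are simplified stand-ins for MapOperator internals), including the 'abort' flag mentioned in the comment above:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Simplified stand-in for the operator; field and method names are illustrative only.
public class RowCountingMapSketch {
  private static final Logger LOG = LoggerFactory.getLogger(RowCountingMapSketch.class);

  private long numRows = 0;

  public void process(Object row) {
    numRows++;
    // ... the existing periodic "records read" logging would stay here ...
  }

  // Analogous to overriding closeOp(boolean abort): emit one final count when the
  // operator shuts down, using parameterized logging so no string is built when
  // INFO is disabled.
  public void closeOp(boolean abort) {
    LOG.info("{}: records read - {}, abort - {}", getClass().getSimpleName(), numRows, abort);
  }
}
{code}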



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20082) HiveDecimal to string conversion doesn't format the decimal correctly - master

2018-07-23 Thread Jason Dere (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-20082:
--
Attachment: HIVE-20082.4.patch

> HiveDecimal to string conversion doesn't format the decimal correctly - master
> --
>
> Key: HIVE-20082
> URL: https://issues.apache.org/jira/browse/HIVE-20082
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-20082.1.patch, HIVE-20082.2.patch, 
> HIVE-20082.3.patch, HIVE-20082.4.patch
>
>
> Example: LPAD on a decimal(7,1) value of 0 returns "0" (plus padding) but it 
> should be "0.0" (plus padding)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20169) Print Final Rows Processed in MapOperator

2018-07-23 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553420#comment-16553420
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20169:
-

Updating the logger with parameterized logging.

> Print Final Rows Processed in MapOperator
> -
>
> Key: HIVE-20169
> URL: https://issues.apache.org/jira/browse/HIVE-20169
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20169.1.patch, HIVE-20169.2.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java#L573-L582
> This class emits a log message every time a certain number of records has been 
> processed, but it does not print a final count.
> Override the {{MapOperator}} class's {{closeOp}} method to print a final log 
> message providing the total number of rows read by this mapper.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20169) Print Final Rows Processed in MapOperator

2018-07-23 Thread Bharathkrishna Guruvayoor Murali (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-20169:

Attachment: HIVE-20169.2.patch

> Print Final Rows Processed in MapOperator
> -
>
> Key: HIVE-20169
> URL: https://issues.apache.org/jira/browse/HIVE-20169
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20169.1.patch, HIVE-20169.2.patch
>
>
> https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java#L573-L582
> This class emits a log message every time a certain number of records has been 
> processed, but it does not print a final count.
> Override the {{MapOperator}} class's {{closeOp}} method to print a final log 
> message providing the total number of rows read by this mapper.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20225) SerDe to support Teradata Binary Format

2018-07-23 Thread Lu Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Li reassigned HIVE-20225:



> SerDe to support Teradata Binary Format
> ---
>
> Key: HIVE-20225
> URL: https://issues.apache.org/jira/browse/HIVE-20225
> Project: Hive
>  Issue Type: New Feature
>  Components: Serializers/Deserializers
>Reporter: Lu Li
>Assignee: Lu Li
>Priority: Major
>
> When using TPT/BTEQ to export data from Teradata, Teradata exports binary 
> files based on the table schema.
> A customized SerDe is needed in order to read these files directly from Hive.
> {code:java}
> CREATE EXTERNAL TABLE `TABLE1`(
> ...)
> PARTITIONED BY (
> ...)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.contrib.serde2.TeradataBinarySerde'
> STORED AS INPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileInputFormat'
> OUTPUTFORMAT
>  
> 'org.apache.hadoop.hive.contrib.fileformat.teradata.TeradataBinaryFileOutputFormat'
> LOCATION ...;
> SELECT * FROM `TABLE1`;{code}
> Problem Statement:
> Right now the fastest way to export data from Teradata is TPT. However, Hive 
> cannot directly consume these exported binary files because it doesn't have a 
> SerDe for them.
> Result:
> With this SerDe, Hive can operate on the exported Teradata Binary Format files 
> transparently.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18929) The method humanReadableInt in HiveStringUtils.java has a race condition.

2018-07-23 Thread Andrew Sherman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553414#comment-16553414
 ] 

Andrew Sherman commented on HIVE-18929:
---

Thanks [~aihuaxu], tests passed OK. Can you please push to master when you have 
time?

> The method humanReadableInt in HiveStringUtils.java has a race condition.
> -
>
> Key: HIVE-18929
> URL: https://issues.apache.org/jira/browse/HIVE-18929
> Project: Hive
>  Issue Type: Bug
>  Components: API
>Affects Versions: 2.3.2
>Reporter: Chaiyong Ragkhitwetsagul
>Assignee: Andrew Sherman
>Priority: Minor
> Attachments: HIVE-18929.1.patch
>
>
> I found that the {{humanReadableInt(long number)}} method in the 
> hive/common/src/java/org/apache/hive/common/util/HiveStringUtils.java file 
> has a race condition, the same one reported for Hadoop in HADOOP-9252 
> (https://issues.apache.org/jira/browse/HADOOP-9252); the fix can also be seen 
> in the Hadoop code base.
> I couldn't find a call to this method anywhere else in the code, but it might 
> be worth fixing anyway.
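
For context, a sketch of the general fix pattern used in HADOOP-9252 (not the actual Hive patch): the shared DecimalFormat behind such helpers is not thread-safe, so it can be wrapped in a ThreadLocal (or replaced with a per-call String.format):

{code:java}
import java.text.DecimalFormat;

public class HumanReadableIntSketch {
  // DecimalFormat is not thread-safe, so a single shared instance (as in the
  // original helper) can be corrupted by concurrent calls. A ThreadLocal gives
  // each thread its own formatter without changing the output.
  private static final ThreadLocal<DecimalFormat> ONE_DECIMAL =
      ThreadLocal.withInitial(() -> new DecimalFormat("0.0"));

  public static String humanReadableInt(long number) {
    long absNumber = Math.abs(number);
    if (absNumber < 1024) {
      return String.valueOf(number);
    }
    double result;
    String suffix;
    if (absNumber >= 1024L * 1024L * 1024L) {
      result = number / (1024.0 * 1024.0 * 1024.0);
      suffix = "g";
    } else if (absNumber >= 1024L * 1024L) {
      result = number / (1024.0 * 1024.0);
      suffix = "m";
    } else {
      result = number / 1024.0;
      suffix = "k";
    }
    return ONE_DECIMAL.get().format(result) + suffix;
  }
}
{code}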



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20168) ReduceSinkOperator Logging Hidden

2018-07-23 Thread Bharathkrishna Guruvayoor Murali (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553412#comment-16553412
 ] 

Bharathkrishna Guruvayoor Murali commented on HIVE-20168:
-

Updated the patch with parameterized logging.

> ReduceSinkOperator Logging Hidden
> -
>
> Key: HIVE-20168
> URL: https://issues.apache.org/jira/browse/HIVE-20168
> Project: Hive
>  Issue Type: Bug
>  Components: Operators
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HIVE-20168.1.patch, HIVE-20168.2.patch
>
>
> [https://github.com/apache/hive/blob/ac6b2a3fb195916e22b2e5f465add2ffbcdc7430/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java]
>  
> {code:java}
> if (LOG.isTraceEnabled()) {
>   if (numRows == cntr) {
> cntr = logEveryNRows == 0 ? cntr * 10 : numRows + logEveryNRows;
> if (cntr < 0 || numRows < 0) {
>   cntr = 0;
>   numRows = 1;
> }
> LOG.info(toString() + ": records written - " + numRows);
>   }
> }
> ...
> if (LOG.isTraceEnabled()) {
>   LOG.info(toString() + ": records written - " + numRows);
> }
> {code}
> There are logging guards here that check for TRACE level, but the statements 
> they guard log at INFO.  This logging is important for detecting data skew. 
> Please change the guards to check for INFO, or preferably remove the guards 
> altogether, since it's very rare for a service to run with only WARN level 
> logging.
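
A sketch of the kind of change being asked for (illustrative class and method names, not the committed patch): either match the guard to the level actually logged, as below, or drop the guard entirely and rely on parameterized logging:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RecordsWrittenLoggingSketch {
  private static final Logger LOG = LoggerFactory.getLogger(RecordsWrittenLoggingSketch.class);

  private long numRows = 0;
  private long cntr = 1;
  private long logEveryNRows = 0;

  void onRowWritten(String operatorId) {
    numRows++;
    if (numRows == cntr) {
      cntr = logEveryNRows == 0 ? cntr * 10 : numRows + logEveryNRows;
      if (cntr < 0 || numRows < 0) {
        cntr = 0;
        numRows = 1;
      }
      // Guard now matches the level actually logged; parameterized logging avoids
      // building the message string when INFO is disabled.
      if (LOG.isInfoEnabled()) {
        LOG.info("{}: records written - {}", operatorId, numRows);
      }
    }
  }
}
{code}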



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

