date:20180712

[jira] [Commented] (HIVE-19829) Incremental replication load should create tasks in execution phase rather than semantic phase

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542554#comment-16542554
 ] 

Hive QA commented on HIVE-19829:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931333/HIVE-19829.11-branch-3.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 96 failed/errored test(s), 14400 tests 
executed
*Failed tests:*
{noformat}
TestAddPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestAddPartitionsFromPartSpec - did not produce a TEST-*.xml file (likely timed 
out) (batchId=215)
TestAdminUser - did not produce a TEST-*.xml file (likely timed out) 
(batchId=221)
TestAggregateStatsCache - did not produce a TEST-*.xml file (likely timed out) 
(batchId=215)
TestAlterPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=215)
TestAppendPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=215)
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=271)
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=221)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) 
(batchId=221)
TestCatalogNonDefaultClient - did not produce a TEST-*.xml file (likely timed 
out) (batchId=213)
TestCatalogNonDefaultSvr - did not produce a TEST-*.xml file (likely timed out) 
(batchId=221)
TestCatalogOldClient - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestCatalogs - did not produce a TEST-*.xml file (likely timed out) 
(batchId=215)
TestCheckConstraint - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=223)
TestDatabases - did not produce a TEST-*.xml file (likely timed out) 
(batchId=215)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=221)
TestDefaultConstraint - did not produce a TEST-*.xml file (likely timed out) 
(batchId=215)
TestDropPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=271)
TestEmbeddedHiveMetaStore - did not produce a TEST-*.xml file (likely timed 
out) (batchId=216)
TestExchangePartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=215)
TestFMSketchSerialization - did not produce a TEST-*.xml file (likely timed 
out) (batchId=223)
TestFilterHooks - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestForeignKey - did not produce a TEST-*.xml file (likely timed out) 
(batchId=215)
TestFunctions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestGetPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestGetTableMeta - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestHLLNoBias - did not produce a TEST-*.xml file (likely timed out) 
(batchId=223)
TestHLLSerialization - did not produce a TEST-*.xml file (likely timed out) 
(batchId=223)
TestHdfsUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=221)
TestHiveAlterHandler - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed 
out) (batchId=221)
TestHiveMetaStorePartitionSpecs - did not produce a TEST-*.xml file (likely 
timed out) (batchId=215)
TestHiveMetaStoreSchemaMethods - did not produce a TEST-*.xml file (likely 
timed out) (batchId=221)
TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) 
(batchId=223)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=223)
TestHiveMetaStoreWithEnvironmentContext - did not produce a TEST-*.xml file 
(likely timed out) (batchId=218)
TestHiveMetastoreCli - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestHyperLogLog - did not produce a TEST-*.xml file (likely timed out) 
(batchId=223)
TestHyperLogLogDense - did not produce a TEST-*.xml file (likely timed out) 
(batchId=223)
TestHyperLogLogMerge - did not produce a TEST-*.xml file (likely timed out) 
(batchId=223)
TestHyperLogLogSparse - did not produce a TEST-*.xml file (likely timed out) 
(batchId=223)
TestJSONMessageDeserializer - did not produce a TEST-*.xml file (likely timed 
out) (batchId=221)
TestListPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestLockRequestBuilder - did not produce a TEST-*.xml file (likely timed out) 
(batchId=213)
TestMarkPartition - did not produce a TEST-*.xml file (likely timed out) 
(batchId=221)
TestMarkPartitionRemote - did not produce a TEST-*.xml file (likely timed out) 
(batchId=223)
TestMetaStoreConnectionUrlHook - did not produce a TEST-*.xml file

[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Gopal V (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542550#comment-16542550
 ] 

Gopal V commented on HIVE-20153:


From a quick look, it looks like they are hashmaps with 0 items.

{code}
@Override
public void reset(AggregationBuffer agg) throws HiveException {
  ((CountAgg) agg).value = 0;
  ((CountAgg) agg).uniqueObjects = new HashSet();
}
{code}
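
One way to picture why the snippet above costs memory: every reset allocates a fresh set even when no distinct values are ever tracked. The following is only a minimal sketch of lazy allocation, not Hive's code and not necessarily the fix chosen for this issue. `LazyCountAgg`, `add` and `addDistinct` are hypothetical names introduced for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the aggregation buffer; not Hive's actual class. */
class LazyCountAgg {
  long value;
  // Left null until a distinct value is actually recorded.
  Set<Object> uniqueObjects;

  /** Resets the buffer without allocating a new set. */
  void reset() {
    value = 0;
    uniqueObjects = null;
  }

  /** Plain (non-distinct) counting never touches the set. */
  void add() {
    value++;
  }

  /** Counts the object only the first time it is seen, allocating the set on demand. */
  void addDistinct(Object o) {
    if (uniqueObjects == null) {
      uniqueObjects = new HashSet<>();
    }
    if (uniqueObjects.add(o)) {
      value++;
    }
  }
}

class LazyCountAggDemo {
  public static void main(String[] args) {
    LazyCountAgg agg = new LazyCountAgg();
    agg.reset();
    agg.add();
    // No set has been allocated for the non-distinct path.
    System.out.println(agg.uniqueObjects == null);
    agg.addDistinct("a");
    agg.addDistinct("a");
    System.out.println(agg.value);
  }
}
```

With this shape, buffers that only ever see non-distinct aggregation carry a null reference instead of an empty set per buffer.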

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive 2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side, where they worked 
> before in Hive 1. 
> In many queries we have to double the mapper memory settings (in our 
> particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
> makes it hard to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests

2018-07-12 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19820:

Attachment: HIVE-19820.03.patch

> add ACID stats support to background stats updater and fix bunch of edge 
> cases found in SU tests
> 
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, 
> HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, 
> HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, 
> HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, 
> branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, 
> branch-19820.nogen.patch, branch-19820.nogen.patch
>
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and 
> also gets ACID state, then discards it without using it.
> When ACID stats are implemented, ACID state needs to be used to do 
> version-aware valid stats checks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests

2018-07-12 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19820:

Attachment: (was: HIVE-19820.03.patch)

> add ACID stats support to background stats updater and fix bunch of edge 
> cases found in SU tests
> 
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, 
> HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, 
> HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, 
> HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, 
> branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, 
> branch-19820.nogen.patch, branch-19820.nogen.patch
>
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and 
> also gets ACID state, then discards it without using it.
> When ACID stats are implemented, ACID state needs to be used to do 
> version-aware valid stats checks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests

2018-07-12 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542523#comment-16542523
 ] 

Sergey Shelukhin commented on HIVE-19820:
-

Fixed the directsql issue that affects partitioned views.

> add ACID stats support to background stats updater and fix bunch of edge 
> cases found in SU tests
> 
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, 
> HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, 
> HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, 
> HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, 
> branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, 
> branch-19820.nogen.patch, branch-19820.nogen.patch
>
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and 
> also gets ACID state, then discards it without using it.
> When ACID stats are implemented, ACID state needs to be used to do 
> version-aware valid stats checks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19829) Incremental replication load should create tasks in execution phase rather than semantic phase

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542510#comment-16542510
 ] 

Hive QA commented on HIVE-19829:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 13s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-12573/patches/PreCommit-HIVE-Build-12573.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12573/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Incremental replication load should create tasks in execution phase rather 
> than semantic phase
> --
>
> Key: HIVE-19829
> URL: https://issues.apache.org/jira/browse/HIVE-19829
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19829.01.patch, HIVE-19829.02.patch, 
> HIVE-19829.03.patch, HIVE-19829.04.patch, HIVE-19829.06.patch, 
> HIVE-19829.07.patch, HIVE-19829.07.patch, HIVE-19829.08-branch-3.patch, 
> HIVE-19829.08.patch, HIVE-19829.09.patch, HIVE-19829.10-branch-3.patch, 
> HIVE-19829.10.patch, HIVE-19829.11-branch-3.patch
>
>
> Split the incremental load into multiple iterations. In each iteration, create 
> a number of tasks equal to the configured value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20135) Fix incompatible change in TimestampColumnVector to default to UTC

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542509#comment-16542509
 ] 

Hive QA commented on HIVE-20135:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931330/HIVE-20135.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 14650 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=248)
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty (batchId=248)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=250)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12572/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12572/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12572/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931330 - PreCommit-HIVE-Build

> Fix incompatible change in TimestampColumnVector to default to UTC
> --
>
> Key: HIVE-20135
> URL: https://issues.apache.org/jira/browse/HIVE-20135
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
> Fix For: 3.1.0, 4.0.0, storage-2.7.0
>
> Attachments: HIVE-20135.01.patch, HIVE-20135.02.patch, 
> HIVE-20135.patch
>
>
> HIVE-20007 changed the default for TimestampColumnVector to UTC, 
> which breaks API compatibility with storage-api 2.6.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-12 Thread Junjie Chen (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Attachment: HIVE-17593.4.patch

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, 
> HIVE-17593.4.patch, HIVE-17593.patch
>
>
> DataWritableWriter strips spaces for the CHAR type before writing. However, 
> when generating the predicate, it does NOT do the same stripping, which 
> should cause data to be missed!
> In the current version, it doesn't cause missing data since the predicate is 
> not pushed down to Parquet properly due to HIVE-17261.
> Please see ConvertAstToSearchArg.java: getTypes treats CHAR and STRING as 
> the same, which will build a predicate with trailing spaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-12 Thread Junjie Chen (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542455#comment-16542455
 ] 

Junjie Chen commented on HIVE-17593:


The previous unit test failure (vectorized_parquet_types.q) was caused by 
different length UDFs being used for CHAR.  

In non-vectorized mode, GenericUDFLength calculates the length of the column. 
It converts the primitive value to a string using 
PrimitiveObjectInspectorUtils.getString, which ignores trailing spaces for the 
CHAR type.
In vectorized mode, however, StringLength calculates the length of the column. 
It treats the column as a byte array and doesn't consider the column type. 
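
The mismatch described above can be illustrated without any Hive classes. The following is a minimal sketch, assuming a CHAR(5) value stored space-padded; `strippedLength` and `rawLength` are hypothetical helpers standing in for the two code paths, not the actual GenericUDFLength or StringLength implementations.

```java
import java.nio.charset.StandardCharsets;

class CharLengthDemo {
  /** Length after dropping trailing spaces, mimicking the CHAR-aware path. */
  static int strippedLength(String padded) {
    int end = padded.length();
    while (end > 0 && padded.charAt(end - 1) == ' ') {
      end--;
    }
    return end;
  }

  /** Character count over raw UTF-8 bytes, with no knowledge of the column type. */
  static int rawLength(byte[] utf8) {
    int count = 0;
    for (byte b : utf8) {
      // Count every byte that is not a UTF-8 continuation byte (10xxxxxx).
      if ((b & 0xC0) != 0x80) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    String padded = "abc  "; // "abc" padded to five characters
    System.out.println(strippedLength(padded)); // prints 3
    System.out.println(rawLength(padded.getBytes(StandardCharsets.UTF_8))); // prints 5
  }
}
```

The two results differ for the same stored value, which is the kind of divergence a query test comparing the two modes would surface.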

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>
> DataWritableWriter strips spaces for the CHAR type before writing. However, 
> when generating the predicate, it does NOT do the same stripping, which 
> should cause data to be missed!
> In the current version, it doesn't cause missing data since the predicate is 
> not pushed down to Parquet properly due to HIVE-17261.
> Please see ConvertAstToSearchArg.java: getTypes treats CHAR and STRING as 
> the same, which will build a predicate with trailing spaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20135) Fix incompatible change in TimestampColumnVector to default to UTC

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542450#comment-16542450
 ] 

Hive QA commented on HIVE-20135:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
25s{color} | {color:blue} storage-api in master has 48 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12572/dev-support/hive-personality.sh
 |
| git revision | master / 20eb7b5 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: storage-api U: storage-api |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12572/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix incompatible change in TimestampColumnVector to default to UTC
> --
>
> Key: HIVE-20135
> URL: https://issues.apache.org/jira/browse/HIVE-20135
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
> Fix For: 3.1.0, 4.0.0, storage-2.7.0
>
> Attachments: HIVE-20135.01.patch, HIVE-20135.02.patch, 
> HIVE-20135.patch
>
>
> HIVE-20007 changed the default for TimestampColumnVector to UTC, 
> which breaks API compatibility with storage-api 2.6.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20006) Make materializations invalidation cache work with multiple active remote metastores

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542443#comment-16542443
 ] 

Hive QA commented on HIVE-20006:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931331/HIVE-20006.06.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12571/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12571/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12571/

Messages:
{noformat}
 This message was trimmed, see log for full details 
Removing standalone-metastore/src/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 20eb7b5 HIVE-20097 : Convert standalone-metastore to a submodule 
(Alexander Kolbasov reviewed by Vihang Karajgaonkar)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-13 02:54:18.459
+ rm -rf ../yetus_PreCommit-HIVE-Build-12571
+ mkdir ../yetus_PreCommit-HIVE-Build-12571
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12571
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12571/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/MaterializedViewTask.java: 
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java: does 
not exist in index
error: 
a/ql/src/test/queries/clientpositive/materialized_view_create_rewrite_time_window.q:
 does not exist in index
error: a/ql/src/test/results/clientpositive/druid/druidmini_mv.q.out: does not 
exist in index
error: 
a/ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_5.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_time_window.q.out:
 does not exist in index
error: 
a/ql/src/test/results/clientpositive/llap/materialized_view_rewrite_empty.q.out:
 does not exist in index
error: a/standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp: 
does not exist in index
error: a/standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h: 
does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp:
 does not exist in index
error: a/standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp: 
does not exist in index
error: a/standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h: 
does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CreationMetadata.java:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FindSchemasByColsResp.java:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Materialization.java:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SchemaVersion.java:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMFullResourcePlan.java:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMGetAllResourcePlanResponse.java:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMGetTriggersForResourePlanResponse.java:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/WMValidateResourcePlanResponse.java:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php:
 does not exist in index
error: a/standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php: does 
not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote:
 does not exist in index
error: 
a/standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py:
 does not exist in index
error:

[jira] [Commented] (HIVE-18705) Improve HiveMetaStoreClient.dropDatabase

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542441#comment-16542441
 ] 

Hive QA commented on HIVE-18705:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931322/HIVE-18705.9.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12570/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12570/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12570/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-13 02:52:52.708
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12570/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-13 02:52:52.711
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   57dd304..20eb7b5  master -> origin/master
   04ea145..93b9cdd  master-txnstats -> origin/master-txnstats
+ git reset --hard HEAD
HEAD is now at 57dd304 HIVE-20037: Print root cause exception's toString() 
rather than getMessage() (Aihua Xu, reviewed by Sahil Takiar)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 20eb7b5 HIVE-20097 : Convert standalone-metastore to a submodule 
(Alexander Kolbasov reviewed by Vihang Karajgaonkar)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-13 02:52:56.190
+ rm -rf ../yetus_PreCommit-HIVE-Build-12570
+ mkdir ../yetus_PreCommit-HIVE-Build-12570
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12570
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12570/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:
 does not exist in index
error: 
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:
 does not exist in index
error: src/java/org/apache/hadoop/hive/ql/metadata/TableIterable.java: does not 
exist in index
error: src/test/org/apache/hadoop/hive/ql/metadata/TestTableIterable.java: does 
not exist in index
error: src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java: 
does not exist in index
error: src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java: does 
not exist in index
error: src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: 
does not exist in index
error: java/org/apache/hadoop/hive/ql/metadata/TableIterable.java: does not 
exist in index
error: test/org/apache/hadoop/hive/ql/metadata/TestTableIterable.java: does not 
exist in index
error: java/org/apache/hive/service/cli/operation/GetColumnsOperation.java: 
does not exist in index
error: main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java: does not 
exist in index
error: main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: 
does not exist in index
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-12570
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931322 - PreCommit-HIVE-Build

> Improve HiveMetaStoreClient.dropDatabase
> 
>
> Key: HIVE-18705
> URL: https://issues.apache.org/jira/browse/HIVE-18705
> Project: Hive
>  Issue Type: Improvement
>

[jira] [Commented] (HIVE-19486) Discrepancy in HikariCP config naming

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542440#comment-16542440
 ] 

Hive QA commented on HIVE-19486:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931305/HIVE-19486.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14650 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12569/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12569/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12569/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931305 - PreCommit-HIVE-Build

> Discrepancy in HikariCP config naming
> -
>
> Key: HIVE-19486
> URL: https://issues.apache.org/jira/browse/HIVE-19486
> Project: Hive
>  Issue Type: Bug
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-19486.1.patch, HIVE-19486.2.patch
>
>
> HiveConf hive.conf.restricted.list contains "hikari." instead of "hikaricp."
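
To illustrate why the naming matters, here is a minimal sketch that assumes restricted-list entries are matched as key prefixes. The class, the method, and the key "hikaricp.maximumPoolSize" are illustrative only and are not taken from HiveConf:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a prefix-based restricted list; not HiveConf code.
final class RestrictedPrefixSketch {
    static boolean isRestricted(String key, List<String> restrictedPrefixes) {
        for (String prefix : restrictedPrefixes) {
            if (key.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String key = "hikaricp.maximumPoolSize"; // illustrative key name
        // "hikari." is not a prefix of "hikaricp.…": the 7th character differs.
        System.out.println(isRestricted(key, Arrays.asList("hikari.")));   // false
        System.out.println(isRestricted(key, Arrays.asList("hikaricp."))); // true
    }
}
```

Under that assumption, an entry spelled "hikari." would leave "hikaricp."-prefixed settings unrestricted.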



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19486) Discrepancy in HikariCP config naming

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542417#comment-16542417
 ] 

Hive QA commented on HIVE-19486:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
32s{color} | {color:blue} common in master has 64 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
43s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12569/dev-support/hive-personality.sh
 |
| git revision | master / 20eb7b5 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: common itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12569/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Discrepancy in HikariCP config naming
> -
>
> Key: HIVE-19486
> URL: https://issues.apache.org/jira/browse/HIVE-19486
> Project: Hive
>  Issue Type: Bug
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-19486.1.patch, HIVE-19486.2.patch
>
>
> HiveConf hive.conf.restricted.list contains "hikari." instead of "hikaricp."




[jira] [Commented] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled

2018-07-12 Thread Sahil Takiar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542413#comment-16542413
 ] 

Sahil Takiar commented on HIVE-20032:
-

As for benchmarking, I have done a lot of TPC-DS benchmarking, and I don't 
consistently get better performance. However, the amount of shuffled data is 
significantly reduced (as well as the amount of data spilled to disk). My guess 
is that latency doesn't improve much because I'm running my tests on an unloaded 
cluster. However, I expect cluster throughput to be better with this patch 
since fewer I/O resources are being used. I'll need to run some concurrent 
TPC-DS workloads to confirm this though.

> Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled
> -
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch
>
>
> Follow up on HIVE-15104, if we don't enable RDD cacheing or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.




[jira] [Commented] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled

2018-07-12 Thread Sahil Takiar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542409#comment-16542409
 ] 

Sahil Takiar commented on HIVE-20032:
-

[~lirui] thanks for taking a look. I took a closer look at this, and I think 
there might be a way to specify custom serializers just for shuffles. However, 
it requires accessing some lower-level Spark APIs. The idea is that RDD 
operations such as {{sortByKey}} and {{repartitionAndSortWithinPartitions}} 
return a {{ShuffledRDD}}. The {{ShuffledRDD}} object has a method called 
{{setSerializer}} that allows users to set a custom serializer for that RDD. 
Certain RDD APIs such as {{combineByKey}} expose setting a custom serializer 
by invoking the {{ShuffledRDD#setSerializer}} method; however, it doesn't 
look like {{sortByKey}} or {{repartitionAndSortWithinPartitions}} do. I think 
this is probably better than my original approach.

The other issue is that specifying a custom serializer doesn't work with the 
way we currently shade Kryo in {{hive-exec}} (I think you found similar issues 
while working on HIVE-15104). So I had to remove the relocation for Kryo (which 
was added in HIVE-5915). Hopefully that's OK since Spark and Hive use the same 
version of Kryo.

I attached an updated patch (still a WIP) that implements this approach.
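
As a rough illustration of what a shuffle-only serializer could save, here is a toy model that writes a key with and without its hash code. It is only a sketch under stated assumptions: it uses neither Spark's serializer API nor Hive's key class, and every name in it is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Toy model only: not Spark's Serializer and not Hive's key class.
final class ShuffleKeySketch {
    // Writes a length-prefixed key, optionally followed by a 4-byte hash code.
    static byte[] serialize(byte[] keyBytes, int hashCode, boolean includeHashCode) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(keyBytes.length);
            out.write(keyBytes);
            if (includeHashCode) {
                out.writeInt(hashCode);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] key = "k1".getBytes(StandardCharsets.UTF_8);
        int withHash = serialize(key, 42, true).length;
        int withoutHash = serialize(key, 42, false).length;
        System.out.println("bytes saved per record: " + (withHash - withoutHash));
    }
}
```

In this model the saving is a fixed four bytes per shuffled record, so the relative benefit would be larger for small keys.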

> Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled
> -
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch
>
>
> Follow up on HIVE-15104, if we don't enable RDD cacheing or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.




[jira] [Updated] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled

2018-07-12 Thread Sahil Takiar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-20032:

Attachment: HIVE-20032.4.patch

> Don't serialize hashCode when groupByShuffle and RDD cacheing is disabled
> -
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch
>
>
> Follow up on HIVE-15104, if we don't enable RDD cacheing or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.




[jira] [Commented] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests

2018-07-12 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542405#comment-16542405
 ] 

Sergey Shelukhin commented on HIVE-19820:
-

Fixed a bunch more tests and paths. A few tests still fail or have bad result 
changes.
Most out file changes that remain are trivial.

autoColumnStats_10, autoColumnStats_2, stats_analyze_decimal_compare - suspicious 
stats change.
create_or_replace_view - has a very strange error where SQL and ORM have 
different results w.r.t. write ID; need to investigate, probably something stupid.


> add ACID stats support to background stats updater and fix bunch of edge 
> cases found in SU tests
> 
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, 
> HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, 
> HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, 
> HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, 
> branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, 
> branch-19820.nogen.patch, branch-19820.nogen.patch
>
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and 
> also gets ACID state, and discards it without using.
> When ACID stats are implemented, ACID state needs to be used to do 
> version-aware valid stats checks.




[jira] [Updated] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests

2018-07-12 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19820:

Attachment: HIVE-19820.03.patch

> add ACID stats support to background stats updater and fix bunch of edge 
> cases found in SU tests
> 
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, 
> HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, 
> HIVE-19820.03-master-txnstats.patch, HIVE-19820.03.patch, 
> HIVE-19820.04-master-txnstats.patch, HIVE-19820.patch, 
> branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, 
> branch-19820.nogen.patch, branch-19820.nogen.patch
>
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and 
> also gets ACID state, and discards it without using.
> When ACID stats are implemented, ACID state needs to be used to do 
> version-aware valid stats checks.




[jira] [Updated] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests

2018-07-12 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19820:

Attachment: branch-19820.03.nogen.patch

> add ACID stats support to background stats updater and fix bunch of edge 
> cases found in SU tests
> 
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, 
> HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, 
> HIVE-19820.03-master-txnstats.patch, HIVE-19820.04-master-txnstats.patch, 
> HIVE-19820.patch, branch-19820.02.nogen.patch, branch-19820.03.nogen.patch, 
> branch-19820.nogen.patch, branch-19820.nogen.patch
>
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and 
> also gets ACID state, and discards it without using.
> When ACID stats are implemented, ACID state needs to be used to do 
> version-aware valid stats checks.




[jira] [Commented] (HIVE-17852) remove support for list bucketing "stored as directories" in 3.0

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542400#comment-16542400
 ] 

Hive QA commented on HIVE-17852:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
33s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 88m 
 1s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
18s{color} | {color:blue} standalone-metastore in master has 217 extant 
Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
8s{color} | {color:blue} ql in master has 2289 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 10m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
37s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
58s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
58s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 58s{color} 
| {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  5m 
18s{color} | {color:red} standalone-metastore: The patch generated 438 new + 
19074 unchanged - 441 fixed = 19512 total (was 19515) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 27m 
55s{color} | {color:red} ql: The patch generated 699 new + 129209 unchanged - 
788 fixed = 129908 total (was 129997) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 59m 
47s{color} | {color:red} root: The patch generated 1136 new + 246803 unchanged 
- 1228 fixed = 247939 total (was 248031) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 
34s{color} | {color:red} itests/hive-unit: The patch generated 6 new + 11887 
unchanged - 6 fixed = 11893 total (was 11893) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 
15s{color} | {color:red} patch/standalone-metastore cannot run 
setBugDatabaseInfo from findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
23s{color} | {color:red} patch/metastore cannot run setBugDatabaseInfo from 
findbugs {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m  
5s{color} | {color:red} patch/ql cannot run setBugDatabaseInfo from findbugs 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
43s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
55s{color} | {color:red} ql generated 2 new + 98 unchanged - 2 fixed = 100 
total (was 100) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  7m  
9s{color} | {color:red} root generated 2 new + 369 unchanged - 2 fixed = 371 
total (was 371) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}275m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux

[jira] [Commented] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542398#comment-16542398
 ] 

Hive QA commented on HIVE-19924:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931299/HIVE-19924.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 37 failed/errored test(s), 14619 tests 
executed
*Failed tests:*
{noformat}
TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file (likely timed out) 
(batchId=250)
TestJdbcWithMiniHS2ErasureCoding - did not produce a TEST-*.xml file (likely 
timed out) (batchId=250)
TestNoSaslAuth - did not produce a TEST-*.xml file (likely timed out) 
(batchId=250)
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacro 
(batchId=288)
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacroDoesNotExist
 (batchId=288)
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacroExistsDoNotIgnoreErrors
 (batchId=288)
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacroNonExistentWithIfExists
 (batchId=288)
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacroNonExistentWithIfExistsDoNotIgnoreNonExistent
 (batchId=288)
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testOneInputParamters 
(batchId=288)
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testThreeInputParamters
 (batchId=288)
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testTwoInputParamters 
(batchId=288)
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testZeroInputParamters
 (batchId=288)
org.apache.hive.jdbc.TestJdbcDriver2.testGetQueryId (batchId=249)
org.apache.hive.jdbc.TestJdbcDriver2.testReplErrorScenarios (batchId=249)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveAndKill 
(batchId=250)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveBackKill 
(batchId=250)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveConflictKill
 (batchId=250)
org.apache.hive.jdbc.TestTriggersNoTezSessionPool.testTriggerSlowQueryExecutionTime
 (batchId=247)
org.apache.hive.jdbc.TestTriggersNoTezSessionPool.testTriggerTotalLaunchedTasks 
(batchId=247)
org.apache.hive.jdbc.TestTriggersNoTezSessionPool.testTriggerVertexTotalTasks 
(batchId=247)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testMultipleTriggers1 
(batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testMultipleTriggers2 
(batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsMultiInsert
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedFiles
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomReadOps 
(batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerDagRawInputSplitsKill
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerDagTotalTasks 
(batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerDefaultRawInputSplits
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighBytesRead 
(batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerShortQueryElapsedTime
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryElapsedTime
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerSlowQueryExecutionTime
 (batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerTotalTasks 
(batchId=250)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerVertexRawInputSplitsKill
 (batchId=250)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12568/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12568/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12568/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 37 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931299 - PreCommit-HIVE-Build

> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
>

[jira] [Commented] (HIVE-20117) schema changes for txn stats

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542391#comment-16542391
 ] 

Jesus Camacho Rodriguez commented on HIVE-20117:


[~sershe], I think you are referring to HIVE-19027 that landed in 3.1, and its 
clone HIVE-20006 that will land in master? If that is the case, these diffs 
should be fixed soon; HIVE-20006 has not landed yet because I did not get a 
clean QA...

> schema changes for txn stats
> 
>
> Key: HIVE-20117
> URL: https://issues.apache.org/jira/browse/HIVE-20117
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20117.01.patch
>
>





[jira] [Commented] (HIVE-20117) schema changes for txn stats

2018-07-12 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542390#comment-16542390
 ] 

Sergey Shelukhin commented on HIVE-20117:
-

Updated simple patch. As of now the stats tables don't need write ID; the flag 
is in the TBLS table anyway, so we have to check and update that.
We might move it later.

[~vgarg] the 3.0-to-3.1 upgrade script is currently inconsistent between branch-3 
and master (some changes are only on master and I think should be reverted 
given that they are not actually going to be part of 3.1, cc [~sankarh]; some 
are only on branch-3 and must be committed to master together, cc 
[~jcamachorodriguez]).
So, this won't apply to branch-3. I will update the patch once this situation 
is resolved, or feel free to update/commit where necessary for the 3.1 release.

> schema changes for txn stats
> 
>
> Key: HIVE-20117
> URL: https://issues.apache.org/jira/browse/HIVE-20117
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20117.01.patch
>
>





[jira] [Updated] (HIVE-20117) schema changes for txn stats

2018-07-12 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20117:

Attachment: HIVE-20117.01.patch

> schema changes for txn stats
> 
>
> Key: HIVE-20117
> URL: https://issues.apache.org/jira/browse/HIVE-20117
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20117.01.patch
>
>





[jira] [Updated] (HIVE-20117) schema changes for txn stats

2018-07-12 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-20117:

Attachment: (was: HIVE-20117.patch)

> schema changes for txn stats
> 
>
> Key: HIVE-20117
> URL: https://issues.apache.org/jira/browse/HIVE-20117
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-20117.01.patch
>
>





[jira] [Updated] (HIVE-20097) Convert standalone-metastore to a submodule

2018-07-12 Thread Vihang Karajgaonkar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-20097:
---
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Patch merged into master. Thanks [~akolb]!

> Convert standalone-metastore to a submodule
> ---
>
> Key: HIVE-20097
> URL: https://issues.apache.org/jira/browse/HIVE-20097
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Metastore, Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, 
> HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, 
> HIVE-20097.06.patch, HIVE-20097.07.patch
>
>
> This is a subtask to stage HIVE-17751 changes into several smaller phases.
> The first part is moving existing code in hive-standalone-metastore to a 
> sub-module.




[jira] [Commented] (HIVE-19924) Tag distcp jobs run by Repl Load

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542386#comment-16542386
 ] 

Hive QA commented on HIVE-19924:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 58s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 21s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 39s{color} | {color:blue} ql in master has 2289 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 46s{color} | {color:blue} service in master has 48 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 48s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 45s{color} | {color:red} ql: The patch generated 1 new + 55 unchanged - 12 fixed = 56 total (was 67) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 16s{color} | {color:red} service: The patch generated 2 new + 123 unchanged - 0 fixed = 125 total (was 123) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 19s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12568/dev-support/hive-personality.sh |
| git revision | master / 57dd304 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12568/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12568/yetus/diff-checkstyle-service.txt |
| modules | C: ql service itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12568/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Tag distcp jobs run by Repl Load
> 
>
> Key: HIVE-19924
> URL: https://issues.apache.org/jira/browse/HIVE-19924
> Project: Hive
> Issue Type: Task
> Components: repl
> Affects Versions: 3.1.0, 4.0.0
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: DR, replication
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19924.01.patch, HIVE-19924.02.patch
>
>
> Add tags in jobconf for distcp related jobs started by

[jira] [Updated] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version

2018-07-12 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20164:
--
Attachment: HIVE-20164.1.patch

> Murmur Hash : Make sure CTAS and IAS use correct bucketing version
> --
>
> Key: HIVE-20164
> URL: https://issues.apache.org/jira/browse/HIVE-20164
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Priority: Major
> Attachments: HIVE-20164.1.patch
>
>
> With the migration to Murmur hash, CTAS and IAS from old table version to new 
> table version does not work as intended and data is hashed using old hash 
> logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version

2018-07-12 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20164:
--
Status: Patch Available  (was: Open)

> Murmur Hash : Make sure CTAS and IAS use correct bucketing version
> --
>
> Key: HIVE-20164
> URL: https://issues.apache.org/jira/browse/HIVE-20164
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Priority: Major
> Attachments: HIVE-20164.1.patch
>
>
> With the migration to Murmur hash, CTAS and IAS from old table version to new 
> table version does not work as intended and data is hashed using old hash 
> logic.




[jira] [Comment Edited] (HIVE-17683) Annotate Query Plan with locking information

2018-07-12 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542380#comment-16542380
 ] 

Eugene Koifman edited comment on HIVE-17683 at 7/13/18 12:54 AM:
-

[~ikryvenko], sorry, it took a while to get back to this.

Your implementation creates ExplainTask.getJsonLocks() which duplicates a lot 
of the logic in DbTxnManager.acquireLocks().  This is problematic because they 
have to be kept in sync.

Could you refactor it so that they share code?

For example, create a {{LockRequest makeLockRequest(List, 
List)}} and use it in both places?

 

Also, the refactoring in acquireLocks() lost
{noformat}
default:
  throw new IllegalArgumentException(String
  .format("Lock type [%s] for Database.Table [%s.%s] is unknown", lockType, 
t.getDbName(),
  t.getTableName()
  ));{noformat}
This may change how errors are surfaced - not sure it's a good idea.

 

Don't know if it's related to your changes but in explain_locks.q.out

{{explain locks drop table test_explain_locks}}

doesn't acquire any locks - this is odd - I'd expect X lock on the table for a 
drop command.

 

Why did you choose to output the data as JSON?


was (Author: ekoifman):
[~ikryvenko], sorry, it took a while to get back to this.

Your implementation creates ExplainTask.getJsonLocks() which duplicates a lot 
of the logic in DbTxnManager.acquireLocks().  This is problematic because they 
have to be kept in sync.

Could you refactor it so that they share code?

For example, create a {{LockRequest makeLockRequest(List, 
List)}} and use it in both places?

 

Also, the refactoring in acquireLocks() lost
{noformat}
default:
  throw new IllegalArgumentException(String
  .format("Lock type [%s] for Database.Table [%s.%s] is unknown", lockType, 
t.getDbName(),
  t.getTableName()
  ));{noformat}
This may change how errors are surfaced - not sure it's a good idea.

> Annotate Query Plan with locking information
> 
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
> Issue Type: New Feature
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Igor Kryvenko
> Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.




[jira] [Assigned] (HIVE-20164) Murmur Hash : Make sure CTAS and IAS use correct bucketing version

2018-07-12 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-20164:
-


> Murmur Hash : Make sure CTAS and IAS use correct bucketing version
> --
>
> Key: HIVE-20164
> URL: https://issues.apache.org/jira/browse/HIVE-20164
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Priority: Major
>
> With the migration to Murmur hash, CTAS and IAS from old table version to new 
> table version does not work as intended and data is hashed using old hash 
> logic.




[jira] [Commented] (HIVE-17683) Annotate Query Plan with locking information

2018-07-12 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542380#comment-16542380
 ] 

Eugene Koifman commented on HIVE-17683:
---

[~ikryvenko], sorry, it took a while to get back to this.

Your implementation creates ExplainTask.getJsonLocks() which duplicates a lot 
of the logic in DbTxnManager.acquireLocks().  This is problematic because they 
have to be kept in sync.

Could you refactor it so that they share code?

For example, create a {{LockRequest makeLockRequest(List, 
List)}} and use it in both places?

 

Also, the refactoring in acquireLocks() lost
{noformat}
default:
  throw new IllegalArgumentException(String
  .format("Lock type [%s] for Database.Table [%s.%s] is unknown", lockType, 
t.getDbName(),
  t.getTableName()
  ));{noformat}
This may change how errors are surfaced - not sure it's a good idea.
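
The two review points above (one shared helper for both callers, and keeping the fail-fast default branch) can be sketched as follows. This is only an illustration under stated assumptions: every type here (LockType, WriteKind, Output) is a hypothetical stand-in rather than Hive's real class, the lock-type mapping is made up for the example, and makeLockRequest returns plain strings instead of a LockRequest.

```java
import java.util.ArrayList;
import java.util.List;

class SharedLockRequestSketch {
  // Hypothetical stand-ins for illustration; not Hive's real types.
  enum LockType { SHARED_READ, SHARED_WRITE, EXCLUSIVE }
  enum WriteKind { INSERT, UPDATE, DROP, OTHER }

  static final class Output {
    final String db;
    final String table;
    final WriteKind kind;

    Output(String db, String table, WriteKind kind) {
      this.db = db;
      this.table = table;
      this.kind = kind;
    }
  }

  // The one place that decides which lock each entity needs.
  // Inputs are "db.table" names that the query only reads.
  static List<String> makeLockRequest(List<String> inputs, List<Output> outputs) {
    List<String> components = new ArrayList<>();
    for (String table : inputs) {
      components.add(table + ":" + LockType.SHARED_READ);
    }
    for (Output out : outputs) {
      components.add(out.db + "." + out.table + ":" + lockTypeFor(out));
    }
    return components;
  }

  // The mapping is illustrative; the default branch keeps the fail-fast check.
  static LockType lockTypeFor(Output out) {
    switch (out.kind) {
      case INSERT:
        return LockType.SHARED_READ;
      case UPDATE:
        return LockType.SHARED_WRITE;
      case DROP:
        return LockType.EXCLUSIVE;
      default:
        throw new IllegalArgumentException(String.format(
            "Lock type [%s] for Database.Table [%s.%s] is unknown",
            out.kind, out.db, out.table));
    }
  }

  // Execution path: a real lock manager would submit the request here.
  static List<String> acquireLocks(List<String> inputs, List<Output> outputs) {
    return makeLockRequest(inputs, outputs);
  }

  // Explain path: renders the same request instead of acquiring anything.
  static String explainLocks(List<String> inputs, List<Output> outputs) {
    return String.join("\n", makeLockRequest(inputs, outputs));
  }
}
```

Because both paths call the same helper, a change to the lock rules shows up in the explain output and in the acquired locks together, and an unrecognized kind still fails loudly in either path.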

> Annotate Query Plan with locking information
> 
>
> Key: HIVE-17683
> URL: https://issues.apache.org/jira/browse/HIVE-17683
> Project: Hive
> Issue Type: New Feature
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Igor Kryvenko
> Priority: Critical
> Attachments: HIVE-17683.01.patch, HIVE-17683.02.patch
>
>
> Explore if it's possible to add info about what locks will be asked for to 
> the query plan.
> Lock acquisition (for Acid Lock Manager) is done in 
> DbTxnManager.acquireLocks() which is called once the query starts running.  
> Would need to refactor that.




[jira] [Commented] (HIVE-12342) Set default value of hive.optimize.index.filter to true

2018-07-12 Thread Deepak Jaiswal (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-12342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542375#comment-16542375
 ] 

Deepak Jaiswal commented on HIVE-12342:
---

[~ikryvenko] I was looking at the change in ParseContext.java where HashMap is 
converted to LinkedHashMap. Can you please tell me why it was needed?
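
For context on the question above, the practical difference between the two map types is iteration order: HashMap makes no ordering guarantee, while LinkedHashMap iterates in insertion order. A minimal sketch with made-up keys (not taken from ParseContext.java); whether deterministic ordering was the actual motivation for the change is an assumption here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapOrderSketch {
  // Returns the keys in the order the map iterates over them.
  static List<String> keysInIterationOrder(Map<String, Integer> map) {
    return new ArrayList<>(map.keySet());
  }

  // Fills the given map with the same hypothetical keys in a fixed order.
  static Map<String, Integer> fill(Map<String, Integer> map) {
    map.put("zeta", 1);
    map.put("alpha", 2);
    map.put("mid", 3);
    return map;
  }

  public static void main(String[] args) {
    // LinkedHashMap guarantees insertion order when iterating.
    System.out.println(keysInIterationOrder(fill(new LinkedHashMap<>())));
    // HashMap order depends on hash codes and table size; do not rely on it.
    System.out.println(keysInIterationOrder(fill(new HashMap<>())));
  }
}
```

Stable iteration order typically matters when map contents end up in generated plans or golden test output, where a reordering would show up as a spurious diff.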

> Set default value of hive.optimize.index.filter to true
> ---
>
> Key: HIVE-12342
> URL: https://issues.apache.org/jira/browse/HIVE-12342
> Project: Hive
> Issue Type: Task
> Components: Configuration
> Reporter: Ashutosh Chauhan
> Assignee: Igor Kryvenko
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-12342.05.patch, HIVE-12342.06.patch, 
> HIVE-12342.07.patch, HIVE-12342.08.patch, HIVE-12342.09.patch, 
> HIVE-12342.1.patch, HIVE-12342.10.patch, HIVE-12342.11.patch, 
> HIVE-12342.12.patch, HIVE-12342.13.patch, HIVE-12342.14.patch, 
> HIVE-12342.15.patch, HIVE-12342.16.patch, HIVE-12342.17.patch, 
> HIVE-12342.18.patch, HIVE-12342.19.patch, HIVE-12342.2.patch, 
> HIVE-12342.20.patch, HIVE-12342.21.patch, HIVE-12342.22.patch, 
> HIVE-12342.23.patch, HIVE-12342.24.patch, HIVE-12342.3.patch, 
> HIVE-12342.4.patch, HIVE-12342.patch
>
>
> This configuration governs ppd for storage layer. When applicable, it will 
> always help. It should be on by default.




[jira] [Commented] (HIVE-17896) TopNKey: Create a standalone vectorizable TopNKey operator

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542352#comment-16542352
 ] 

Hive QA commented on HIVE-17896:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931291/HIVE-17896.11.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14656 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12567/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12567/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12567/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931291 - PreCommit-HIVE-Build

> TopNKey: Create a standalone vectorizable TopNKey operator
> --
>
> Key: HIVE-17896
> URL: https://issues.apache.org/jira/browse/HIVE-17896
> Project: Hive
> Issue Type: New Feature
> Components: Operators
> Affects Versions: 3.0.0
> Reporter: Gopal V
> Assignee: Teddy Choi
> Priority: Major
> Attachments: HIVE-17896.1.patch, HIVE-17896.10.patch, 
> HIVE-17896.11.patch, HIVE-17896.3.patch, HIVE-17896.4.patch, 
> HIVE-17896.5.patch, HIVE-17896.6.patch, HIVE-17896.7.patch, 
> HIVE-17896.8.patch, HIVE-17896.9.patch
>
>
> For TPC-DS Query27, the TopN operation is delayed by the group-by - the 
> group-by operator buffers up all the rows before discarding the 99% of the 
> rows in the TopN Hash within the ReduceSink Operator.
> The RS TopN operator is very restrictive as it only supports doing the 
> filtering on the shuffle keys, but it is better to do this before breaking 
> the vectors into rows and losing the isRepeating properties.
> Adding a TopN Key operator in the physical operator tree allows the following 
> to happen.
> GBY->RS(Top=1)
> can become 
> TNK(1)->GBY->RS(Top=1)
> So that, the TopNKey can remove rows before they are buffered into the GBY 
> and consume memory.
> Here's the equivalent implementation in Presto
> https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35
> Adding this as a sub-feature of GroupBy prevents further optimizations if the 
> GBY is on keys "a,b,c" and the TopNKey is on just "a".
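
The key-filtering idea in the description above can be illustrated with a small standalone sketch. It is not the Hive operator: it assumes string keys in ascending order, row-at-a-time processing rather than vectorized batches, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class TopNKeySketch {
  private final int topN;
  // The topN smallest distinct keys seen so far, in ascending order.
  private final TreeSet<String> topKeys = new TreeSet<>();

  TopNKeySketch(int topN) {
    if (topN < 1) {
      throw new IllegalArgumentException("topN must be at least 1");
    }
    this.topN = topN;
  }

  // Returns true when a row with this key could still be in the top N and
  // should be forwarded; false when it can be dropped before the group-by.
  boolean canForward(String key) {
    if (topKeys.contains(key)) {
      return true;
    }
    if (topKeys.size() < topN) {
      topKeys.add(key);
      return true;
    }
    if (key.compareTo(topKeys.last()) < 0) {
      topKeys.pollLast(); // evict the current largest key
      topKeys.add(key);
      return true;
    }
    return false;
  }

  // Applies the filter to a stream of keys and returns the forwarded ones.
  static List<String> filter(List<String> keys, int topN) {
    TopNKeySketch tnk = new TopNKeySketch(topN);
    List<String> forwarded = new ArrayList<>();
    for (String key : keys) {
      if (tnk.canForward(key)) {
        forwarded.add(key);
      }
    }
    return forwarded;
  }
}
```

The filter is deliberately conservative: a row forwarded early may belong to a key that is evicted later, so the downstream TopN in the ReduceSink is still needed for the final answer. The filter only reduces how many rows the group-by has to buffer.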




[jira] [Commented] (HIVE-17896) TopNKey: Create a standalone vectorizable TopNKey operator

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542341#comment-16542341
 ] 

Hive QA commented on HIVE-17896:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 30s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m  6s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  3s{color} | {color:blue} common in master has 64 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  8s{color} | {color:blue} serde in master has 194 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  5m  9s{color} | {color:blue} ql in master has 2289 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  7s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 57s{color} | {color:red} ql: The patch generated 35 new + 426 unchanged - 0 fixed = 461 total (was 426) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 58s{color} | {color:red} serde generated 1 new + 194 unchanged - 0 fixed = 195 total (was 194) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 55s{color} | {color:red} ql generated 8 new + 2289 unchanged - 0 fixed = 2297 total (was 2289) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 42m 15s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:serde |
|  |  org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(Object[], ObjectInspector[], Object[], ObjectInspector[], boolean[]) negates the return value of org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(Object, ObjectInspector, Object, ObjectInspector)  At ObjectInspectorUtils.java:negates the return value of org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(Object, ObjectInspector, Object, ObjectInspector)  At ObjectInspectorUtils.java:[line 956] |
| FindBugs | module:ql |
|  |  new org.apache.hadoop.hive.ql.exec.TopNKeyOperator$KeyWrapperComparator(ObjectInspector[], ObjectInspector[], boolean[]) may expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.columnSortOrderIsDesc  At TopNKeyOperator.java:expose internal representation by storing an externally mutable object into TopNKeyOperator$KeyWrapperComparator.columnSortOrderIsDesc  At TopNKeyOperator.java:[line 71] |
|  |  new 
org.apache.hadoop.hive.ql.exec.TopNKeyOperator$KeyWrapperComparator(ObjectInspector[],
 ObjectInspector[], boolean[]) may expose internal representation by storing an 
externally mutable object into 
TopNKeyOperator$KeyWrapperComparator.objectInspectors1  At 
TopNKeyOperator.java:expose internal representation by storing an externally 
mutable object into

[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19360:
---
Attachment: (was: HIVE-19360.6.patch)

> CBO: Add an "optimizedSQL" to QueryPlan object 
> ---
>
> Key: HIVE-19360
> URL: https://issues.apache.org/jira/browse/HIVE-19360
> Project: Hive
> Issue Type: Improvement
> Components: CBO, Diagnosability
> Affects Versions: 3.1.0
> Reporter: Gopal V
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
> Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, 
> HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, HIVE-19360.6.patch
>
>
> Calcite RelNodes can be converted back into SQL (as the new JDBC storage 
> handler does), which allows Hive to print out the post CBO plan as a SQL 
> query instead of having to guess the join orders from the subsequent Tez plan.
> The query generated might not be always valid SQL at this point, but is a 
> world ahead of DAG plans in readability.
> Eg. tpc-ds Query4 CTEs gets expanded to
> {code}
> SELECT t16.$f3 customer_preferred_cust_flag
> FROM
>   (SELECT t0.c_customer_id $f0,
>SUM((t2.ws_ext_list_price - 
> t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / 
> CAST(2 AS DECIMAL(10, 0))) $f8
>FROM
>  (SELECT c_customer_sk,
>  c_customer_id,
>  c_first_name,
>  c_last_name,
>  c_preferred_cust_flag,
>  c_birth_country,
>  c_login,
>  c_email_address
>   FROM default.customer
>   WHERE c_customer_sk IS NOT NULL
> AND c_customer_id IS NOT NULL) t0
>INNER JOIN (
>  (SELECT ws_sold_date_sk,
>  ws_bill_customer_sk,
>  ws_ext_discount_amt,
>  ws_ext_sales_price,
>  ws_ext_wholesale_cost,
>  ws_ext_list_price
>   FROM default.web_sales
>   WHERE ws_bill_customer_sk IS NOT NULL
> AND ws_sold_date_sk IS NOT NULL) t2
>INNER JOIN
>  (SELECT d_date_sk,
>  CAST(2002 AS INTEGER) d_year
>   FROM default.date_dim
>   WHERE d_year = 2002
> AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = 
> t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk
>GROUP BY t0.c_customer_id,
> t0.c_first_name,
> t0.c_last_name,
> t0.c_preferred_cust_flag,
> t0.c_birth_country,
> t0.c_login,
> t0.c_email_address) t7
> INNER JOIN (
>   (SELECT t9.c_customer_id $f0,
>t9.c_preferred_cust_flag $f3,
> 
> SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - 
> t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, 
> 0))) $f8
>FROM
>  (SELECT c_customer_sk,
>  c_customer_id,
>  c_first_name,
>  c_last_name,
>  c_preferred_cust_flag,
>  c_birth_country,
>  c_login,
>  c_email_address
>   FROM default.customer
>   WHERE c_customer_sk IS NOT NULL
> AND c_customer_id IS NOT NULL) t9
>INNER JOIN (
>  (SELECT ss_sold_date_sk,
>  ss_customer_sk,
>  ss_ext_discount_amt,
>  ss_ext_sales_price,
>  ss_ext_wholesale_cost,
>  ss_ext_list_price
>   FROM default.store_sales
>   WHERE ss_customer_sk IS NOT NULL
> AND ss_sold_date_sk IS NOT NULL) t11
>INNER JOIN
>  (SELECT d_date_sk,
>  CAST(2002 AS INTEGER) d_year
>   FROM default.date_dim
>   WHERE d_year = 2002
> AND d_date_sk IS NOT NULL) t13 ON 
> t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk
>GROUP BY t9.c_customer_id,
> t9.c_first_name,
> t9.c_last_name,
> t9.c_preferred_cust_flag,
> t9.c_birth_country,
>

[jira] [Updated] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19360:
---
Attachment: HIVE-19360.6.patch

> CBO: Add an "optimizedSQL" to QueryPlan object 
> ---
>
> Key: HIVE-19360
> URL: https://issues.apache.org/jira/browse/HIVE-19360
> Project: Hive
> Issue Type: Improvement
> Components: CBO, Diagnosability
> Affects Versions: 3.1.0
> Reporter: Gopal V
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
> Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch, 
> HIVE-19360.3.patch, HIVE-19360.4.patch, HIVE-19360.5.patch, HIVE-19360.6.patch
>
>
> Calcite RelNodes can be converted back into SQL (as the new JDBC storage 
> handler does), which allows Hive to print out the post CBO plan as a SQL 
> query instead of having to guess the join orders from the subsequent Tez plan.
> The query generated might not be always valid SQL at this point, but is a 
> world ahead of DAG plans in readability.
> Eg. tpc-ds Query4 CTEs gets expanded to
> {code}
> SELECT t16.$f3 customer_preferred_cust_flag
> FROM
>   (SELECT t0.c_customer_id $f0,
>SUM((t2.ws_ext_list_price - 
> t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / 
> CAST(2 AS DECIMAL(10, 0))) $f8
>FROM
>  (SELECT c_customer_sk,
>  c_customer_id,
>  c_first_name,
>  c_last_name,
>  c_preferred_cust_flag,
>  c_birth_country,
>  c_login,
>  c_email_address
>   FROM default.customer
>   WHERE c_customer_sk IS NOT NULL
> AND c_customer_id IS NOT NULL) t0
>INNER JOIN (
>  (SELECT ws_sold_date_sk,
>  ws_bill_customer_sk,
>  ws_ext_discount_amt,
>  ws_ext_sales_price,
>  ws_ext_wholesale_cost,
>  ws_ext_list_price
>   FROM default.web_sales
>   WHERE ws_bill_customer_sk IS NOT NULL
> AND ws_sold_date_sk IS NOT NULL) t2
>INNER JOIN
>  (SELECT d_date_sk,
>  CAST(2002 AS INTEGER) d_year
>   FROM default.date_dim
>   WHERE d_year = 2002
> AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = 
> t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk
>GROUP BY t0.c_customer_id,
> t0.c_first_name,
> t0.c_last_name,
> t0.c_preferred_cust_flag,
> t0.c_birth_country,
> t0.c_login,
> t0.c_email_address) t7
> INNER JOIN (
>   (SELECT t9.c_customer_id $f0,
>t9.c_preferred_cust_flag $f3,
> 
> SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - 
> t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, 
> 0))) $f8
>FROM
>  (SELECT c_customer_sk,
>  c_customer_id,
>  c_first_name,
>  c_last_name,
>  c_preferred_cust_flag,
>  c_birth_country,
>  c_login,
>  c_email_address
>   FROM default.customer
>   WHERE c_customer_sk IS NOT NULL
> AND c_customer_id IS NOT NULL) t9
>INNER JOIN (
>  (SELECT ss_sold_date_sk,
>  ss_customer_sk,
>  ss_ext_discount_amt,
>  ss_ext_sales_price,
>  ss_ext_wholesale_cost,
>  ss_ext_list_price
>   FROM default.store_sales
>   WHERE ss_customer_sk IS NOT NULL
> AND ss_sold_date_sk IS NOT NULL) t11
>INNER JOIN
>  (SELECT d_date_sk,
>  CAST(2002 AS INTEGER) d_year
>   FROM default.date_dim
>   WHERE d_year = 2002
> AND d_date_sk IS NOT NULL) t13 ON 
> t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk
>GROUP BY t9.c_customer_id,
> t9.c_first_name,
> t9.c_last_name,
> t9.c_preferred_cust_flag,
> t9.c_birth_country,
>

[jira] [Commented] (HIVE-20095) Fix jdbc external table feature

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542339#comment-16542339
 ] 

Jesus Camacho Rodriguez commented on HIVE-20095:


[~daijy], the motivation for that is that we can currently generate complex SQL 
statements for the part of the query that we push to the storage handler 
automatically from Calcite. Hence, the types for that TableScan should come 
from the output of that query rather than from the table schema. I think 
an easy fix would be: if the query has been generated by Hive, we use the 
ResultSet to determine the schema; otherwise we fall back to using the table 
schema, which is what [~msydoron] did.
Since the number of dialects supported from Calcite will be limited, this means 
we should also include a map from the types in those databases to Hive types. We 
can do that later on. Does it make sense?
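
A rough sketch of that fallback, under stated assumptions: the class, method names, and the generatedByHive flag are hypothetical, and the small type map is illustrative rather than a complete per-dialect mapping. In practice the JDBC type codes would come from ResultSetMetaData.getColumnType(i) on the query's ResultSet.

```java
import java.sql.Types;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JdbcSchemaSketch {
  // Illustrative JDBC-to-Hive type map; a real one would be per dialect.
  private static final Map<Integer, String> JDBC_TO_HIVE = new HashMap<>();
  static {
    JDBC_TO_HIVE.put(Types.INTEGER, "int");
    JDBC_TO_HIVE.put(Types.BIGINT, "bigint");
    JDBC_TO_HIVE.put(Types.DOUBLE, "double");
    JDBC_TO_HIVE.put(Types.VARCHAR, "string");
    JDBC_TO_HIVE.put(Types.TIMESTAMP, "timestamp");
  }

  // Translates one java.sql.Types code; unmapped codes fail loudly.
  static String toHiveType(int jdbcType) {
    String hiveType = JDBC_TO_HIVE.get(jdbcType);
    if (hiveType == null) {
      throw new IllegalArgumentException("No Hive mapping for JDBC type " + jdbcType);
    }
    return hiveType;
  }

  // Uses the query output types when Hive generated the query,
  // and the declared table schema otherwise.
  static List<String> resolveColumnTypes(boolean generatedByHive,
      int[] resultSetTypes, List<String> tableSchemaTypes) {
    if (!generatedByHive) {
      return tableSchemaTypes;
    }
    List<String> types = new ArrayList<>();
    for (int jdbcType : resultSetTypes) {
      types.add(toHiveType(jdbcType));
    }
    return types;
  }
}
```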

[~msydoron], there seem to be some test failures still.

> Fix jdbc external table feature
> ---
>
> Key: HIVE-20095
> URL: https://issues.apache.org/jira/browse/HIVE-20095
> Project: Hive
> Issue Type: Bug
> Reporter: Jonathan Doron
> Assignee: Jonathan Doron
> Priority: Major
> Attachments: HIVE-20095.1.patch, HIVE-20095.2.patch, 
> HIVE-20095.3.patch
>
>
> It seems like the committed code for HIVE-19161 
> (7584b3276bebf64aa006eaa162c0a6264d8fcb56) reverted some of HIVE-18423 
> updates, and therefore some of the external table queries are not working 
> correctly.




[jira] [Updated] (HIVE-20097) Convert standalone-metastore to a submodule

2018-07-12 Thread Alexander Kolbasov (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-20097:
--
Attachment: HIVE-20097.07.patch

> Convert standalone-metastore to a submodule
> ---
>
> Key: HIVE-20097
> URL: https://issues.apache.org/jira/browse/HIVE-20097
> Project: Hive
> Issue Type: Sub-task
> Components: Hive, Metastore, Standalone Metastore
> Affects Versions: 3.1.0, 4.0.0
> Reporter: Alexander Kolbasov
> Assignee: Alexander Kolbasov
> Priority: Major
> Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, 
> HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, 
> HIVE-20097.06.patch, HIVE-20097.07.patch
>
>
> This is a subtask to stage HIVE-17751 changes into several smaller phases.
> The first part is moving existing code in hive-standalone-metastore to a 
> sub-module.




[jira] [Commented] (HIVE-20097) Convert standalone-metastore to a submodule

2018-07-12 Thread Alexander Kolbasov (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542331#comment-16542331
 ] 

Alexander Kolbasov commented on HIVE-20097:
---

Looks like findbugs wasn't happy because it couldn't find findbugs-exclude.xml 
file - added it under standalonemetastore-metastore-common/findbugs.

Patch 7 contains the change.

> Convert standalone-metastore to a submodule
> ---
>
> Key: HIVE-20097
> URL: https://issues.apache.org/jira/browse/HIVE-20097
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Metastore, Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>Priority: Major
> Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, 
> HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, 
> HIVE-20097.06.patch, HIVE-20097.07.patch
>
>
> This is a subtask to stage HIVE-17751 changes into several smaller phases.
> The first part is moving existing code in hive-standalone-metastore to a 
> sub-module.




[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18038:
---
Status: Patch Available  (was: Open)

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, 
> HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, 
> HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch, HIVE-18038.9.patch
>
>
> Simplifications, improve readability




[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18038:
---
Attachment: HIVE-18038.9.patch

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, 
> HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, 
> HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch, HIVE-18038.9.patch
>
>
> Simplifications, improve readability




[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18038:
---
Status: Open  (was: Patch Available)

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, 
> HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, 
> HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch
>
>
> Simplifications, improve readability




[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18038:
---
Attachment: (was: HIVE-18038.9.patch)

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, 
> HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, 
> HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch
>
>
> Simplifications, improve readability




[jira] [Updated] (HIVE-20163) Simplify StringSubstrColStart Initialization

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20163:
---
Status: Patch Available  (was: Open)

> Simplify StringSubstrColStart Initialization
> 
>
> Key: HIVE-20163
> URL: https://issues.apache.org/jira/browse/HIVE-20163
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-20163.1.patch
>
>
> * Remove code
> * Remove exception handling
> * Remove {{printStackTrace}} call




[jira] [Updated] (HIVE-20163) Simplify StringSubstrColStart Initialization

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20163:
---
Attachment: HIVE-20163.1.patch

> Simplify StringSubstrColStart Initialization
> 
>
> Key: HIVE-20163
> URL: https://issues.apache.org/jira/browse/HIVE-20163
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-20163.1.patch
>
>
> * Remove code
> * Remove exception handling
> * Remove {{printStackTrace}} call




[jira] [Assigned] (HIVE-20163) Simplify StringSubstrColStart Initialization

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned HIVE-20163:
--

Assignee: BELUGA BEHR

> Simplify StringSubstrColStart Initialization
> 
>
> Key: HIVE-20163
> URL: https://issues.apache.org/jira/browse/HIVE-20163
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-20163.1.patch
>
>
> * Remove code
> * Remove exception handling
> * Remove {{printStackTrace}} call




[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18038:
---
Status: Patch Available  (was: In Progress)

Yup. This is getting embarrassing. HA.

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, 
> HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, 
> HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch, HIVE-18038.9.patch
>
>
> Simplifications, improve readability




[jira] [Updated] (HIVE-18038) org.apache.hadoop.hive.ql.session.OperationLog - Review

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-18038:
---
Attachment: HIVE-18038.9.patch

> org.apache.hadoop.hive.ql.session.OperationLog - Review
> ---
>
> Key: HIVE-18038
> URL: https://issues.apache.org/jira/browse/HIVE-18038
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18038.1.patch, HIVE-18038.2.patch, 
> HIVE-18038.3.patch, HIVE-18038.4.patch, HIVE-18038.5.patch, 
> HIVE-18038.6.patch, HIVE-18038.7.patch, HIVE-18038.8.patch, HIVE-18038.9.patch
>
>
> Simplifications, improve readability




[jira] [Updated] (HIVE-19940) Push predicates with deterministic UDFs with RBO

2018-07-12 Thread Janaki Lahorani (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-19940:
---
Attachment: HIVE-19940.4.patch

> Push predicates with deterministic UDFs with RBO
> 
>
> Key: HIVE-19940
> URL: https://issues.apache.org/jira/browse/HIVE-19940
> Project: Hive
>  Issue Type: Improvement
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
>Priority: Major
> Attachments: HIVE-19940.1.patch, HIVE-19940.2.patch, 
> HIVE-19940.3.patch, HIVE-19940.4.patch
>
>
> With RBO, predicates with any UDF don't get pushed down.  It makes sense not to 
> push down predicates with a non-deterministic function, as the meaning of 
> the query changes after the predicate is resolved to use the function.  But 
> pushing down a predicate with a deterministic function is beneficial.
> Test Case:
> {code}
> set hive.cbo.enable=false;
> CREATE TABLE `testb`(
>`cola` string COMMENT '',
>`colb` string COMMENT '',
>`colc` string COMMENT '')
> PARTITIONED BY (
>`part1` string,
>`part2` string,
>`part3` string)
> STORED AS AVRO;
> CREATE TABLE `testa`(
>`col1` string COMMENT '',
>`col2` string COMMENT '',
>`col3` string COMMENT '',
>`col4` string COMMENT '',
>`col5` string COMMENT '')
> PARTITIONED BY (
>`part1` string,
>`part2` string,
>`part3` string)
> STORED AS AVRO;
> insert into testA partition (part1='US', part2='ABC', part3='123')
> values ('12.34', '100', '200', '300', 'abc'),
> ('12.341', '1001', '2001', '3001', 'abcd');
> insert into testA partition (part1='UK', part2='DEF', part3='123')
> values ('12.34', '100', '200', '300', 'abc'),
> ('12.341', '1001', '2001', '3001', 'abcd');
> insert into testA partition (part1='US', part2='DEF', part3='200')
> values ('12.34', '100', '200', '300', 'abc'),
> ('12.341', '1001', '2001', '3001', 'abcd');
> insert into testA partition (part1='CA', part2='ABC', part3='300')
> values ('12.34', '100', '200', '300', 'abc'),
> ('12.341', '1001', '2001', '3001', 'abcd');
> insert into testB partition (part1='CA', part2='ABC', part3='300')
> values ('600', '700', 'abc'), ('601', '701', 'abcd');
> insert into testB partition (part1='CA', part2='ABC', part3='400')
> values ( '600', '700', 'abc'), ( '601', '701', 'abcd');
> insert into testB partition (part1='UK', part2='PQR', part3='500')
> values ('600', '700', 'abc'), ('601', '701', 'abcd');
> insert into testB partition (part1='US', part2='DEF', part3='200')
> values ( '600', '700', 'abc'), ('601', '701', 'abcd');
> insert into testB partition (part1='US', part2='PQR', part3='123')
> values ( '600', '700', 'abc'), ('601', '701', 'abcd');
> -- views with deterministic functions
> create view viewDeterministicUDFA partitioned on (vpart1, vpart2, vpart3) as 
> select
>  cast(col1 as decimal(38,18)) as vcol1,
>  cast(col2 as decimal(38,18)) as vcol2,
>  cast(col3 as decimal(38,18)) as vcol3,
>  cast(col4 as decimal(38,18)) as vcol4,
>  cast(col5 as char(10)) as vcol5,
>  cast(part1 as char(2)) as vpart1,
>  cast(part2 as char(3)) as vpart2,
>  cast(part3 as char(3)) as vpart3
>  from testa
> where part1 in ('US', 'CA');
> create view viewDeterministicUDFB partitioned on (vpart1, vpart2, vpart3) as 
> select
>  cast(cola as decimal(38,18)) as vcolA,
>  cast(colb as decimal(38,18)) as vcolB,
>  cast(colc as char(10)) as vcolC,
>  cast(part1 as char(2)) as vpart1,
>  cast(part2 as char(3)) as vpart2,
>  cast(part3 as char(3)) as vpart3
>  from testb
> where part1 in ('US', 'CA');
> explain
> select vcol1, vcol2, vcol3, vcola, vcolb
> from viewDeterministicUDFA a inner join viewDeterministicUDFB b
> on a.vpart1 = b.vpart1
> and a.vpart2 = b.vpart2
> and a.vpart3 = b.vpart3
> and a.vpart1 = 'US'
> and a.vpart2 = 'DEF'
> and a.vpart3 = '200';
> {code}
> Plan where the CAST is not pushed down.
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: testa
> filterExpr: (part1) IN ('US', 'CA') (type: boolean)
> Statistics: Num rows: 6 Data size: 13740 Basic stats: COMPLETE 
> Column stats: NONE
> Select Operator
>   expressions: CAST( col1 AS decimal(38,18)) (type: 
> decimal(38,18)), CAST( col2 AS decimal(38,18)) (type: decimal(38,18)), CAST( 
> col3 AS decimal(38,18)) (type: decimal(38,18)), CAST( part1 AS CHAR(2)) 
> (type: char(2)), CAST( part2 AS CHAR(3)) (type: char(3)), CAST( part3 AS 
> CHAR(3)) (type: char(3))
>   outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7
>   Statistics: Num rows: 6 Data size: 13740 Basic stats: COMPLETE 
> Column stats: NONE
>   Filter Operator
> predicate: ((_col5 = 'US') and (_col6 = 'DEF') and (_col7 = 
> '200')) (type: boolean)
>

[jira] [Commented] (HIVE-20097) Convert standalone-metastore to a submodule

2018-07-12 Thread Alexander Kolbasov (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542316#comment-16542316
 ] 

Alexander Kolbasov commented on HIVE-20097:
---

Patch 6 fixes rat violations - the only change is the addition of this block to 
{{standalone-metastore/pom.xml}}:

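{code}
  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>org.apache.rat</groupId>
          <artifactId>apache-rat-plugin</artifactId>
          <version>0.10</version>
          <configuration>
            <excludes>
              <exclude>binary-package-licenses/**</exclude>
              <exclude>DEV-README</exclude>
              <exclude>**/src/main/sql/**</exclude>
              <exclude>**/README.md</exclude>
              <exclude>**/*.iml</exclude>
              <exclude>**/*.txt</exclude>
              <exclude>**/*.log</exclude>
              <exclude>**/*.arcconfig</exclude>
              <exclude>**/package-info.java</exclude>
              <exclude>**/*.properties</exclude>
              <exclude>**/*.q</exclude>
              <exclude>**/*.q.out</exclude>
              <exclude>**/*.xml</exclude>
              <exclude>**/gen/**</exclude>
              <exclude>**/patchprocess/**</exclude>
              <exclude>**/metastore_db/**</exclude>
            </excludes>
          </configuration>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
{code}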

The patch is merged to 
{code}
* commit 57dd30441a708f9fe653aea1c54df678ed459c34 (origin/master, origin/HEAD)
| Author: Aihua Xu 
| Date:   Fri Jun 29 14:40:43 2018 -0700
| 
| HIVE-20037: Print root cause exception's toString() rather than 
getMessage() (Aihua Xu, reviewed by Sahil Takiar)
{code}

> Convert standalone-metastore to a submodule
> ---
>
> Key: HIVE-20097
> URL: https://issues.apache.org/jira/browse/HIVE-20097
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Metastore, Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>Priority: Major
> Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, 
> HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, 
> HIVE-20097.06.patch
>
>
> This is a subtask to stage HIVE-17751 changes into several smaller phases.
> The first part is moving existing code in hive-standalone-metastore to a 
> sub-module.




[jira] [Updated] (HIVE-20090) Extend creation of semijoin reduction filters to be able to discover new opportunities

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20090:
---
Attachment: HIVE-20090.07.patch

> Extend creation of semijoin reduction filters to be able to discover new 
> opportunities
> --
>
> Key: HIVE-20090
> URL: https://issues.apache.org/jira/browse/HIVE-20090
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20090.01.patch, HIVE-20090.02.patch, 
> HIVE-20090.04.patch, HIVE-20090.05.patch, HIVE-20090.06.patch, 
> HIVE-20090.07.patch
>
>
> Assume the following plan:
> {noformat}
> TS[0] - RS[1] - JOIN[4] - RS[5] - JOIN[8] - FS[9]
> TS[2] - RS[3] - JOIN[4] 
> TS[6] - RS[7] - JOIN[8]
> {noformat}
> Currently, {{TS\[6\]}} may only be reduced with the output of {{RS\[5\]}}, 
> i.e., the input to the join between both subplans.
> However, it may be useful to consider other possibilities too, e.g., reduced 
> by the output of {{RS\[1\]}} or {{RS\[3\]}}. For instance, this is important 
> when, given a large plan, an edge between {{RS\[5\]}} and {{TS\[0\]}} would 
> create a cycle, while an edge between {{RS\[1\]}} and {{TS\[6\]}} would not.
> This patch comprises two parts. First, it creates additional predicates when 
> possible. Second, it removes duplicate semijoin reduction 
> branches/predicates, e.g., if another semijoin that consumes the output of 
> the same expression already reduces a certain table scan operator (a heuristic, 
> since this may not result in the most efficient plan in all cases). Ultimately, 
> the decision on whether to use one or another should be cost-driven 
> (follow-up).
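
The cycle concern in the description can be illustrated with a small reachability check. This is only a sketch under assumptions: operators are modeled as plain strings, and {{PlanGraph}}, {{addEdge}}, {{wouldCreateCycle}} and {{samplePlan}} are hypothetical names, not Hive classes or the logic of the patch. Adding an edge from one operator to another closes a cycle exactly when the source is already reachable from the target.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an operator DAG; not a Hive class.
final class PlanGraph {
  private final Map<String, List<String>> children = new HashMap<>();

  void addEdge(String from, String to) {
    children.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
  }

  /** Adding from -> to closes a cycle exactly when 'from' is reachable from 'to'. */
  boolean wouldCreateCycle(String from, String to) {
    Deque<String> stack = new ArrayDeque<>();
    Set<String> seen = new HashSet<>();
    stack.push(to);
    while (!stack.isEmpty()) {
      String op = stack.pop();
      if (op.equals(from)) {
        return true;
      }
      if (seen.add(op)) {
        // Visit each operator once; push its downstream operators.
        for (String child : children.getOrDefault(op, Collections.emptyList())) {
          stack.push(child);
        }
      }
    }
    return false;
  }

  /** The three-branch plan quoted in the issue description. */
  static PlanGraph samplePlan() {
    PlanGraph g = new PlanGraph();
    g.addEdge("TS[0]", "RS[1]");
    g.addEdge("RS[1]", "JOIN[4]");
    g.addEdge("JOIN[4]", "RS[5]");
    g.addEdge("RS[5]", "JOIN[8]");
    g.addEdge("JOIN[8]", "FS[9]");
    g.addEdge("TS[2]", "RS[3]");
    g.addEdge("RS[3]", "JOIN[4]");
    g.addEdge("TS[6]", "RS[7]");
    g.addEdge("RS[7]", "JOIN[8]");
    return g;
  }

  public static void main(String[] args) {
    PlanGraph g = samplePlan();
    System.out.println(g.wouldCreateCycle("RS[5]", "TS[0]"));
    System.out.println(g.wouldCreateCycle("RS[1]", "TS[6]"));
  }
}
```

On the sample plan, a semijoin edge from {{RS[5]}} back to {{TS[0]}} is rejected because {{TS[0]}} already feeds {{RS[5]}}, whereas an edge from {{RS[1]}} to {{TS[6]}} is accepted.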




[jira] [Updated] (HIVE-20097) Convert standalone-metastore to a submodule

2018-07-12 Thread Alexander Kolbasov (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-20097:
--
Attachment: HIVE-20097.06.patch

> Convert standalone-metastore to a submodule
> ---
>
> Key: HIVE-20097
> URL: https://issues.apache.org/jira/browse/HIVE-20097
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, Metastore, Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>Priority: Major
> Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, 
> HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch, 
> HIVE-20097.06.patch
>
>
> This is a subtask to stage HIVE-17751 changes into several smaller phases.
> The first part is moving existing code in hive-standalone-metastore to a 
> sub-module.




[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-12 Thread Misha Dmitriev (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: Patch Available  (was: In Progress)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.
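
The two ideas in the description (reuse one token per kind, and intern repeated strings) could look roughly like the sketch below. It rests on assumptions: {{SimpleToken}} and {{TokenCache}} are hypothetical stand-ins, not {{org.antlr.runtime.CommonToken}} or the code in the attached patches, and sharing instances is only safe if callers never mutate the shared token. The type id 7 is an arbitrary illustrative value.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable token; the real class discussed is org.antlr.runtime.CommonToken.
final class SimpleToken {
  final int type;
  final String text;

  SimpleToken(int type, String text) {
    this.type = type;
    this.text = text;
  }
}

// Hypothetical cache that hands out one shared token per token type.
final class TokenCache {
  private static final Map<Integer, SimpleToken> CACHE = new ConcurrentHashMap<>();

  /** Returns the shared token for this type, creating it on first use only. */
  static SimpleToken of(int type, String text) {
    // intern() makes equal token texts share a single String instance.
    return CACHE.computeIfAbsent(type, t -> new SimpleToken(t, text.intern()));
  }

  public static void main(String[] args) {
    SimpleToken first = of(7, "TOK_INSERT");
    SimpleToken second = of(7, new String("TOK_INSERT"));
    // Both calls yield the same object, so no duplicate token is allocated.
    System.out.println(first == second);
  }
}
```

Whether a cache like this is appropriate depends on the tokens really being treated as constants, as the description suggests.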




[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-12 Thread Misha Dmitriev (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Attachment: HIVE-19668.03.patch

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.




[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-12 Thread Misha Dmitriev (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: In Progress  (was: Patch Available)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.




[jira] [Commented] (HIVE-17852) remove support for list bucketing "stored as directories" in 3.0

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542311#comment-16542311
 ] 

Hive QA commented on HIVE-17852:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931284/HIVE-17852.20.patch

{color:green}SUCCESS:{color} +1 due to 36 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14647 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters]
 (batchId=154)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12566/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12566/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12566/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931284 - PreCommit-HIVE-Build

> remove support for list bucketing "stored as directories" in 3.0
> 
>
> Key: HIVE-17852
> URL: https://issues.apache.org/jira/browse/HIVE-17852
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Laszlo Bodor
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17852.01.patch, HIVE-17852.02.patch, 
> HIVE-17852.03.patch, HIVE-17852.04.patch, HIVE-17852.05.patch, 
> HIVE-17852.06.patch, HIVE-17852.07.patch, HIVE-17852.08.patch, 
> HIVE-17852.09.patch, HIVE-17852.10.patch, HIVE-17852.11.patch, 
> HIVE-17852.12.patch, HIVE-17852.13.patch, HIVE-17852.14.patch, 
> HIVE-17852.15.patch, HIVE-17852.16.patch, HIVE-17852.17.patch, 
> HIVE-17852.18.patch, HIVE-17852.19.patch, HIVE-17852.20.patch
>
>
> From the email thread:
> 1) LB, when stored as directories, adds a lot of low-level complexity to Hive 
> tables that has to be accounted for in many places in the code where the 
> files are written or modified - from FSOP to ACID/replication/export.
> 2) While working on some FSOP code I noticed that some of that logic is 
> broken - e.g. the duplicate file removal from tasks, a pretty fundamental 
> correctness feature in Hive, may be broken. LB also doesn’t appear to be 
> compatible with e.g. regular bucketing.
> 3) The feature hasn’t seen development activity in a while; it also doesn’t 
> appear to be used a lot.
> Keeping with the theme of cleaning up “legacy” code for 3.0, I was proposing 
> we remove it.
> (2) also suggested that, if needed, it might be easier to implement similar 
> functionality by adding some flexibility to partitions (which LB directories 
> look like anyway); that would also keep the logic on a higher level of 
> abstraction (split generation, partition pruning) as opposed to many 
> low-level places like FSOP, etc.




[jira] [Commented] (HIVE-20151) External table: exception while storing stats

2018-07-12 Thread Jaume M (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542288#comment-16542288
 ] 

Jaume M commented on HIVE-20151:


Duplicate of https://issues.apache.org/jira/browse/HIVE-19316 [~kgyrtkirk]?

> External table: exception while storing stats
> -
>
> Key: HIVE-20151
> URL: https://issues.apache.org/jira/browse/HIVE-20151
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Statistics
>Reporter: Zoltan Haindrich
>Priority: Major
>
> {code}
> create external table e3(a integer,b string,c double);
> -- goes well
> insert into e3 values(1,'2',3);
> -- takes a while:
> insert into e3 values(1,'2',3);
> -- after 2 minutes
> --
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.StatsTask
> INFO  : Completed executing 
> command(queryId=hive_20180712120342_6893e234-44a0-4e48-8320-f1699557bae3); 
> Time taken: 125.276 seconds
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.StatsTask (state=08S01,code=1)
> {code}
> exception in metastore logs:
> {code}
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.metastore.api.LongColumnStatsData cannot be cast to 
> org.apache.hadoop.hive.metastore.columnstats.cache.LongColumnStatsDataInspector
>  at 
> org.apache.hadoop.hive.metastore.columnstats.merge.LongColumnStatsMerger.merge(LongColumnStatsMerger.java:30)
>  ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632]
>  at 
> org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.mergeColStats(MetaStoreUtils.java:1084)
>  ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632]
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7514)
>  ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632]
>  at sun.reflect.GeneratedMethodAccessor80.invoke(Unknown Source) ~[?:?]
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_161]
>  at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_161]
>  at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632]
>  at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632]
>  at com.sun.proxy.$Proxy34.set_aggr_stats_for(Unknown Source) ~[?:?]
>  at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:17017)
>  ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632]
>  at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:17001)
>  ~[hive-exec-3.1.0.3.0.0.0-1632.jar:3.1.0.3.0.0.0-1632]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20158) Do Not Print StackTraces to STDERR in Base64TextOutputFormat

2018-07-12 Thread BELUGA BEHR (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542278#comment-16542278
 ] 

BELUGA BEHR commented on HIVE-20158:


The class {{Base64TextInputFormat}} has the same issue.

> Do Not Print StackTraces to STDERR in Base64TextOutputFormat
> 
>
> Key: HIVE-20158
> URL: https://issues.apache.org/jira/browse/HIVE-20158
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Priority: Trivial
>  Labels: newbie, noob
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/base64/Base64TextOutputFormat.java
> {code}
>   try {
>     String signatureString = job.get("base64.text.output.format.signature");
>     if (signatureString != null) {
>       signature = signatureString.getBytes("UTF-8");
>     } else {
>       signature = new byte[0];
>     }
>   } catch (UnsupportedEncodingException e) {
>     e.printStackTrace();
>   }
> {code}
> The {{UnsupportedEncodingException}} comes from the {{getBytes(String)}} method 
> call.  Instead, use the {{Charset}} overload of the method, which does not 
> throw this checked exception, so the 'try' block can simply be removed.  
> Every JVM is required to support UTF-8.
> https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#getBytes(java.nio.charset.Charset)
> https://docs.oracle.com/javase/7/docs/api/java/nio/charset/StandardCharsets.html#UTF_8
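
A minimal sketch of the change proposed in the description above. It assumes a plain Map in place of the Hadoop job configuration so that it compiles on its own; the names SignatureSketch and readSignature are hypothetical and are not taken from the Hive source. String.getBytes(Charset) declares no checked exception, so the try/catch is no longer needed:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the code in Base64TextOutputFormat.
// A Map replaces the job configuration so the sketch is self-contained.
class SignatureSketch {
  static byte[] readSignature(Map<String, String> job) {
    String signatureString = job.get("base64.text.output.format.signature");
    // getBytes(Charset) throws no UnsupportedEncodingException,
    // so there is nothing to catch and nothing to print to STDERR.
    return signatureString != null
        ? signatureString.getBytes(StandardCharsets.UTF_8)
        : new byte[0];
  }

  public static void main(String[] args) {
    Map<String, String> job = new HashMap<>();
    System.out.println(readSignature(job).length);
    job.put("base64.text.output.format.signature", "abc");
    System.out.println(readSignature(job).length);
  }
}
```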




[jira] [Updated] (HIVE-20159) Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20159:
---
Description: 
https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java#L121

{code}
} catch (IOException e) {
  e.printStackTrace();
}
{code}

Introduce an SLF4J logger to this class and print a WARN level log message if 
the {{IOException}} from {{Utilities.listStatusIfExists}} is generated.  I 
suggest WARN because the entire operation doesn't fail if this error happens.  
It continues on its way with the data that it was able to collect.  I'm not 
sure if this is the intended behavior, but for now, a helpful warning message 
in the logging would be better.

  was:
https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java#L121

{code}
} catch (IOException e) {
  e.printStackTrace();
}
{code}

Introduce an SLF4J logger to this class and print a WARN level log message if 
the {{IOException}} from {{Utilities.listStatusIfExists}} is generated.  I 
suggest WARN because the entire operation doesn't fail if this error happens.  
It continues on its way with the data that it was able to collect.  I'm not 
sure if this is the intended behavior, but for now, an error message in the 
logging would be better.


> Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin
> -
>
> Key: HIVE-20159
> URL: https://issues.apache.org/jira/browse/HIVE-20159
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
>
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java#L121
> {code}
> } catch (IOException e) {
>   e.printStackTrace();
> }
> {code}
> Introduce an SLF4J logger to this class and print a WARN level log message if 
> the {{IOException}} from {{Utilities.listStatusIfExists}} is generated.  I 
> suggest WARN because the entire operation doesn't fail if this error happens. 
>  It continues on its way with the data that it was able to collect.  I'm not 
> sure if this is the intended behavior, but for now, a helpful warning message 
> in the logging would be better.
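
A rough sketch of the suggested pattern, under stated assumptions: java.util.logging stands in for SLF4J only so that the example has no dependencies (with SLF4J the equivalent call would be LOG.warn(String, Throwable)), and the Lister interface and collect method are hypothetical, not the actual ConditionalResolverSkewJoin code. The failure is logged at WARN level with the exception attached, and processing continues with whatever was collected:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the catch block discussed above.
class SkewJoinLoggingSketch {
  private static final Logger LOG =
      Logger.getLogger(SkewJoinLoggingSketch.class.getName());

  // Hypothetical replacement for a directory listing call that may fail.
  interface Lister {
    List<String> list(String dir) throws IOException;
  }

  static List<String> collect(List<String> dirs, Lister lister) {
    List<String> found = new ArrayList<>();
    for (String dir : dirs) {
      try {
        found.addAll(lister.list(dir));
      } catch (IOException e) {
        // WARNING is the java.util.logging counterpart of WARN. The exception
        // is passed to the logger instead of being printed to STDERR.
        LOG.log(Level.WARNING, "Could not list " + dir + "; skipping it", e);
      }
    }
    return found;
  }
}
```

Passing the exception as the last argument keeps the stack trace in the log file, where it is subject to the configured log level and appenders.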




[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-12 Thread Misha Dmitriev (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542265#comment-16542265
 ] 

Misha Dmitriev commented on HIVE-19668:
---

Thank you for checking, [~vihangk1] [~aihuaxu] and [~stakiar]. In the end, it 
turns out that at least some failures are reproducible locally, and my changes 
are responsible. Not all {{CommonToken}}s can be made {{ImmutableToken}}s, 
because for some of them the type may be rewritten in some special operators 
later. I've already found one such type in the past, and I am now eliminating 
the others. I will post the updated patch once I am done.

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> It looks like these particular {{CommonToken}} objects are constants that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently created repeatedly with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources and are easy enough to fix by adding 
> String.intern() calls.
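
The two ideas in the description (one shared instance per constant token kind, and interned token text) could be sketched as follows. ConstToken is a hypothetical immutable class written for this illustration; it is not the ANTLR CommonToken API and not the attached patch. The cache is keyed by token type, which assumes the text for a given type never varies:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable token, used only to illustrate instance reuse.
final class ConstToken {
  final int type;
  final String text;

  // One shared instance per token type.
  private static final Map<Integer, ConstToken> CACHE = new ConcurrentHashMap<>();

  private ConstToken(int type, String text) {
    this.type = type;
    this.text = text;
  }

  static ConstToken of(int type, String text) {
    // computeIfAbsent creates the token on first use only; intern() makes
    // equal token texts share a single String instance.
    return CACHE.computeIfAbsent(type, t -> new ConstToken(t, text.intern()));
  }
}
```

Sharing instances this way is only safe for tokens that are never modified after creation, which is why the fields are final here.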




[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-12 Thread Sahil Takiar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542258#comment-16542258
 ] 

Sahil Takiar commented on HIVE-19668:
-

[~mi...@cloudera.com] unless you can reproduce the test failures locally, it's 
unlikely the failed tests are related to your patch. If you can reproduce them 
locally, let me know the stack trace and I can help you debug.

Otherwise, can you rebase the patch and post an updated version? This will 
re-trigger Hive QA and re-run all the tests.

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com/]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> It looks like these particular {{CommonToken}} objects are constants that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently created repeatedly with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources and are easy enough to fix by adding 
> String.intern() calls.




[jira] [Updated] (HIVE-20157) Do Not Print StackTraces to STDERR in ParseDriver

2018-07-12 Thread BELUGA BEHR (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20157:
---
Summary: Do Not Print StackTraces to STDERR in ParseDriver  (was: Do Not 
Print StackTraces to STDERR)

> Do Not Print StackTraces to STDERR in ParseDriver
> -
>
> Key: HIVE-20157
> URL: https://issues.apache.org/jira/browse/HIVE-20157
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
>  Labels: newbie, noob
>
> {{org/apache/hadoop/hive/ql/parse/ParseDriver.java}}
> {code}
> catch (RecognitionException e) {
>   e.printStackTrace();
>   throw new ParseException(parser.errors);
> }
> {code}
> Do not use {{e.printStackTrace()}} and print to STDERR.  Either remove or 
> replace with a debug-level log statement.  I would vote to simply remove.  
> There are several occurrences of this pattern in this class.




[jira] [Commented] (HIVE-20147) Hive streaming ingest is contented on synchronized logging

2018-07-12 Thread Prasanth Jayachandran (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542254#comment-16542254
 ] 

Prasanth Jayachandran commented on HIVE-20147:
--

If the clients of this API do not use async logging, there is high contention 
in the logging done by the streaming API, specifically logStats before/after 
commit and close. I changed these log levels to DEBUG. 

In one of the test applications using async logging + this patch, the log 
contention is no longer seen in the profiles. 
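
To illustrate why demoting a statement to DEBUG takes it off the hot path, here is a small sketch. It uses java.util.logging (where FINE is the closest analogue of DEBUG) purely to stay dependency-free; Hive itself logs through SLF4J, and the names StatsLoggingSketch, describeStats and commit are hypothetical. When the level is disabled, the message is never built and no appender is reached:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Logger;

// Hypothetical stand-in for a streaming writer that logs stats around commit.
class StatsLoggingSketch {
  static final Logger LOG = Logger.getLogger("StatsLoggingSketch");

  // Counts how often the stats message was actually built.
  static final AtomicInteger statsBuilt = new AtomicInteger();

  static String describeStats(long records) {
    statsBuilt.incrementAndGet();
    return "records written: " + records;
  }

  static void commit(long records) {
    // The Supplier is evaluated only if FINE is enabled for this logger,
    // so at the default INFO level this line does no formatting and no I/O.
    LOG.fine(() -> "Stats before commit: " + describeStats(records));
  }
}
```

This only covers the log-level half of the comment above; asynchronous appenders are a separate, configuration-level change.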

> Hive streaming ingest is contented on synchronized logging
> --
>
> Key: HIVE-20147
> URL: https://issues.apache.org/jira/browse/HIVE-20147
> Project: Hive
>  Issue Type: Bug
>  Components: Streaming, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20147.1.patch, Screen Shot 2018-07-11 at 4.17.27 
> PM.png, sync-logger-contention.svg
>
>
> In one of the observed profiles, >30% of the time was spent on synchronized 
> logging. See attachment. 
> We should use async logging for hive streaming ingest by default.  !Screen 
> Shot 2018-07-11 at 4.17.27 PM.png!




[jira] [Updated] (HIVE-20147) Hive streaming ingest is contented on synchronized logging

2018-07-12 Thread Prasanth Jayachandran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20147:
-
Attachment: HIVE-20147.1.patch

> Hive streaming ingest is contented on synchronized logging
> --
>
> Key: HIVE-20147
> URL: https://issues.apache.org/jira/browse/HIVE-20147
> Project: Hive
>  Issue Type: Bug
>  Components: Streaming, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20147.1.patch, Screen Shot 2018-07-11 at 4.17.27 
> PM.png, sync-logger-contention.svg
>
>
> In one of the observed profiles, >30% of the time was spent on synchronized 
> logging. See attachment. 
> We should use async logging for hive streaming ingest by default.  !Screen 
> Shot 2018-07-11 at 4.17.27 PM.png!




[jira] [Updated] (HIVE-20147) Hive streaming ingest is contented on synchronized logging

2018-07-12 Thread Prasanth Jayachandran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20147:
-
Status: Patch Available  (was: Open)

> Hive streaming ingest is contented on synchronized logging
> --
>
> Key: HIVE-20147
> URL: https://issues.apache.org/jira/browse/HIVE-20147
> Project: Hive
>  Issue Type: Bug
>  Components: Streaming, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20147.1.patch, Screen Shot 2018-07-11 at 4.17.27 
> PM.png, sync-logger-contention.svg
>
>
> In one of the observed profiles, >30% of the time was spent on synchronized 
> logging. See attachment. 
> We should use async logging for hive streaming ingest by default.  !Screen 
> Shot 2018-07-11 at 4.17.27 PM.png!




[jira] [Updated] (HIVE-19886) Logs may be directed to 2 files if --hiveconf hive.log.file is used

2018-07-12 Thread Jaume M (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-19886:
---
Attachment: HIVE-19886.3.patch
Status: Patch Available  (was: In Progress)

> Logs may be directed to 2 files if --hiveconf hive.log.file is used
> ---
>
> Key: HIVE-19886
> URL: https://issues.apache.org/jira/browse/HIVE-19886
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Jaume M
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19886.2.patch, HIVE-19886.2.patch, 
> HIVE-19886.3.patch, HIVE-19886.patch
>
>
> The hive launch script explicitly specifies the log4j2 configuration file to 
> use. The main() methods in HiveServer2 and HiveMetastore reconfigure the 
> logger based on user input via --hiveconf hive.log.file. This may cause logs 
> to end up in 2 different files. Initial logs go to the file specified in 
> hive-log4j2.properties, and after logger reconfiguration the rest of the logs 
> go to the file specified via --hiveconf hive.log.file. 




[jira] [Updated] (HIVE-19886) Logs may be directed to 2 files if --hiveconf hive.log.file is used

2018-07-12 Thread Jaume M (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-19886:
---
Status: In Progress  (was: Patch Available)

> Logs may be directed to 2 files if --hiveconf hive.log.file is used
> ---
>
> Key: HIVE-19886
> URL: https://issues.apache.org/jira/browse/HIVE-19886
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Jaume M
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19886.2.patch, HIVE-19886.2.patch, HIVE-19886.patch
>
>
> The hive launch script explicitly specifies the log4j2 configuration file to 
> use. The main() methods in HiveServer2 and HiveMetastore reconfigure the 
> logger based on user input via --hiveconf hive.log.file. This may cause logs 
> to end up in 2 different files. Initial logs go to the file specified in 
> hive-log4j2.properties, and after logger reconfiguration the rest of the logs 
> go to the file specified via --hiveconf hive.log.file. 




[jira] [Updated] (HIVE-20037) Print root cause exception's toString() rather than getMessage()

2018-07-12 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-20037:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~stakiar] for reviewing.

> Print root cause exception's toString() rather than getMessage()
> 
>
> Key: HIVE-20037
> URL: https://issues.apache.org/jira/browse/HIVE-20037
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Trivial
> Fix For: 4.0.0
>
> Attachments: HIVE-20037.1.patch, HIVE-20037.2.patch
>
>
> When we run a HoS job and it fails, we print the exception's getMessage() 
> rather than its toString(). For some exceptions, e.g. 
> java.lang.NoClassDefFoundError, this drops the exception type information. 
> {noformat}
> Failed to execute Spark task Stage-1, with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark 
> client for Spark session cf054497-b073-4327-a315-68c867ce3434: 
> org/apache/spark/SparkConf)'
> {noformat}
> If we use the exception's toString(), the output is as follows, which makes more sense.
> {noformat}
> Failed to execute Spark task Stage-1, with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark 
> client for Spark session cf054497-b073-4327-a315-68c867ce3434: 
> java.lang.NoClassDefFoundError: org/apache/spark/SparkConf)'
> {noformat}
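
The difference between the two calls can be reproduced with a few lines of plain Java. This is only an illustration of Throwable.getMessage() versus Throwable.toString(); the class name ExceptionTextSketch and the shortened message prefix are made up for the example and are not the Hive on Spark code:

```java
// Hypothetical helper showing how the two calls render the same cause.
class ExceptionTextSketch {
  static String viaGetMessage(Throwable cause) {
    // getMessage() returns only the detail message, without the exception type.
    return "Failed to create Spark client: " + cause.getMessage();
  }

  static String viaToString(Throwable cause) {
    // toString() prefixes the detail message with the exception class name.
    return "Failed to create Spark client: " + cause.toString();
  }

  public static void main(String[] args) {
    Throwable cause = new NoClassDefFoundError("org/apache/spark/SparkConf");
    // prints: Failed to create Spark client: org/apache/spark/SparkConf
    System.out.println(viaGetMessage(cause));
    // prints: Failed to create Spark client: java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
    System.out.println(viaToString(cause));
  }
}
```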




[jira] [Commented] (HIVE-20149) TestHiveCli failing/timing out

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542197#comment-16542197
 ] 

Hive QA commented on HIVE-20149:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931275/HIVE-20149.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14650 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12565/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12565/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12565/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931275 - PreCommit-HIVE-Build

> TestHiveCli failing/timing out
> --
>
> Key: HIVE-20149
> URL: https://issues.apache.org/jira/browse/HIVE-20149
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20149.1.patch
>
>





[jira] [Assigned] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Aihua Xu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-20153:
---

Assignee: Aihua Xu

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side, where they worked 
> before in Hive1. 
> In many queries, we have to double the mapper memory settings (in our 
> particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
> makes it not so easy to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Updated] (HIVE-20102) Add a couple of additional tests for query parsing

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20102:
---
   Resolution: Fixed
Fix Version/s: 4.0.0
   3.1.0
   Status: Resolved  (was: Patch Available)

Pushed to master, branch-3, branch-3.1. Thanks [~ashutoshc]

> Add a couple of additional tests for query parsing
> --
>
> Key: HIVE-20102
> URL: https://issues.apache.org/jira/browse/HIVE-20102
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-20102.01.patch, HIVE-20102.02.patch, 
> HIVE-20102.patch
>
>





[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Aihua Xu (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542161#comment-16542161
 ] 

Aihua Xu commented on HIVE-20153:
-

[~szehon] Nice to see you again. :) I will take a look. Do you have the full 
heap dump? If it's too big, you may try to use http://www.jxray.com/ to 
generate a small file.

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side, where they worked 
> before in Hive1. 
> In many queries, we have to double the mapper memory settings (in our 
> particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
> makes it not so easy to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Commented] (HIVE-20149) TestHiveCli failing/timing out

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542157#comment-16542157
 ] 

Hive QA commented on HIVE-20149:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
31s{color} | {color:blue} beeline in master has 53 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} beeline: The patch generated 0 new + 39 unchanged - 
1 fixed = 39 total (was 40) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-12565/dev-support/hive-personality.sh
 |
| git revision | master / 3fa7f0c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: beeline U: beeline |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-12565/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> TestHiveCli failing/timing out
> --
>
> Key: HIVE-20149
> URL: https://issues.apache.org/jira/browse/HIVE-20149
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20149.1.patch
>
>





[jira] [Commented] (HIVE-19820) add ACID stats support to background stats updater and fix bunch of edge cases found in SU tests

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542138#comment-16542138
 ] 

Hive QA commented on HIVE-19820:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931270/branch-19820.02.nogen.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12564/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12564/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12564/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12931270/branch-19820.02.nogen.patch
 was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931270 - PreCommit-HIVE-Build

> add ACID stats support to background stats updater and fix bunch of edge 
> cases found in SU tests
> 
>
> Key: HIVE-19820
> URL: https://issues.apache.org/jira/browse/HIVE-19820
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19820.01-master-txnstats.patch, 
> HIVE-19820.01.patch, HIVE-19820.02-master-txnstats.patch, 
> HIVE-19820.03-master-txnstats.patch, HIVE-19820.04-master-txnstats.patch, 
> HIVE-19820.patch, branch-19820.02.nogen.patch, branch-19820.nogen.patch, 
> branch-19820.nogen.patch
>
>
> Follow-up from HIVE-19418.
> Right now it checks whether stats are valid in an old-fashioned way... and 
> also gets ACID state, and discards it without using.
> When ACID stats are implemented, ACID state needs to be used to do 
> version-aware valid stats checks.




[jira] [Commented] (HIVE-20102) Add a couple of additional tests for query parsing

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542136#comment-16542136
 ] 

Hive QA commented on HIVE-20102:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931248/HIVE-20102.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14650 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12563/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12563/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12563/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931248 - PreCommit-HIVE-Build

> Add a couple of additional tests for query parsing
> --
>
> Key: HIVE-20102
> URL: https://issues.apache.org/jira/browse/HIVE-20102
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20102.01.patch, HIVE-20102.02.patch, 
> HIVE-20102.patch
>
>





[jira] [Commented] (HIVE-20102) Add a couple of additional tests for query parsing

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542133#comment-16542133
 ] 

Hive QA commented on HIVE-20102:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 58s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  3s{color} | {color:blue} ql in master has 2288 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 46s{color} | {color:red} ql: The patch generated 1 new + 492 unchanged - 0 fixed = 493 total (was 492) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 10s{color} | {color:red} root: The patch generated 1 new + 493 unchanged - 0 fixed = 494 total (was 493) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m  2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-12563/dev-support/hive-personality.sh |
| git revision | master / e0c2b9d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12563/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-12563/yetus/diff-checkstyle-root.txt |
| modules | C: ql . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-12563/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add a couple of additional tests for query parsing
> --
>
> Key: HIVE-20102
> URL: https://issues.apache.org/jira/browse/HIVE-20102
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20102.01.patch, HIVE-20102.02.patch, 
> HIVE-20102.patch
>
>





[jira] [Commented] (HIVE-20079) Populate more accurate rawDataSize for parquet format

2018-07-12 Thread Sahil Takiar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542117#comment-16542117
 ] 

Sahil Takiar commented on HIVE-20079:
-

Looks similar to HIVE-16887

> Populate more accurate rawDataSize for parquet format
> -
>
> Key: HIVE-20079
> URL: https://issues.apache.org/jira/browse/HIVE-20079
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-20079.1.patch
>
>
> Run the following queries and you will see that rawDataSize for the table is 
> incorrectly reported as 4 (the number of fields). We need to populate the 
> correct data size so that data can be split properly.
> {noformat}
> SET hive.stats.autogather=true;
> CREATE TABLE parquet_stats (id int,str string) STORED AS PARQUET;
> INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1');
> DESC FORMATTED parquet_stats;
> {noformat}
> {noformat}
> Table Parameters:
>   COLUMN_STATS_ACCURATE   true
>   numFiles1
>   numRows 2
>   rawDataSize 4
>   totalSize   373
>   transient_lastDdlTime   1530660523
> {noformat}




[jira] [Assigned] (HIVE-20155) Semijoin Reduction : Put all the min-max filters before all the bloom filters

2018-07-12 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-20155:
-


> Semijoin Reduction : Put all the min-max filters before all the bloom filters
> -
>
> Key: HIVE-20155
> URL: https://issues.apache.org/jira/browse/HIVE-20155
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> If there is more than one semijoin reduction filter, apply all min-max 
> filters before any of the bloom filters, since bloom filter lookups are 
> expensive.
>  
> cc [~gopalv] [~jdere]
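
The ordering requested above can be sketched roughly as follows. This is only an illustration, not Hive's implementation: the class `SemijoinFilterOrderSketch`, its `Filter` type, and the `cheapFirst` helper are hypothetical names introduced here. The only assumption taken from the description is that min-max checks are cheaper than bloom filter lookups.

```java
import java.util.ArrayList;
import java.util.List;

public class SemijoinFilterOrderSketch {
    // Hypothetical filter kinds; these are not Hive's actual classes.
    enum Kind { MIN_MAX, BLOOM }

    static final class Filter {
        final String name;
        final Kind kind;

        Filter(String name, Kind kind) {
            this.name = name;
            this.kind = kind;
        }
    }

    // Cheap min-max checks rank ahead of expensive bloom filter lookups.
    private static int rank(Filter f) {
        return f.kind == Kind.MIN_MAX ? 0 : 1;
    }

    // Returns a copy with every min-max filter ahead of every bloom filter.
    // List.sort is stable, so the relative order within each kind is kept.
    static List<Filter> cheapFirst(List<Filter> filters) {
        List<Filter> ordered = new ArrayList<>(filters);
        ordered.sort((a, b) -> Integer.compare(rank(a), rank(b)));
        return ordered;
    }

    // Builds an interleaved input and returns the reordered filter names.
    static String demo() {
        List<Filter> input = new ArrayList<>();
        input.add(new Filter("bloom(a)", Kind.BLOOM));
        input.add(new Filter("minmax(a)", Kind.MIN_MAX));
        input.add(new Filter("bloom(b)", Kind.BLOOM));
        input.add(new Filter("minmax(b)", Kind.MIN_MAX));

        List<String> names = new ArrayList<>();
        for (Filter f : cheapFirst(input)) {
            names.add(f.name);
        }
        return String.join(",", names);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A stable sort is used so that any existing order among filters of the same kind is left alone; only the min-max versus bloom ordering changes.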




[jira] [Updated] (HIVE-20142) Semijoin Reduction : Perform cost based removal after rule based removal.

2018-07-12 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20142:
--
Description: 
The semijoin reduction removal logic is spread out across multiple functions. 
Currently, the cost-based removal logic is applied before the rule-based (dumb) 
one. 

Instead, apply the rule-based removal logic first and then apply the cost-based 
removal.

 

cc [~jdere] [~jcamachorodriguez] [~gopalv]

  was:
The semijoin reduction removal logic is spread out across multiple functions. 
Currently, the cost-based removal logic is applied before the rule-based (dumb) 
one. 

Instead, apply the rule-based removal logic first and then apply the cost-based 
removal.

 

cc [~jdere] [~jcamachorodriguez]


> Semijoin Reduction : Perform cost based removal after rule based removal.
> 
>
> Key: HIVE-20142
> URL: https://issues.apache.org/jira/browse/HIVE-20142
> Project: Hive
>  Issue Type: Task
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> The semijoin reduction removal logic is spread out across multiple functions. 
> Currently, the cost-based removal logic is applied before the rule-based 
> (dumb) one. 
> Instead, apply the rule-based removal logic first and then apply the 
> cost-based removal.
>  
> cc [~jdere] [~jcamachorodriguez] [~gopalv]




[jira] [Updated] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output

2018-07-12 Thread Matt McCline (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20091:

Fix Version/s: 4.0.0
   3.1.0

> Tez: Add security credentials for FileSinkOperator output
> -
>
> Key: HIVE-20091
> URL: https://issues.apache.org/jira/browse/HIVE-20091
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, 
> HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, 
> HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch
>
>
> DagUtils needs to add security credentials for the output for the 
> FileSinkOperator.




[jira] [Updated] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output

2018-07-12 Thread Matt McCline (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-20091:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Tez: Add security credentials for FileSinkOperator output
> -
>
> Key: HIVE-20091
> URL: https://issues.apache.org/jira/browse/HIVE-20091
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, 
> HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, 
> HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch
>
>
> DagUtils needs to add security credentials for the output for the 
> FileSinkOperator.




[jira] [Commented] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output

2018-07-12 Thread Matt McCline (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542082#comment-16542082
 ] 

Matt McCline commented on HIVE-20091:
-

Committed to master and branch-3.

> Tez: Add security credentials for FileSinkOperator output
> -
>
> Key: HIVE-20091
> URL: https://issues.apache.org/jira/browse/HIVE-20091
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, 
> HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, 
> HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch
>
>
> DagUtils needs to add security credentials for the output for the 
> FileSinkOperator.




[jira] [Commented] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output

2018-07-12 Thread Matt McCline (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542079#comment-16542079
 ] 

Matt McCline commented on HIVE-20091:
-

Successful test run.

> Tez: Add security credentials for FileSinkOperator output
> -
>
> Key: HIVE-20091
> URL: https://issues.apache.org/jira/browse/HIVE-20091
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, 
> HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, 
> HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch
>
>
> DagUtils needs to add security credentials for the output for the 
> FileSinkOperator.




[jira] [Updated] (HIVE-20154) Improve unix_timestamp(args) to handle automatic DST-switching timezones

2018-07-12 Thread Vincent Tran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Tran updated HIVE-20154:

Summary: Improve unix_timestamp(args) to handle automatic DST-switching 
timezones  (was: Improve unix_timestamp(args) to handle automatic-DST switching 
timezones)

> Improve unix_timestamp(args) to handle automatic DST-switching timezones
> 
>
> Key: HIVE-20154
> URL: https://issues.apache.org/jira/browse/HIVE-20154
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 1.1.0
>Reporter: Vincent Tran
>Priority: Major
>
> Currently unix_timestamp(args) UDF will only handle static timezone 
> specifiers. It does not recognize SystemV specifiers such as EST5EDT or 
> PST8PDT.
> Based on this experiment, when z is used to parse a TZ string like UTC4PDT 
> (obviously not a valid SystemV specifier), it will parse the time as UTC.
> When zz is used to parse a TZ string like UTC4PDT, it will parse the 
> timestamp in the TZ at the final z position. This is demonstrated by my final 
> query, where the format string z4z1z is used to parse UTC4PDT1EDT.
> {noformat}
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd 
> HH:mm:ss z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 16:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.041 seconds)
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC", "-MM-dd HH:mm:ss 
> z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 16:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.041 seconds)
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd 
> HH:mm:ss z4z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 23:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.047 seconds)
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT1EDT", "-MM-dd 
> HH:mm:ss z4z1z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 20:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.055 seconds)
> 0: jdbc:hive2://localhost:1/default>;
> {noformat}
> So all in all, I don't think the SystemV specifiers EST5EDT or PST8PDT are 
> valid for unix_timestamp(args) at all. When parsed with the zz format 
> string, they will be read as whatever valid timezone is at the final 
> position (effectively EDT and PDT respectively when the valid SystemV TZ 
> specifiers above are used).
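
For comparison, here is a minimal sketch of what DST-aware handling of a specifier such as EST5EDT looks like in plain Java. It is not how unix_timestamp(args) is implemented and is not a proposed patch: the class `DstAwareEpochSketch` and its `toEpochSeconds` helper are hypothetical, and the sketch assumes the JVM's time zone database knows the IDs EST5EDT and PST8PDT. The zone ID is passed separately from the local date-time so that no format-string parsing of the zone is involved.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class DstAwareEpochSketch {
    // Local date-time pattern only; the zone is supplied as a separate argument.
    private static final DateTimeFormatter LOCAL =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    // Interprets the local date-time in the given zone and returns epoch seconds.
    // The zone rules decide whether standard or daylight time applies on that date.
    static long toEpochSeconds(String localDateTime, String zoneId) {
        return LocalDateTime.parse(localDateTime, LOCAL)
            .atZone(ZoneId.of(zoneId))
            .toEpochSecond();
    }

    public static void main(String[] args) {
        // Offset from UTC, in seconds, for a winter date and a summer date.
        long winter = toEpochSeconds("2018-02-01 00:00:00", "EST5EDT")
            - toEpochSeconds("2018-02-01 00:00:00", "UTC");
        long summer = toEpochSeconds("2018-07-01 00:00:00", "EST5EDT")
            - toEpochSeconds("2018-07-01 00:00:00", "UTC");
        System.out.println("winter offset seconds: " + winter);
        System.out.println("summer offset seconds: " + summer);
    }
}
```

With a DST-switching zone, the same wall-clock time maps to different UTC offsets in February and July, which is the behavior this issue asks unix_timestamp(args) to support.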




[jira] [Updated] (HIVE-20154) Improve unix_timestamp(args) to handle automatic-DST switching timezones

2018-07-12 Thread Vincent Tran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Tran updated HIVE-20154:

Component/s: UDF

> Improve unix_timestamp(args) to handle automatic-DST switching timezones
> 
>
> Key: HIVE-20154
> URL: https://issues.apache.org/jira/browse/HIVE-20154
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 1.1.0
>Reporter: Vincent Tran
>Priority: Major
>
> Currently unix_timestamp(args) UDF will only handle static timezone 
> specifiers. It does not recognize SystemV specifiers such as EST5EDT or 
> PST8PDT.
> Based on this experiment, when z is used to parse a TZ string like UTC4PDT 
> (obviously not a valid SystemV specifier), it will parse the time as UTC.
> When zz is used to parse a TZ string like UTC4PDT, it will parse the 
> timestamp in the TZ at the final z position. This is demonstrated by my final 
> query, where the format string z4z1z is used to parse UTC4PDT1EDT.
> {noformat}
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd 
> HH:mm:ss z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 16:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.041 seconds)
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC", "-MM-dd HH:mm:ss 
> z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 16:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.041 seconds)
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd 
> HH:mm:ss z4z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 23:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.047 seconds)
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT1EDT", "-MM-dd 
> HH:mm:ss z4z1z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 20:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.055 seconds)
> 0: jdbc:hive2://localhost:1/default>;
> {noformat}
> So all in all, I don't think the SystemV specifiers EST5EDT or PST8PDT are 
> valid for unix_timestamp(args) at all. When parsed with the zz format 
> string, they will be read as whatever valid timezone is at the final 
> position (effectively EDT and PDT respectively when the valid SystemV TZ 
> specifiers above are used).




[jira] [Updated] (HIVE-20154) Improve unix_timestamp(args) to handle automatic-DST switching timezones

2018-07-12 Thread Vincent Tran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Tran updated HIVE-20154:

Affects Version/s: 1.1.0

> Improve unix_timestamp(args) to handle automatic-DST switching timezones
> 
>
> Key: HIVE-20154
> URL: https://issues.apache.org/jira/browse/HIVE-20154
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Vincent Tran
>Priority: Major
>
> Currently unix_timestamp(args) UDF will only handle static timezone 
> specifiers. It does not recognize SystemV specifiers such as EST5EDT or 
> PST8PDT.
> Based on this experiment, when z is used to parse a TZ string like UTC4PDT 
> (obviously not a valid SystemV specifier), it will parse the time as UTC.
> When zz is used to parse a TZ string like UTC4PDT, it will parse the 
> timestamp in the TZ at the final z position. This is demonstrated by my final 
> query, where the format string z4z1z is used to parse UTC4PDT1EDT.
> {noformat}
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd 
> HH:mm:ss z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 16:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.041 seconds)
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC", "-MM-dd HH:mm:ss 
> z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 16:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.041 seconds)
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "-MM-dd 
> HH:mm:ss z4z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 23:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.047 seconds)
> 0: jdbc:hive2://localhost:1/default>; select 
> from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT1EDT", "-MM-dd 
> HH:mm:ss z4z1z"), "-MM-dd HH:mm:ss ");
> ++--+
> |_c0 |
> ++--+
> | 2018-01-31 20:00:00 Pacific Standard Time  |
> ++--+
> 1 row selected (0.055 seconds)
> 0: jdbc:hive2://localhost:1/default>;
> {noformat}
> So all in all, I don't think the SystemV specifiers EST5EDT or PST8PDT are 
> valid for unix_timestamp(args) at all. When parsed with the zz format 
> string, they will be read as whatever valid timezone is at the final 
> position (effectively EDT and PDT respectively when the valid SystemV TZ 
> specifiers above are used).




[jira] [Commented] (HIVE-20091) Tez: Add security credentials for FileSinkOperator output

2018-07-12 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542072#comment-16542072
 ] 

Hive QA commented on HIVE-20091:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931245/HIVE-20091.08.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14649 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/12562/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12562/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12562/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931245 - PreCommit-HIVE-Build

> Tez: Add security credentials for FileSinkOperator output
> -
>
> Key: HIVE-20091
> URL: https://issues.apache.org/jira/browse/HIVE-20091
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-20091.01.patch, HIVE-20091.02.patch, 
> HIVE-20091.03.patch, HIVE-20091.04.patch, HIVE-20091.05.patch, 
> HIVE-20091.06.patch, HIVE-20091.07.patch, HIVE-20091.08.patch
>
>
> DagUtils needs to add security credentials for the output for the 
> FileSinkOperator.




[jira] [Updated] (HIVE-19375) Bad message: 'transactional'='false' is no longer a valid property and will be ignored

2018-07-12 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19375:
--
   Resolution: Fixed
Fix Version/s: 3.2.0
   Status: Resolved  (was: Patch Available)

committed to branch-3 and master

thanks Jason for the review

> Bad message: 'transactional'='false' is no longer a valid property and will 
> be ignored
> --
>
> Key: HIVE-19375
> URL: https://issues.apache.org/jira/browse/HIVE-19375
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: HIVE-19375.01.patch
>
>
> from {{TransactionalValidationListener.handleCreateTableTransactionalProp()}}
> {noformat}
> if ("false".equalsIgnoreCase(transactional)) {
>   // just drop transactional=false.  For backward compatibility in case someone has scripts
>   // with transactional=false
>   LOG.info("'transactional'='false' is no longer a valid property and will be ignored: " +
>       Warehouse.getQualifiedName(newTable));
>   return;
> }
> {noformat}
> This message is misleading, since with metastore.create.as.acid=true, setting 
> transactional=false is a valid way to make a flat table.




[jira] [Updated] (HIVE-19375) Bad message: 'transactional'='false' is no longer a valid property and will be ignored

2018-07-12 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19375:
--
Summary: Bad message: 'transactional'='false' is no longer a valid property 
and will be ignored  (was: "'transactional'='false' is no longer a valid 
property and will be ignored: )

> Bad message: 'transactional'='false' is no longer a valid property and will 
> be ignored
> --
>
> Key: HIVE-19375
> URL: https://issues.apache.org/jira/browse/HIVE-19375
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
> Attachments: HIVE-19375.01.patch
>
>
> from {{TransactionalValidationListener.handleCreateTableTransactionalProp()}}
> {noformat}
> if ("false".equalsIgnoreCase(transactional)) {
>   // just drop transactional=false.  For backward compatibility in case someone has scripts
>   // with transactional=false
>   LOG.info("'transactional'='false' is no longer a valid property and will be ignored: " +
>       Warehouse.getQualifiedName(newTable));
>   return;
> }
> {noformat}
> This message is misleading, since with metastore.create.as.acid=true, setting 
> transactional=false is a valid way to make a flat table.




[jira] [Updated] (HIVE-19027) Make materializations invalidation cache work with multiple active remote metastores

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19027:
---
Description: 
The main points:
 - Only MVs that use transactional tables and are stored in transactional 
tables can have a time window value of 0. Those are the only MVs that can be 
guaranteed to not be outdated when a query is executed.
 - For MVs that +cannot be outdated+, comparison is based on valid write id 
lists.
 - For MVs that +can be outdated+:
 ** The window for valid outdated MVs can be specified in intervals of 1 minute.
 ** A materialized view is outdated if it was built before that time window and 
any source table has been modified since.

A time window of -1 means to always use the materialized view for rewriting 
without any checks concerning its validity. If a materialized view uses an 
external table, the only way to trigger the rewriting would be to set the 
property to -1, since currently we do not capture for validation purposes 
whether the external source tables have been modified since the MV was created 
or not.

  was:
The main points:
 - Only MVs that use transactional tables and are stored in transactional 
tables can have a time window value of 0. Those are the only MVs that can be 
guaranteed to not be outdated when a query is executed.
 - For MVs that +cannot be outdated+, comparison is based on valid write id 
lists.
 - For MVs that +can be outdated+:
 ** The window for valid outdated MVs can be specified in intervals of 1 minute.
 ** A materialized view is outdated if it was built before that time window and 
any source table has been modified since.

A time window of -1 means to always use the materialized view for rewriting 
without any checks concerning its validity.


> Make materializations invalidation cache work with multiple active remote 
> metastores
> 
>
> Key: HIVE-19027
> URL: https://issues.apache.org/jira/browse/HIVE-19027
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Fix For: 3.1.0
>
> Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, 
> HIVE-19027.03.patch, HIVE-19027.04.patch
>
>
> The main points:
>  - Only MVs that use transactional tables and are stored in transactional 
> tables can have a time window value of 0. Those are the only MVs that can be 
> guaranteed to not be outdated when a query is executed.
>  - For MVs that +cannot be outdated+, comparison is based on valid write id 
> lists.
>  - For MVs that +can be outdated+:
>  ** The window for valid outdated MVs can be specified in intervals of 1 
> minute.
>  ** A materialized view is outdated if it was built before that time window 
> and any source table has been modified since.
> A time window of -1 means to always use the materialized view for rewriting 
> without any checks concerning its validity. If a materialized view uses an 
> external table, the only way to trigger the rewriting would be to set the 
> property to -1, since currently we do not capture for validation purposes 
> whether the external source tables have been modified since the MV was 
> created or not.
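
The time window rules above can be read as the following decision function. This is a simplified sketch under stated assumptions, not the metastore's code: the class `MvFreshnessSketch` and the method `usableForRewriting` are hypothetical, and the write id list comparison is reduced to a single boolean saying whether any source table was modified after the materialized view was built.

```java
public class MvFreshnessSketch {
    /**
     * Decides whether a materialized view may be used for query rewriting.
     *
     * @param windowMinutes            time window in minutes; -1 disables all checks,
     *                                 0 means the view must never be outdated
     * @param builtAtMillis            when the view was built (epoch millis)
     * @param nowMillis                current time (epoch millis)
     * @param sourceModifiedSinceBuild stand-in for the write id list comparison:
     *                                 true if any source table changed after the build
     */
    static boolean usableForRewriting(long windowMinutes,
                                      long builtAtMillis,
                                      long nowMillis,
                                      boolean sourceModifiedSinceBuild) {
        if (windowMinutes == -1) {
            // Always use the view, with no validity checks at all.
            return true;
        }
        if (windowMinutes == 0) {
            // The view cannot be outdated: any modification disqualifies it.
            return !sourceModifiedSinceBuild;
        }
        // Outdated only if it was built before the window started AND
        // some source table has been modified since the build.
        long windowStartMillis = nowMillis - windowMinutes * 60_000L;
        boolean outdated = builtAtMillis < windowStartMillis && sourceModifiedSinceBuild;
        return !outdated;
    }

    public static void main(String[] args) {
        long now = 600_000L; // ten minutes after the epoch, for illustration only
        System.out.println(usableForRewriting(5, 0L, now, true));
        System.out.println(usableForRewriting(5, 400_000L, now, true));
    }
}
```

Under this reading, a view over an external table can only pass with a window of -1, because there is no modification signal to feed into the check.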




[jira] [Updated] (HIVE-20006) Make materializations invalidation cache work with multiple active remote metastores

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20006:
---
Description: 
The main points:
 - Only MVs that use transactional tables and are stored in transactional 
tables can have a time window value of 0. Those are the only MVs that can be 
guaranteed to not be outdated when a query is executed.
 - For MVs that +cannot be outdated+, comparison is based on valid write id 
lists.
 - For MVs that +can be outdated+:
 ** The window for valid outdated MVs can be specified in intervals of 1 minute.
 ** A materialized view is outdated if it was built before that time window and 
any source table has been modified since.

A time window of -1 means to always use the materialized view for rewriting 
without any checks concerning its validity. If a materialized view uses an 
external table, the only way to trigger the rewriting would be to set the 
property to -1, since currently we do not capture for validation purposes 
whether the external source tables have been modified since the MV was created 
or not.

  was:
The main points:
 - Only MVs that use transactional tables and are stored in transactional 
tables can have a time window value of 0. Those are the only MVs that can be 
guaranteed to not be outdated when a query is executed.
 - For MVs that +cannot be outdated+, comparison is based on valid write id 
lists.
 - For MVs that +can be outdated+:
 ** The window for valid outdated MVs can be specified in intervals of 1 minute.
 ** A materialized view is outdated if it was built before that time window and 
any source table has been modified since.

A time window of -1 means to always use the materialized view for rewriting 
without any checks concerning its validity.


> Make materializations invalidation cache work with multiple active remote 
> metastores
> 
>
> Key: HIVE-20006
> URL: https://issues.apache.org/jira/browse/HIVE-20006
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, 
> HIVE-19027.03.patch, HIVE-19027.04.patch, HIVE-20006.01.patch, 
> HIVE-20006.02.patch, HIVE-20006.03.patch, HIVE-20006.04.patch, 
> HIVE-20006.05.patch, HIVE-20006.06.patch, HIVE-20006.patch
>
>
> The main points:
>  - Only MVs that use transactional tables and are stored in transactional 
> tables can have a time window value of 0. Those are the only MVs that can be 
> guaranteed to not be outdated when a query is executed.
>  - For MVs that +cannot be outdated+, comparison is based on valid write id 
> lists.
>  - For MVs that +can be outdated+:
>  ** The window for valid outdated MVs can be specified in intervals of 1 
> minute.
>  ** A materialized view is outdated if it was built before that time window 
> and any source table has been modified since.
> A time window of -1 means to always use the materialized view for rewriting 
> without any checks concerning its validity. If a materialized view uses an 
> external table, the only way to trigger the rewriting would be to set the 
> property to -1, since currently we do not capture for validation purposes 
> whether the external source tables have been modified since the MV was 
> created or not.




[jira] [Commented] (HIVE-20006) Make materializations invalidation cache work with multiple active remote metastores

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542060#comment-16542060
 ] 

Jesus Camacho Rodriguez commented on HIVE-20006:


[~ashutoshc], done. Updated HIVE-19027 too.

> Make materializations invalidation cache work with multiple active remote 
> metastores
> 
>
> Key: HIVE-20006
> URL: https://issues.apache.org/jira/browse/HIVE-20006
> Project: Hive
> Issue Type: Improvement
> Components: Materialized views
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, 
> HIVE-19027.03.patch, HIVE-19027.04.patch, HIVE-20006.01.patch, 
> HIVE-20006.02.patch, HIVE-20006.03.patch, HIVE-20006.04.patch, 
> HIVE-20006.05.patch, HIVE-20006.06.patch, HIVE-20006.patch
>
>
> The main points:
>  - Only MVs that use transactional tables and are stored in transactional 
> tables can have a time window value of 0. Those are the only MVs that can be 
> guaranteed to not be outdated when a query is executed.
>  - For MVs that +cannot be outdated+, comparison is based on valid write id 
> lists.
>  - For MVs that +can be outdated+:
>  ** The window for valid outdated MVs can be specified in intervals of 1 
> minute.
>  ** A materialized view is outdated if it was built before that time window 
> and any source table has been modified since.
> A time window of -1 means to always use the materialized view for rewriting 
> without any checks concerning its validity.




[jira] [Updated] (HIVE-19027) Make materializations invalidation cache work with multiple active remote metastores

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19027:
---
Description: 
The main points:
 - Only MVs that use transactional tables and are stored in transactional 
tables can have a time window value of 0. Those are the only MVs that can be 
guaranteed to not be outdated when a query is executed.
 - For MVs that +cannot be outdated+, comparison is based on valid write id 
lists.
 - For MVs that +can be outdated+:
 ** The window for valid outdated MVs can be specified in intervals of 1 minute.
 ** A materialized view is outdated if it was built before that time window and 
any source table has been modified since.

A time window of -1 means to always use the materialized view for rewriting 
without any checks concerning its validity.

  was:
The main points:
 - Only MVs stored in transactional tables can have a time window value of 0. 
Those are the only MVs that can be guaranteed to not be outdated when a query 
is executed; if we use custom storage handlers to store the materialized view, 
we cannot make any promises.
 - For MVs that +cannot be outdated+, we do not check the metastore. Instead, 
comparison is based on valid write id lists.
 - For MVs that +can be outdated+, we still rely on the invalidation cache.
 ** The window for valid outdated MVs can be specified in intervals of 1 minute 
(less than that, it is difficult to have any guarantees about whether the MV is 
actually outdated by less than a minute or not).
 ** The async loading is done every interval / 2 (or probably better, we can 
make it configurable).


> Make materializations invalidation cache work with multiple active remote 
> metastores
> 
>
> Key: HIVE-19027
> URL: https://issues.apache.org/jira/browse/HIVE-19027
> Project: Hive
> Issue Type: Improvement
> Components: Materialized views
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Fix For: 3.1.0
>
> Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, 
> HIVE-19027.03.patch, HIVE-19027.04.patch
>
>
> The main points:
>  - Only MVs that use transactional tables and are stored in transactional 
> tables can have a time window value of 0. Those are the only MVs that can be 
> guaranteed to not be outdated when a query is executed.
>  - For MVs that +cannot be outdated+, comparison is based on valid write id 
> lists.
>  - For MVs that +can be outdated+:
>  ** The window for valid outdated MVs can be specified in intervals of 1 
> minute.
>  ** A materialized view is outdated if it was built before that time window 
> and any source table has been modified since.
> A time window of -1 means to always use the materialized view for rewriting 
> without any checks concerning its validity.




[jira] [Updated] (HIVE-20006) Make materializations invalidation cache work with multiple active remote metastores

2018-07-12 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20006:
---
Description: 
The main points:
 - Only MVs that use transactional tables and are stored in transactional 
tables can have a time window value of 0. Those are the only MVs that can be 
guaranteed to not be outdated when a query is executed.
 - For MVs that +cannot be outdated+, comparison is based on valid write id 
lists.
 - For MVs that +can be outdated+:
 ** The window for valid outdated MVs can be specified in intervals of 1 minute.
 ** A materialized view is outdated if it was built before that time window and 
any source table has been modified since.

A time window of -1 means to always use the materialized view for rewriting 
without any checks concerning its validity.

  was:
The main points:
 - Only MVs stored in transactional tables can have a time window value of 0. 
Those are the only MVs that can be guaranteed to not be outdated when a query 
is executed; if we use custom storage handlers to store the materialized view, 
we cannot make any promises.
 - For MVs that +cannot be outdated+, we do not check the metastore. Instead, 
comparison is based on valid write id lists.
 - For MVs that +can be outdated+, we still rely on the invalidation cache.
 ** The window for valid outdated MVs can be specified in intervals of 1 minute 
(less than that, it is difficult to have any guarantees about whether the MV is 
actually outdated by less than a minute or not).
 ** The async loading is done every interval / 2 (or probably better, we can 
make it configurable).


> Make materializations invalidation cache work with multiple active remote 
> metastores
> 
>
> Key: HIVE-20006
> URL: https://issues.apache.org/jira/browse/HIVE-20006
> Project: Hive
> Issue Type: Improvement
> Components: Materialized views
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Attachments: HIVE-19027.01.patch, HIVE-19027.02.patch, 
> HIVE-19027.03.patch, HIVE-19027.04.patch, HIVE-20006.01.patch, 
> HIVE-20006.02.patch, HIVE-20006.03.patch, HIVE-20006.04.patch, 
> HIVE-20006.05.patch, HIVE-20006.06.patch, HIVE-20006.patch
>
>
> The main points:
>  - Only MVs that use transactional tables and are stored in transactional 
> tables can have a time window value of 0. Those are the only MVs that can be 
> guaranteed to not be outdated when a query is executed.
>  - For MVs that +cannot be outdated+, comparison is based on valid write id 
> lists.
>  - For MVs that +can be outdated+:
>  ** The window for valid outdated MVs can be specified in intervals of 1 
> minute.
>  ** A materialized view is outdated if it was built before that time window 
> and any source table has been modified since.
> A time window of -1 means to always use the materialized view for rewriting 
> without any checks concerning its validity.




[jira] [Commented] (HIVE-20097) Convert standalone-metastore to a submodule

2018-07-12 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542050#comment-16542050
 ] 

Vihang Karajgaonkar commented on HIVE-20097:


+1 

> Convert standalone-metastore to a submodule
> ---
>
> Key: HIVE-20097
> URL: https://issues.apache.org/jira/browse/HIVE-20097
> Project: Hive
> Issue Type: Sub-task
> Components: Hive, Metastore, Standalone Metastore
> Affects Versions: 3.1.0, 4.0.0
> Reporter: Alexander Kolbasov
> Assignee: Alexander Kolbasov
> Priority: Major
> Attachments: HIVE-20097.01.patch, HIVE-20097.02.patch, 
> HIVE-20097.03.patch, HIVE-20097.04.patch, HIVE-20097.05.patch
>
>
> This is a subtask to stage HIVE-17751 changes into several smaller phases.
> The first part is moving existing code in hive-standalone-metastore to a 
> sub-module.




[jira] [Updated] (HIVE-20019) Ban commons-logging and log4j

2018-07-12 Thread Prasanth Jayachandran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20019:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> Ban commons-logging and log4j
> -
>
> Key: HIVE-20019
> URL: https://issues.apache.org/jira/browse/HIVE-20019
> Project: Hive
> Issue Type: Sub-task
> Components: Logging
> Affects Versions: 4.0.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20019.1.patch, HIVE-20019.2.patch, 
> HIVE-20019.3.patch, HIVE-20019.4.patch
>
>
> Still seeing several references to commons-logging. We should move all 
> classes to slf4j instead. 




[jira] [Updated] (HIVE-20019) Ban commons-logging and log4j

2018-07-12 Thread Prasanth Jayachandran (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20019:
-
Summary: Ban commons-logging and log4j  (was: Remove commons-logging and 
move to slf4j)

> Ban commons-logging and log4j
> -
>
> Key: HIVE-20019
> URL: https://issues.apache.org/jira/browse/HIVE-20019
> Project: Hive
> Issue Type: Sub-task
> Components: Logging
> Affects Versions: 4.0.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Major
> Attachments: HIVE-20019.1.patch, HIVE-20019.2.patch, 
> HIVE-20019.3.patch, HIVE-20019.4.patch
>
>
> Still seeing several references to commons-logging. We should move all 
> classes to slf4j instead. 



