date:20181126

[jira] [Updated] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb

2018-11-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20971:
--
Attachment: HIVE-20971.2.patch

> TestJdbcWithDBTokenStore[*] should both use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
> ---
>
> Key: HIVE-20971
> URL: https://issues.apache.org/jira/browse/HIVE-20971
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20971.2.patch, HIVE-20971.patch
>
>
> The original intent was to use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700013#comment-16700013
 ] 

Vihang Karajgaonkar commented on HIVE-20740:


Finally, a green run. [~asherman], I updated the RB with the latest patch. There 
are no real code changes since your last review on the RB; I only rebased and 
had to work through many unrelated failures on precommit.

> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, 
> HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, 
> HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch, 
> HIVE-20740.14.patch
>
>
> The ObjectStore#setConf method has a global lock which can block other 
> clients in concurrent workloads.
> {code}
>   @Override
>   @SuppressWarnings("nls")
>   public void setConf(Configuration conf) {
>     // Although an instance of ObjectStore is accessed by one thread, there may
>     // be many threads with ObjectStore instances. So the static variables
>     // pmf and prop need to be protected with locks.
>     pmfPropLock.lock();
>     try {
>       isInitialized = false;
>       this.conf = conf;
>       this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf,
>           ConfVars.HIVE_TXN_STATS_ENABLED);
>       configureSSL(conf);
>       Properties propsFromConf = getDataSourceProps(conf);
>       boolean propsChanged = !propsFromConf.equals(prop);
>       if (propsChanged) {
>         if (pmf != null) {
>           clearOutPmfClassLoaderCache(pmf);
>           if (!forTwoMetastoreTesting) {
>             // close the underlying connection pool to avoid leaks
>             pmf.close();
>           }
>         }
>         pmf = null;
>         prop = null;
>       }
>       assert(!isActiveTransaction());
>       shutdown();
>       // Always want to re-create pm as we don't know if it were created by the
>       // most recent instance of the pmf
>       pm = null;
>       directSql = null;
>       expressionProxy = null;
>       openTrasactionCalls = 0;
>       currentTransaction = null;
>       transactionStatus = TXN_STATUS.NO_STATE;
>       initialize(propsFromConf);
>       String partitionValidationRegex =
>           MetastoreConf.getVar(this.conf, ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>       if (partitionValidationRegex != null && !partitionValidationRegex.isEmpty()) {
>         partitionValidationPattern = Pattern.compile(partitionValidationRegex);
>       } else {
>         partitionValidationPattern = null;
>       }
>       // Note, if metrics have not been initialized this will return null, which means we aren't
>       // using metrics.  Thus we should always check whether this is non-null before using.
>       MetricRegistry registry = Metrics.getRegistry();
>       if (registry != null) {
>         directSqlErrors = Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS);
>       }
>       this.batchSize = MetastoreConf.getIntVar(conf, ConfVars.RAWSTORE_PARTITION_BATCH_SIZE);
>       if (!isInitialized) {
>         throw new RuntimeException(
>             "Unable to create persistence manager. Check dss.log for details");
>       } else {
>         LOG.debug("Initialized ObjectStore");
>       }
>     } finally {
>       pmfPropLock.unlock();
>     }
>   }
> {code}
> {{pmfPropLock}} is a static lock, so while one thread holds it, every other 
> new connection to HMS that is trying to instantiate ObjectStore is blocked. 
> We should either remove the lock or reduce its scope so that it is held for 
> only a very short time.
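
One way to picture the "reduce the scope of the lock" option is the sketch below. It is not the HIVE-20740 patch: the class NarrowLockSketch, its fields, and the placeholder factory object are all hypothetical stand-ins, and the only idea it illustrates is that per-instance work can run without the lock while the check-and-replace of the shared static state stays guarded.

```java
import java.util.Properties;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of a narrowed lock scope; not the actual ObjectStore code. */
class NarrowLockSketch {
  // Shared by every instance, like the static pmf/prop pair in the issue.
  private static final Lock sharedLock = new ReentrantLock();
  private static Properties sharedProps;  // stands in for "prop"
  private static Object sharedFactory;    // stands in for "pmf"
  static int factoryCreations = 0;        // only here so the sketch can be checked

  // Per-instance state; touched by one thread, so no global lock is needed.
  private Properties conf;
  private Object factory;
  private boolean initialized;

  public void setConf(Properties newConf) {
    // Per-instance work runs outside the lock.
    initialized = false;
    this.conf = newConf;
    Properties wanted = new Properties();
    wanted.putAll(newConf);

    // Only the check-and-replace of the static state is guarded.
    sharedLock.lock();
    try {
      if (!wanted.equals(sharedProps)) {
        sharedProps = wanted;
        sharedFactory = new Object();  // placeholder for building a new factory
        factoryCreations++;
      }
      factory = sharedFactory;
    } finally {
      sharedLock.unlock();
    }

    // The rest of the per-instance initialization is lock-free again.
    initialized = (factory != null);
  }

  public boolean isInitialized() {
    return initialized;
  }

  Object factory() {
    return factory;
  }
}
```

In this sketch two instances configured with equal properties end up sharing one factory, and the lock is held only for the comparison and the swap. Whether the real method can be split this way depends on which of its calls touch the static fields, which the issue text does not spell out.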




[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Ashutosh Bapat (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Attachment: HIVE-20794.07
Status: Patch Available  (was: In Progress)

Of the two failures reported, the first, 
TestCliDriver.testCliDriver[vector_groupby_reduce], also fails in previous 
runs. The diff there is caused by a floating-point rounding error and is 
unrelated to the changes in this patch.

I ran the test for the second failure locally and it did not fail for me. 
Hence re-attaching the patch to kick off ptest again.

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, 
> HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06, HIVE-20794.07, 
> HIVE-20794.07
>
>
> Right now, multiple metastore services can be specified in the 
> hive.metastore.uris configuration, but that list is static and cannot be 
> modified dynamically. Use ZooKeeper for dynamic service discovery of the 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The ZooKeeper-related code (for service discovery) accesses ZooKeeper 
> parameters directly from HiveConf. The class is changed so that it can be 
> used for both HiveServer2 and the Metastore server and works with both 
> configurations. The following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper:
>  # startZookeeperClient
>  # addServerInstanceToZooKeeper
>  # removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify the 
> ZooKeeper quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper", the URIs are 
> used as the ZooKeeper quorum. When it is empty, the URIs are used to 
> locate the metastore directly.
>  # Here is the list of HiveServer2's parameters and their proposed metastore 
> conf counterparts. It looks odd that the Metastore-related configurations do 
> not have their macros start with METASTORE, but with THRIFT. I have just 
> followed the naming convention used for other parameters.
>  ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
>  ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
>  ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT 
> (hive.metastore.zookeeper.connection.timeout)
>  ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
>  ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
>  # The additional configuration THRIFT_BIND_HOST specifies the host address 
> to bind the Metastore service to. Right now the Metastore binds to *, i.e. 
> all addresses, so it does not know which of those addresses it should add 
> to ZooKeeper. THRIFT_BIND_HOST solves that problem. When this 
> configuration is specified, the metastore server binds to that address and 
> also adds it to ZooKeeper if the dynamic service discovery mode is ZooKeeper.
> The following Hive ZK configurations seem to be related to managing locks 
> and seem irrelevant for MS ZK:
>  # HIVE_ZOOKEEPER_SESSION_TIMEOUT
>  # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
>  # startMetaStore should also register the instance with ZooKeeper, when 
> configured.
>  # When a metastore server is shut down, it should deregister itself from 
> ZooKeeper, when configured.
>  # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When the service discovery mode is zookeeper, we fetch the metastore URIs 
> from the specified ZooKeeper and treat them as if they were specified in 
> THRIFT_URIS, i.e. use the existing mechanisms to choose a metastore server 
> to connect to and establish a connection.
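
The client-side behaviour described in the issue can be sketched roughly as follows. This is only an illustration under assumptions: MetastoreUriResolverSketch, resolve, and the zkLookup function are hypothetical names, zkLookup stands in for whatever call actually reads the registered instances from ZooKeeper, and the case-insensitive match on "zookeeper" is a guess rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of the client-side URI choice; not Hive's actual code. */
class MetastoreUriResolverSketch {
  /**
   * @param discoveryMode value of hive.metastore.service.discovery.mode
   *                      ("zookeeper" or empty)
   * @param thriftUris    value of hive.metastore.uris, comma separated
   * @param zkLookup      assumed helper that asks the given ZooKeeper quorum
   *                      for the registered metastore URIs
   */
  static List<String> resolve(String discoveryMode, String thriftUris,
      Function<String, List<String>> zkLookup) {
    if ("zookeeper".equalsIgnoreCase(discoveryMode)) {
      // The configured URIs name the ZooKeeper quorum; the metastore URIs
      // are whatever instances have registered themselves there.
      return new ArrayList<>(zkLookup.apply(thriftUris));
    }
    // Empty mode: the configured URIs point at the metastore servers directly.
    List<String> uris = new ArrayList<>();
    for (String uri : thriftUris.split(",")) {
      String trimmed = uri.trim();
      if (!trimmed.isEmpty()) {
        uris.add(trimmed);
      }
    }
    return uris;
  }
}
```

Either branch yields a plain list of URIs, so the existing logic that picks a server and connects would not need to know where the list came from, which matches the "treat those as if they were specified in THRIFT_URIS" wording of the issue.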




[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Ashutosh Bapat (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Status: In Progress  (was: Patch Available)

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, 
> HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06, HIVE-20794.07




[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699958#comment-16699958
 ] 

Hive QA commented on HIVE-20794:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949603/HIVE-20794.07

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 15627 tests executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=196)
[druidmini_dynamic_partition.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_test_insert.q,druidkafkamini_delimited.q]
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
[druidmini_masking.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=61)
org.apache.hive.jdbc.TestActivePassiveHA.testConnectionActivePassiveHAServiceDiscovery (batchId=259)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15068/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15068/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15068/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949603 - PreCommit-HIVE-Build

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, 
> HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06, HIVE-20794.07

[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699950#comment-16699950
 ] 

Hive QA commented on HIVE-20794:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 52s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 32s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 15s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  5s{color} | {color:blue} standalone-metastore/metastore-server in master has 185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 43s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 35s{color} | {color:blue} service in master has 48 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 43s{color} | {color:blue} itests/util in master has 48 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m  7s{color} | {color:green} The patch standalone-metastore passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m  6s{color} | {color:green} The patch metastore-common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 15s{color} | {color:green} The patch common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m  6s{color} | {color:green} The patch metastore-server passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} ql: The patch generated 0 new + 17 unchanged - 4 fixed = 17 total (was 21) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} The patch service passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 15s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} The patch util passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 10m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
|

[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Ashutosh Bapat (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Attachment: HIVE-20794.07
Status: Patch Available  (was: In Progress)

Fixed the Findbugs warning from the last run and updated the PR.

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, 
> HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06, HIVE-20794.07




[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Ashutosh Bapat (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Status: In Progress  (was: Patch Available)

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, 
> HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06




[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699886#comment-16699886
 ] 

Hive QA commented on HIVE-20740:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949566/HIVE-20740.14.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15543 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15066/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15066/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15066/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949566 - PreCommit-HIVE-Build

> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, 
> HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, 
> HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch, 
> HIVE-20740.14.patch
>
>
> The ObjectStore#setConf method has a global lock which can block other 
> clients in concurrent workloads.
> {code}
> @Override
> @SuppressWarnings("nls")
> public void setConf(Configuration conf) {
>   // Although an instance of ObjectStore is accessed by one thread, there may
>   // be many threads with ObjectStore instances. So the static variables
>   // pmf and prop need to be protected with locks.
>   pmfPropLock.lock();
>   try {
>     isInitialized = false;
>     this.conf = conf;
>     this.areTxnStatsSupported =
>         MetastoreConf.getBoolVar(conf, ConfVars.HIVE_TXN_STATS_ENABLED);
>     configureSSL(conf);
>     Properties propsFromConf = getDataSourceProps(conf);
>     boolean propsChanged = !propsFromConf.equals(prop);
>     if (propsChanged) {
>       if (pmf != null) {
>         clearOutPmfClassLoaderCache(pmf);
>         if (!forTwoMetastoreTesting) {
>           // close the underlying connection pool to avoid leaks
>           pmf.close();
>         }
>       }
>       pmf = null;
>       prop = null;
>     }
>     assert(!isActiveTransaction());
>     shutdown();
>     // Always want to re-create pm as we don't know if it were created by the
>     // most recent instance of the pmf
>     pm = null;
>     directSql = null;
>     expressionProxy = null;
>     openTrasactionCalls = 0;
>     currentTransaction = null;
>     transactionStatus = TXN_STATUS.NO_STATE;
>     initialize(propsFromConf);
>     String partitionValidationRegex =
>         MetastoreConf.getVar(this.conf, ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>     if (partitionValidationRegex != null && !partitionValidationRegex.isEmpty()) {
>       partitionValidationPattern = Pattern.compile(partitionValidationRegex);
>     } else {
>       partitionValidationPattern = null;
>     }
>     // Note, if metrics have not been initialized this will return null, which
>     // means we aren't using metrics. Thus we should always check whether this
>     // is non-null before using.
>     MetricRegistry registry = Metrics.getRegistry();
>     if (registry != null) {
>       directSqlErrors = Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS);
>     }
>     this.batchSize = MetastoreConf.getIntVar(conf, ConfVars.RAWSTORE_PARTITION_BATCH_SIZE);
>     if (!isInitialized) {
>       throw new RuntimeException(
>           "Unable to create persistence manager. Check dss.log for details");
>     } else {
>       LOG.debug("Initialized ObjectStore");
>     }
>   } finally {
>     pmfPropLock.unlock();
>   }
> }
> {code}
> The {{pmfPropLock}} is a static object, so while one thread holds it, every 
> other new connection to HMS that is trying to instantiate an ObjectStore is 
> blocked. We should either remove the lock or reduce its scope so that it is 
> held for a very small amount of time.
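
One way to picture "reducing the scope of the lock" is the sketch below. It is a minimal illustration and not the patch attached to this issue: `SharedFactoryHolder` and its fields are hypothetical stand-ins for the static `pmf`/`prop` pair, and a plain `Object` plays the role of the shared factory. The lock is held only while the shared static state is compared and swapped, and it is released in a `finally` block on every path.

```java
import java.util.Properties;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not Hive code: stands in for the static pmf/prop pair
// that pmfPropLock protects.
final class SharedFactoryHolder {
  private static final Lock LOCK = new ReentrantLock();
  private static Properties currentProps; // shared static state
  private static Object factory;          // stand-in for the shared factory

  // Holds the lock only while checking and, if needed, replacing the shared
  // state. Per-instance setup is expected to happen in the caller, after this
  // method has returned and the lock has been released.
  static Object getOrRebuild(Properties wanted) {
    LOCK.lock();
    try {
      if (factory == null || !wanted.equals(currentProps)) {
        currentProps = (Properties) wanted.clone();
        factory = new Object();
      }
      return factory;
    } finally {
      LOCK.unlock(); // released on every path, including exceptions
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("url", "jdbc:example:a");
    Object first = getOrRebuild(props);
    Object second = getOrRebuild(props);
    System.out.println("reused: " + (first == second));

    props.setProperty("url", "jdbc:example:b");
    System.out.println("rebuilt: " + (getOrRebuild(props) != first));
  }
}
```

With this shape, threads that only need their own per-instance state contend for the lock just long enough to read the shared reference, instead of for the whole of `setConf`.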




[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699869#comment-16699869
 ] 

Hive QA commented on HIVE-20740:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
4s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
46s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
21s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 608 
unchanged - 0 fixed = 609 total (was 608) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
15s{color} | {color:red} standalone-metastore/metastore-server generated 1 new 
+ 183 unchanged - 2 fixed = 184 total (was 185) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:standalone-metastore/metastore-server |
|  |  
org.apache.hadoop.hive.metastore.PersistenceManagerProvider.updatePmfProperties(Configuration)
 does not release lock on all paths  At PersistenceManagerProvider.java:[line 
152] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15066/dev-support/hive-personality.sh
 |
| git revision | master / 56625f3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15066/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15066/yetus/new-findbugs-standalone-metastore_metastore-server.html
 |
| modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15066/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>

[jira] [Updated] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-26 Thread Teddy Choi (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20954:
--
Fix Version/s: 4.0.0

Pushed to master

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, 
> HIVE-20954.3.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkObjectHashOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | partitionColumnNums: [16] |
> | valueColumnNums: [14]  |
> ++
> |  Explain   |
> ++
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkLongOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | valueColumnNums: [14]  |
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> | Execution mode: vectorized, llap   |
> {code}
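
As a rough illustration of why the choice of hash function matters for row distribution, the sketch below compares two ways of assigning bigint keys to partitions. It is an assumption-laden toy and does not reproduce what VectorReduceSinkObjectHashOperator or VectorReduceSinkLongOperator actually compute: `PartitionSkewDemo` and its methods are hypothetical, and the mixing step is simply the 64-bit finalizer from MurmurHash3.

```java
// Hypothetical illustration, not Hive's hash functions.
final class PartitionSkewDemo {

  // Partition by the raw key value: keys that share a common factor with the
  // partition count collapse onto a few partitions.
  static int partitionByIdentity(long key, int numPartitions) {
    return (int) Math.floorMod(key, (long) numPartitions);
  }

  // Partition after a 64-bit mixing step (the MurmurHash3 finalizer), which
  // spreads structured keys across partitions.
  static int partitionByMixedHash(long key, int numPartitions) {
    long h = key;
    h ^= h >>> 33;
    h *= 0xff51afd7ed558ccdL;
    h ^= h >>> 33;
    h *= 0xc4ceb9fe1a85ec53L;
    h ^= h >>> 33;
    return (int) Math.floorMod(h, (long) numPartitions);
  }

  // Size of the fullest partition for the keys 0, stride, 2 * stride, ...
  static int maxPartitionSize(int rows, long stride, int numPartitions, boolean mixed) {
    int[] counts = new int[numPartitions];
    int max = 0;
    for (int i = 0; i < rows; i++) {
      long key = i * stride;
      int p = mixed
          ? partitionByMixedHash(key, numPartitions)
          : partitionByIdentity(key, numPartitions);
      max = Math.max(max, ++counts[p]);
    }
    return max;
  }

  public static void main(String[] args) {
    // Keys that are multiples of 16, spread over 16 partitions.
    System.out.println("identity: " + maxPartitionSize(1600, 16, 16, false));
    System.out.println("mixed:    " + maxPartitionSize(1600, 16, 16, true));
  }
}
```

With the identity-style partitioning every key that is a multiple of 16 lands in partition 0, so a single reducer would receive all rows, while the mixed hash spreads the same keys over the partitions.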




[jira] [Updated] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-26 Thread Teddy Choi (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20954:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, 
> HIVE-20954.3.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkObjectHashOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | partitionColumnNums: [16] |
> | valueColumnNums: [14]  |
> ++
> |  Explain   |
> ++
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkLongOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | valueColumnNums: [14]  |
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> | Execution mode: vectorized, llap   |
> {code}




[jira] [Commented] (HIVE-20955) Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException

2018-11-26 Thread slim bouguerra (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699828#comment-16699828
 ] 

slim bouguerra commented on HIVE-20955:
---

[~vgarg] +1 works fine, we can add the tests later. Please merge the patch 
when you can.

 

> Calcite Rule HiveExpandDistinctAggregatesRule seems throwing 
> IndexOutOfBoundsException
> --
>
> Key: HIVE-20955
> URL: https://issues.apache.org/jira/browse/HIVE-20955
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: slim bouguerra
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20955.1.patch
>
>
>  
> Adding the following query to the Druid test 
> ql/src/test/queries/clientpositive/druidmini_expressions.q
> {code}
> select count(distinct `__time`, cint) from (select * from 
> druid_table_alltypesorc) as src;
> {code}
> leads to error
> {code}
> 2018-11-21T07:36:39,449 ERROR [main] QTestUtil: Client execution failed with error code = 4 running "
> {code}
> with exception stack 
> {code}
> 2018-11-21T07:36:39,443 ERROR [ecd48683-0286-4cb4-b0ad-e150fab51038 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.IndexOutOfBoundsException: index (1) must be less than size (1)
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:293)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:41)
>  ~[guava-19.0.jar:?]
>  at 
> org.apache.calcite.rel.metadata.RelMdColumnOrigins.getColumnOrigins(RelMdColumnOrigins.java:77)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins_$(Unknown Source) 
> ~[?:?]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins(Unknown Source) 
> ~[?:?]
>  at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getColumnOrigins(RelMetadataQuery.java:345)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveExpandDistinctAggregatesRule.onMatch(HiveExpandDistinctAggregatesRule.java:168)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:315)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2363)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2314)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2031)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1780)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1680)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1439)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:478)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12296)
>

[jira] [Commented] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699825#comment-16699825
 ] 

Hive QA commented on HIVE-20936:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949565/HIVE-20936.6.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15539 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15065/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15065/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15065/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949565 - PreCommit-HIVE-Build

> Allow the Worker thread in the metastore to run outside of it
> -
>
> Key: HIVE-20936
> URL: https://issues.apache.org/jira/browse/HIVE-20936
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20936.1.patch, HIVE-20936.2.patch, 
> HIVE-20936.3.patch, HIVE-20936.4.patch, HIVE-20936.5.patch, HIVE-20936.6.patch
>
>
> Currently the Worker thread in the metastore is bound to the metastore, 
> mainly because of the TxnHandler that it has. This thread runs some map 
> reduce jobs, which may not be an option wherever the metastore is 
> running. A solution for this can be to run this thread in HS2 depending on a 
> flag.
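
The "depending on a flag" idea might look roughly like the sketch below. Everything in it is an assumption made for illustration: the property key `example.compactor.worker.host`, the service names "metastore" and "hs2", and the `WorkerPlacement` class are invented here and are not taken from the patch.

```java
import java.util.Properties;

// Hypothetical sketch: the property key and service names are invented.
final class WorkerPlacement {
  // Hypothetical flag naming the service that should host the Worker thread.
  static final String WORKER_HOST_KEY = "example.compactor.worker.host";

  // True if the service calling this method is the one configured to run the
  // Worker. Defaults to the metastore, i.e. the behaviour before any flag.
  static boolean shouldRunWorker(Properties conf, String thisService) {
    String target = conf.getProperty(WORKER_HOST_KEY, "metastore");
    return target.trim().equalsIgnoreCase(thisService);
  }

  // Starts the worker as a daemon thread only in the configured service and
  // returns it, or returns null when this service should not run it.
  static Thread startWorkerIfConfigured(Properties conf, String thisService, Runnable work) {
    if (!shouldRunWorker(conf, thisService)) {
      return null;
    }
    Thread worker = new Thread(work, "compactor-worker-" + thisService);
    worker.setDaemon(true);
    worker.start();
    return worker;
  }

  public static void main(String[] args) throws InterruptedException {
    Properties conf = new Properties();
    conf.setProperty(WORKER_HOST_KEY, "hs2");
    Thread inMetastore = startWorkerIfConfigured(conf, "metastore", () -> { });
    Thread inHs2 = startWorkerIfConfigured(conf, "hs2", () -> System.out.println("worker running in hs2"));
    System.out.println("started in metastore: " + (inMetastore != null));
    if (inHs2 != null) {
      inHs2.join();
    }
  }
}
```

Both services would call the same check at startup, so exactly one of them starts the thread for a given configuration.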




[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-26 Thread Teddy Choi (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699819#comment-16699819
 ] 

Teddy Choi commented on HIVE-20954:
---

The failures were not reproduced, and the new 
TestTxnCommands2WithSplitUpdateAndVectorization failures seem unrelated. I 
will push it to master. Thanks.

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, 
> HIVE-20954.3.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkObjectHashOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | partitionColumnNums: [16] |
> | valueColumnNums: [14]  |
> ++
> |  Explain   |
> ++
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkLongOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | valueColumnNums: [14]  |
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> | Execution mode: vectorized, llap   |
> {code}




[jira] [Commented] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699814#comment-16699814
 ] 

Hive QA commented on HIVE-20936:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
50s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
15s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
2s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
42s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
26s{color} | {color:blue} hcatalog/streaming in master has 11 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
24s{color} | {color:blue} streaming in master has 2 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
28s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 2 new + 639 unchanged - 5 
fixed = 641 total (was 644) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
17s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 173 
unchanged - 0 fixed = 174 total (was 173) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 105 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 
53s{color} | {color:red} ql generated 3 new + 2311 unchanged - 1 fixed = 2314 
total (was 2312) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
49s{color} | {color:red} standalone-metastore_metastore-common generated 1 new 
+ 16 unchanged - 0 fixed = 17 total (was 16) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 39s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Field MetaStoreCompactorThread.threadId masks field in superclass 
org.apache.hadoop.hive.ql.txn.compactor.CompactorThread  In 
MetaStoreCompactorThread.java |
|  |  Field MetaStoreCompactorThread.rs masks field in superclass 
org.apache.hadoop.hive.ql.txn.compactor.CompactorThread  In 
MetaStoreCompactorThread.java |
|  |

[jira] [Commented] (HIVE-20930) VectorCoalesce in FILTER mode doesn't take effect

2018-11-26 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699810#comment-16699810
 ] 

Ashutosh Chauhan commented on HIVE-20930:
-

+1

> VectorCoalesce in FILTER mode doesn't take effect
> -
>
> Key: HIVE-20930
> URL: https://issues.apache.org/jira/browse/HIVE-20930
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
> Attachments: HIVE-20930.1.patch, HIVE-20930.2.patch, 
> HIVE-20930.3.patch
>
>
> HIVE-20277 fixed vectorized case expressions for FILTER, but VectorCoalesce 
> is still not fixed.




[jira] [Commented] (HIVE-20955) Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException

2018-11-26 Thread slim bouguerra (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699808#comment-16699808
 ] 

slim bouguerra commented on HIVE-20955:
---

[~vgarg] worked for me 

[https://github.com/b-slim/hive/commit/34ed4421dc674cd0c487cc20eb6c47cd6f629495]

{code}

cd itests/qtest

mvn clean test -Dtest=TestMiniDruidCliDriver -Djava.net.preferIPv4Stack=true 
-Dtest.output.overwrite=true -s ~/.m2/settings.xml 
-Dqfile=druidmini_expressions.q

{code}

> Calcite Rule HiveExpandDistinctAggregatesRule seems throwing 
> IndexOutOfBoundsException
> --
>
> Key: HIVE-20955
> URL: https://issues.apache.org/jira/browse/HIVE-20955
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: slim bouguerra
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20955.1.patch
>
>
>  
> Adding the following query to the Druid test 
> ql/src/test/queries/clientpositive/druidmini_expressions.q
> {code}
> select count(distinct `__time`, cint) from (select * from 
> druid_table_alltypesorc) as src;
> {code}
> leads to error \{code} 2018-11-21T07:36:39,449 ERROR [main] QTestUtil: Client 
> execution failed with error code = 4 running "\{code}
> with exception stack 
> {code}
> 2018-11-21T07:36:39,443 ERROR [ecd48683-0286-4cb4-b0ad-e150fab51038 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.IndexOutOfBoundsException: index (1) must be less than size (1)
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:293)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:41)
>  ~[guava-19.0.jar:?]
>  at 
> org.apache.calcite.rel.metadata.RelMdColumnOrigins.getColumnOrigins(RelMdColumnOrigins.java:77)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins_$(Unknown Source) 
> ~[?:?]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins(Unknown Source) 
> ~[?:?]
>  at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getColumnOrigins(RelMetadataQuery.java:345)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveExpandDistinctAggregatesRule.onMatch(HiveExpandDistinctAggregatesRule.java:168)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:315)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2363)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2314)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2031)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1780)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1680)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1439)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
>

[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699804#comment-16699804
 ] 

Vihang Karajgaonkar commented on HIVE-20860:


Patch merged into master. Thanks [~jcamachorodriguez] for the review.

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  HIVE-20860.01.patch, hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. It looks like there is some case where 
> there is an additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Updated] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-20860:
---
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  HIVE-20860.01.patch, hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. It looks like there is some case where 
> there is an additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699791#comment-16699791
 ] 

Hive QA commented on HIVE-20440:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949543/HIVE-20440.14.patch.txt

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15548 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_aggregate]
 (batchId=162)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15064/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15064/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15064/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949543 - PreCommit-HIVE-Build

> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, 
> HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, 
> HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, 
> HIVE-20440.12.patch, HIVE-20440.13.patch, HIVE-20440.14.patch.txt
>
>
> Enhance the SmallTableCache to use a Guava cache with soft references, so that 
> entries are evicted when there is memory pressure.
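
For illustration only, the idea in the description can be sketched with the JDK's java.lang.ref.SoftReference, used here in place of Guava so that the example has no dependencies. The class name SoftValueCache and its methods are hypothetical and are not taken from the HIVE-20440 patch. The point shown is that values held only through soft references may be reclaimed by the garbage collector under memory pressure, after which the next lookup reloads them.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch, not the SmallTableCache implementation.
class SoftValueCache<K, V> {
  // Values are wrapped in SoftReference so the GC may clear them
  // when memory is tight; keys stay strongly referenced.
  private final ConcurrentHashMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

  // Returns the cached value, or loads and caches it if it is missing
  // or has been cleared by the garbage collector.
  V get(K key, Function<K, V> loader) {
    SoftReference<V> ref = map.get(key);
    V value = (ref == null) ? null : ref.get();
    if (value == null) {
      // Two threads may both load here; the last put wins, which is
      // acceptable for a sketch but may matter for expensive loads.
      value = loader.apply(key);
      map.put(key, new SoftReference<>(value));
    }
    return value;
  }

  // Number of entries, including ones whose value was already cleared.
  int size() {
    return map.size();
  }
}
```

A caller would use it as cache.get(tableName, name -> loadTable(name)), where loadTable is whatever (hypothetical) function builds the small table. As long as the caller keeps a strong reference to the returned value, the entry cannot be cleared.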




[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699755#comment-16699755
 ] 

Hive QA commented on HIVE-20440:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
49s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} ql: The patch generated 0 new + 54 unchanged - 2 
fixed = 54 total (was 56) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
58s{color} | {color:green} ql generated 0 new + 2311 unchanged - 1 fixed = 2311 
total (was 2312) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} hive-unit in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15064/dev-support/hive-personality.sh
 |
| git revision | master / e986fc5 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15064/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| modules | C: ql itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15064/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch,

[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699744#comment-16699744
 ] 

Vihang Karajgaonkar commented on HIVE-20860:


Created HIVE-20972 to enable the test again.

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  HIVE-20860.01.patch, hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. It looks like there is some case where 
> there is an additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Commented] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699737#comment-16699737
 ] 

Hive QA commented on HIVE-20971:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949541/HIVE-20971.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 15540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18_multi_distinct]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join7] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cli_print_escape_crlf] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[correlationoptimizer15] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_alter_list_bucketing_table1]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_escape] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_3] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input_part6] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_4] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nonreserved_keywords_insert_into1]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge9] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parallel] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_struct_type_vectorization]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_condition_remover]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd2] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[schema_evol_par_vec_table_dictionary_encoding]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[set_variable_sub] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoin_mapjoin2] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoinopt8] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_truncate] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_bitwise_not] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_bucket] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit]
 (batchId=171)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15063/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15063/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15063/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949541 - PreCommit-HIVE-Build

> TestJdbcWithDBTokenStore[*] should both use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
> ---
>
> Key: HIVE-20971
> URL: https://issues.apache.org/jira/browse/HIVE-20971
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20971.patch
>
>
> The original intent was to use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases




[jira] [Updated] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-11-26 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20775:
---
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master, thanks [~ashutoshc]!

> Factor cost of each SJ reduction when costing a follow-up reduction
> ---
>
> Key: HIVE-20775
> URL: https://issues.apache.org/jira/browse/HIVE-20775
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, 
> HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, 
> HIVE-20775.06.patch, HIVE-20775.patch
>
>
> Currently, while costing the SJ in a plan, the stats of a TS that is 
> reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
> tree. Ideally, we could adjust the stats to take into account decisions that 
> have already been made.




[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Jesus Camacho Rodriguez (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699711#comment-16699711
 ] 

Jesus Camacho Rodriguez commented on HIVE-20860:


+1

[~vihangk1], can we create the follow-up to enable it again in the future? 
Thanks

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  HIVE-20860.01.patch, hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. It looks like there is some case where 
> there is an additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699707#comment-16699707
 ] 

Vihang Karajgaonkar commented on HIVE-20860:


[~jcamachorodriguez] [~pvary] Can you please review?

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  HIVE-20860.01.patch, hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. It looks like there is some case where 
> there is an additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Updated] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-20860:
---
Status: Patch Available  (was: Open)

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  HIVE-20860.01.patch, hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. It looks like there is some case where 
> there is an additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Updated] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-20860:
---
Attachment: HIVE-20860.01.patch

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  HIVE-20860.01.patch, hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. It looks like there is some case where 
> there is an additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Updated] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-20740:
---
Attachment: HIVE-20740.14.patch

> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, 
> HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, 
> HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch, 
> HIVE-20740.14.patch
>
>
> The ObjectStore#setConf method has a global lock which can block other 
> clients in concurrent workloads.
> {code}
>   @Override
>   @SuppressWarnings("nls")
>   public void setConf(Configuration conf) {
>     // Although an instance of ObjectStore is accessed by one thread, there may
>     // be many threads with ObjectStore instances. So the static variables
>     // pmf and prop need to be protected with locks.
>     pmfPropLock.lock();
>     try {
>       isInitialized = false;
>       this.conf = conf;
>       this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf,
>           ConfVars.HIVE_TXN_STATS_ENABLED);
>       configureSSL(conf);
>       Properties propsFromConf = getDataSourceProps(conf);
>       boolean propsChanged = !propsFromConf.equals(prop);
>       if (propsChanged) {
>         if (pmf != null) {
>           clearOutPmfClassLoaderCache(pmf);
>           if (!forTwoMetastoreTesting) {
>             // close the underlying connection pool to avoid leaks
>             pmf.close();
>           }
>         }
>         pmf = null;
>         prop = null;
>       }
>       assert(!isActiveTransaction());
>       shutdown();
>       // Always want to re-create pm as we don't know if it were created by the
>       // most recent instance of the pmf
>       pm = null;
>       directSql = null;
>       expressionProxy = null;
>       openTrasactionCalls = 0;
>       currentTransaction = null;
>       transactionStatus = TXN_STATUS.NO_STATE;
>       initialize(propsFromConf);
>       String partitionValidationRegex =
>           MetastoreConf.getVar(this.conf, ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>       if (partitionValidationRegex != null && !partitionValidationRegex.isEmpty()) {
>         partitionValidationPattern = Pattern.compile(partitionValidationRegex);
>       } else {
>         partitionValidationPattern = null;
>       }
>       // Note, if metrics have not been initialized this will return null, which
>       // means we aren't using metrics.  Thus we should always check whether this
>       // is non-null before using.
>       MetricRegistry registry = Metrics.getRegistry();
>       if (registry != null) {
>         directSqlErrors = Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS);
>       }
>       this.batchSize = MetastoreConf.getIntVar(conf,
>           ConfVars.RAWSTORE_PARTITION_BATCH_SIZE);
>       if (!isInitialized) {
>         throw new RuntimeException(
>             "Unable to create persistence manager. Check dss.log for details");
>       } else {
>         LOG.debug("Initialized ObjectStore");
>       }
>     } finally {
>       pmfPropLock.unlock();
>     }
>   }
> {code}
> The {{pmfPropLock}} is a static object and it disallows any other new 
> connection to HMS which is trying to instantiate ObjectStore. We should 
> either remove the lock or reduce the scope of the lock so that it is held for 
> a very small amount of time.
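
One possible reading of "reduce the scope of the lock" is sketched below. This is a simplified illustration under stated assumptions, not the HIVE-20740 patch: NarrowLockStore, sharedFactory, sharedProps and factoryBuilds are hypothetical stand-ins for ObjectStore, pmf, prop and a build counter. Per-instance fields are set outside the lock, and the lock is held only while the shared static state is compared and replaced.

```java
import java.util.Properties;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for ObjectStore; only the locking shape is illustrated.
class NarrowLockStore {
  // Shared static state, analogous to pmf and prop in the description.
  private static final Lock SHARED_LOCK = new ReentrantLock();
  private static Properties sharedProps;
  private static Object sharedFactory;
  private static int factoryBuilds = 0;

  // Per-instance state; an instance is used by one thread, so no lock is needed.
  private Properties conf;
  private Object factory;
  private boolean initialized;

  void setConf(Properties newConf) {
    // Per-instance work happens outside the lock.
    initialized = false;
    this.conf = newConf;
    Properties wanted = new Properties();
    wanted.putAll(newConf);

    // Only the check-and-replace of the shared state is guarded.
    SHARED_LOCK.lock();
    try {
      if (!wanted.equals(sharedProps)) {
        sharedProps = wanted;
        sharedFactory = new Object(); // stands in for creating a new pmf
        factoryBuilds++;
      }
      factory = sharedFactory;
    } finally {
      SHARED_LOCK.unlock();
    }

    // Remaining per-instance initialization, again outside the lock.
    initialized = (factory != null);
  }

  boolean isInitialized() {
    return initialized;
  }

  Object factory() {
    return factory;
  }

  // Reads the shared counter under the same lock that guards its updates.
  static int factoryBuilds() {
    SHARED_LOCK.lock();
    try {
      return factoryBuilds;
    } finally {
      SHARED_LOCK.unlock();
    }
  }
}
```

With this shape, two instances configured with equal properties share one factory and build it once, while threads creating new instances contend only for the short critical section. Whether the real setConf can be split this way depends on which of its steps actually touch the static fields.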




[jira] [Assigned] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-20860:
--

Assignee: Vihang Karajgaonkar

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. It looks like there is some case where 
> there is an additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Updated] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it

2018-11-26 Thread Jaume M (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20936:
---
Status: Open  (was: Patch Available)

> Allow the Worker thread in the metastore to run outside of it
> -
>
> Key: HIVE-20936
> URL: https://issues.apache.org/jira/browse/HIVE-20936
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20936.1.patch, HIVE-20936.2.patch, 
> HIVE-20936.3.patch, HIVE-20936.4.patch, HIVE-20936.5.patch, HIVE-20936.6.patch
>
>
> Currently the Worker thread in the metastore is bound to the metastore, 
> mainly because of the TxnHandler that it has. This thread runs some map 
> reduce jobs, which may not be an option wherever the metastore is 
> running. A solution for this can be to run this thread in HS2 depending on a 
> flag.




[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699685#comment-16699685
 ] 

Vihang Karajgaonkar commented on HIVE-20860:


If you look at the history of this test 
https://builds.apache.org/job/PreCommit-HIVE-Build/15061/testReport/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_cbo_limit_/history/
 you can see that it fails every few builds.

I am not really sure what causes the issue. I tried to reproduce it in at least 
two different environments but it passed both times. I also tried to run 
the test on the exact node of the precommit servers to see if there is 
something specific to that environment, but it passed there too.

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Priority: Minor
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. Looks like there is some case where 
> there is additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Updated] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it

2018-11-26 Thread Jaume M (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20936:
---
Attachment: HIVE-20936.6.patch
Status: Patch Available  (was: Open)

> Allow the Worker thread in the metastore to run outside of it
> -
>
> Key: HIVE-20936
> URL: https://issues.apache.org/jira/browse/HIVE-20936
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
> Attachments: HIVE-20936.1.patch, HIVE-20936.2.patch, 
> HIVE-20936.3.patch, HIVE-20936.4.patch, HIVE-20936.5.patch, HIVE-20936.6.patch
>
>
> Currently the Worker thread in the metastore is bound to the metastore, 
> mainly because of the TxnHandler that it has. This thread runs some map 
> reduce jobs, which may not be an option wherever the metastore is 
> running. A solution for this can be to run this thread in HS2, depending on a 
> flag.




[jira] [Commented] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699677#comment-16699677
 ] 

Hive QA commented on HIVE-20971:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 28s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15063/dev-support/hive-personality.sh |
| git revision | master / 0fee288 |
| Default Java | 1.8.0_111 |
| modules | C: itests/hive-minikdc U: itests/hive-minikdc |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15063/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> TestJdbcWithDBTokenStore[*] should both use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
> ---
>
> Key: HIVE-20971
> URL: https://issues.apache.org/jira/browse/HIVE-20971
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20971.patch
>
>
> The original intent was to use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases




[jira] [Commented] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699662#comment-16699662
 ] 

Hive QA commented on HIVE-20775:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949535/HIVE-20775.06.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15539 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15062/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15062/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15062/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949535 - PreCommit-HIVE-Build

> Factor cost of each SJ reduction when costing a follow-up reduction
> ---
>
> Key: HIVE-20775
> URL: https://issues.apache.org/jira/browse/HIVE-20775
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, 
> HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, 
> HIVE-20775.06.patch, HIVE-20775.patch
>
>
> Currently, while costing the SJ in a plan, the stats of a TS that is 
> reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
> tree. Ideally, we could adjust the stats to take into account decisions that 
> have already been made.




[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699624#comment-16699624
 ] 

Vihang Karajgaonkar commented on HIVE-20740:


The cbo_limit.q failure is unrelated to this patch. The failure message is odd; 
there could be an extra space somewhere in the q.out.

[INFO] Running org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
[ERROR] Tests run: 30, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 405.018 s <<< FAILURE! - in org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
[ERROR] testCliDriver[cbo_limit](org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver)  Time elapsed: 9.876 s  <<< FAILURE!
java.lang.AssertionError: 
Client Execution succeeded but contained differences (error code = 1) after executing cbo_limit.q 
11c11
<  14   2
---
>  14   2
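
One way to confirm that a diff like the 11c11 entry above is whitespace-only is to compare the two lines after collapsing runs of whitespace. The following is only a sketch: the class and method names are made up for illustration and are not part of the Hive test framework.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive's QTest utilities.
final class WhitespaceDiffCheck {
  private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

  /** Collapses every run of whitespace to a single space and trims both ends. */
  public static String normalize(String line) {
    return WHITESPACE_RUN.matcher(line).replaceAll(" ").trim();
  }

  /** True when the lines differ as written but are equal once whitespace is normalized. */
  public static boolean differsOnlyInWhitespace(String expected, String actual) {
    return !expected.equals(actual) && normalize(expected).equals(normalize(actual));
  }

  public static void main(String[] args) {
    // The two sides of the diff quoted in HIVE-20860.
    System.out.println(differsOnlyInWhitespace(" 1  4 2", " 1 4 2"));
    // A real content difference for comparison.
    System.out.println(differsOnlyInWhitespace(" 1 4 2", " 1 5 2"));
  }
}
```

If the check reports a whitespace-only difference, the q.out file or the output formatting is the more likely culprit than the query result itself.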

> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, 
> HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, 
> HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch
>
>
> The ObjectStore#setConf method has a global lock which can block other 
> clients in concurrent workloads.
> {code}
> @Override
> @SuppressWarnings("nls")
> public void setConf(Configuration conf) {
>   // Although an instance of ObjectStore is accessed by one thread, there may
>   // be many threads with ObjectStore instances. So the static variables
>   // pmf and prop need to be protected with locks.
>   pmfPropLock.lock();
>   try {
>     isInitialized = false;
>     this.conf = conf;
>     this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf,
>         ConfVars.HIVE_TXN_STATS_ENABLED);
>     configureSSL(conf);
>     Properties propsFromConf = getDataSourceProps(conf);
>     boolean propsChanged = !propsFromConf.equals(prop);
>     if (propsChanged) {
>       if (pmf != null) {
>         clearOutPmfClassLoaderCache(pmf);
>         if (!forTwoMetastoreTesting) {
>           // close the underlying connection pool to avoid leaks
>           pmf.close();
>         }
>       }
>       pmf = null;
>       prop = null;
>     }
>     assert(!isActiveTransaction());
>     shutdown();
>     // Always want to re-create pm as we don't know if it were created by the
>     // most recent instance of the pmf
>     pm = null;
>     directSql = null;
>     expressionProxy = null;
>     openTrasactionCalls = 0;
>     currentTransaction = null;
>     transactionStatus = TXN_STATUS.NO_STATE;
>     initialize(propsFromConf);
>     String partitionValidationRegex =
>         MetastoreConf.getVar(this.conf, ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>     if (partitionValidationRegex != null && !partitionValidationRegex.isEmpty()) {
>       partitionValidationPattern = Pattern.compile(partitionValidationRegex);
>     } else {
>       partitionValidationPattern = null;
>     }
>     // Note, if metrics have not been initialized this will return null, which means we aren't
>     // using metrics.  Thus we should always check whether this is non-null before using.
>     MetricRegistry registry = Metrics.getRegistry();
>     if (registry != null) {
>       directSqlErrors = Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS);
>     }
>     this.batchSize = MetastoreConf.getIntVar(conf, ConfVars.RAWSTORE_PARTITION_BATCH_SIZE);
>     if (!isInitialized) {
>       throw new RuntimeException(
>           "Unable to create persistence manager. Check dss.log for details");
>     } else {
>       LOG.debug("Initialized ObjectStore");
>     }
>   } finally {
>     pmfPropLock.unlock();
>   }
> }
> {code}
> The {{pmfPropLock}} is a static object and it disallows any other new 
> connection to HMS which is trying to instantiate ObjectStore. We should 
> either remove the lock or reduce the scope of the lock so that it is held for 
> a very small amount of time.
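
To illustrate what "reduce the scope of the lock" could look like, here is a minimal sketch. It is not the attached patch and does not use Hive classes: NarrowLockStore, its fields and its methods are hypothetical stand-ins, with java.util.Properties and a plain Object standing in for the static prop and pmf. The idea is that per-instance state is set without the global lock, and the lock is held only while the shared static state is compared and swapped.

```java
import java.util.Properties;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for illustration only; this is not Hive's ObjectStore.
final class NarrowLockStore {
  private static final ReentrantLock SHARED_LOCK = new ReentrantLock();
  private static Properties sharedProps;   // stand-in for the static "prop"
  private static Object sharedFactory;     // stand-in for the static "pmf"

  private Properties conf;
  private boolean initialized;

  public void setConf(Properties newConf) {
    // Per-instance fields are touched by one thread only, so no global lock is needed here.
    this.initialized = false;
    this.conf = newConf;

    // Copy the properties before taking the lock to keep the critical section short.
    Properties snapshot = new Properties();
    snapshot.putAll(newConf);

    // Only the shared static state is guarded, and always released in finally.
    SHARED_LOCK.lock();
    try {
      if (!snapshot.equals(sharedProps)) {
        sharedFactory = new Object();      // stand-in for re-creating the factory
        sharedProps = snapshot;
      }
    } finally {
      SHARED_LOCK.unlock();
    }

    // Remaining per-instance initialization also runs outside the lock.
    this.initialized = true;
  }

  public boolean isInitialized() {
    return initialized;
  }

  /** Reads the shared factory under the same lock that guards its updates. */
  public static Object currentFactory() {
    SHARED_LOCK.lock();
    try {
      return sharedFactory;
    } finally {
      SHARED_LOCK.unlock();
    }
  }
}
```

In this sketch the factory is still created while the lock is held; if creating the real factory is expensive, that step would need further thought.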




[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699629#comment-16699629
 ] 

Vihang Karajgaonkar commented on HIVE-20860:


Failed again here
https://builds.apache.org/job/PreCommit-HIVE-Build/15061/testReport/junit/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_cbo_limit_/

> Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
> --
>
> Key: HIVE-20860
> URL: https://issues.apache.org/jira/browse/HIVE-20860
> Project: Hive
>  Issue Type: Test
>Reporter: Vihang Karajgaonkar
>Priority: Minor
> Attachments: 
> 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt,
>  hive.log.gz, maven-test.txt
>
>
> Test failed in one of the precommit jobs. Looks like there is some case where 
> there is additional space in the diff
> {noformat}
> Error Message
> Client Execution succeeded but contained differences (error code = 1) after 
> executing cbo_limit.q 
> 11c11
> <  1  4 2
> ---
> >  1 4 2
> {noformat}




[jira] [Commented] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699620#comment-16699620
 ] 

Hive QA commented on HIVE-20775:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 38s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 45s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} ql: The patch generated 3 new + 123 unchanged - 2 fixed = 126 total (was 125) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m  1s{color} | {color:red} ql generated 2 new + 2310 unchanged - 2 fixed = 2312 total (was 2312) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 31s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to tsRowSize in org.apache.hadoop.hive.ql.parse.TezCompiler.getBloomFilterBenefit(SelectOperator, ExprNodeDesc, Statistics, ExprNodeDesc)  At TezCompiler.java:org.apache.hadoop.hive.ql.parse.TezCompiler.getBloomFilterBenefit(SelectOperator, ExprNodeDesc, Statistics, ExprNodeDesc)  At TezCompiler.java:[line 1456] |
|  |  Should org.apache.hadoop.hive.ql.parse.TezCompiler$SemijoinOperatorInfo be a _static_ inner class?  At TezCompiler.java:inner class?  At TezCompiler.java:[lines 1661-1675] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15062/dev-support/hive-personality.sh |
| git revision | master / 0fee288 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15062/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-15062/yetus/new-findbugs-ql.html |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15062/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Factor cost of each SJ reduction when costing a follow-up reduction
> ---
>
> Key: HIVE-20775
> URL: https://issues.apache.org/jira/browse/HIVE-20775
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, 
> HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, 
> HIVE-20775.06.patch, HIVE-20775.patch
>
>
> Currently, while costing the SJ in a plan, the stats of a TS that is 
> reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
> tree. Ideally, we could adjust the stats to take

[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699601#comment-16699601
 ] 

Hive QA commented on HIVE-20740:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949531/HIVE-20740.13.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15540 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] 
(batchId=182)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15061/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15061/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15061/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949531 - PreCommit-HIVE-Build

> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, 
> HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, 
> HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch
>
>




[jira] [Updated] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set

2018-11-26 Thread Roohi Syeda (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roohi Syeda updated HIVE-20819:
---
Description: 
Metastore connections leak when the HADOOP_USER_NAME environment variable is 
set.

The connections created stay in the ESTABLISHED state and are never closed.

 

*More Details :* 

When a new query is executed for a new session

 

The handler thread calls line 66 of HiveSessionImplwithUGI:

UserGroupInformation.createProxyUser(owner, UserGroupInformation.getLoginUser());

 

At *query compile time*, this sessionUgi is used by the *handler* thread to 
open the MS connection.

Later, at *query run time*, line 277 of SQLOperation runs:

Runnable work = new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), SessionState.get(), asyncPrepare);

getCurrentUGI() creates a new proxy user by calling Utils.getUGI (see below), 
and that UGI is passed to the *Background* thread.

 
{code:java}
public static UserGroupInformation getUGI() throws LoginException, IOException {
  String doAs = System.getenv("HADOOP_USER_NAME");
  if (doAs != null && doAs.length() > 0) {
    /*
     * this allows doAs (proxy user) to be passed along across process boundary where
     * delegation tokens are not supported.  For example, a DDL stmt via WebHCat with
     * a doAs parameter, forks to 'hcat' which needs to start a Session that
     * proxies the end user
     */
    return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser());
  }
  return UserGroupInformation.getCurrentUser();
}
{code}
 

currentUGI is therefore a *new* proxy user instance, and this UGI is set on the 
background thread.

When the background thread later tries to get the Hive db, the UGIs are not 
equal (see the equals code below), so it opens a new connection, which is 
never closed.

Line 318 in Hive.java:

 
{code:java}
private static Hive getInternal(HiveConf c, boolean needsRefresh, boolean isFastCheck,
    boolean doRegisterAllFns) throws HiveException {
  Hive db = hiveDB.get();
  if (db == null || !db.isCurrentUserOwner() || needsRefresh
      || (c != null && !isCompatible(db, c, isFastCheck))) {
    db = create(c, false, db, doRegisterAllFns);
  }
  if (c != null) {
    db.conf = c;
  }
  return db;
}

private boolean isCurrentUserOwner() throws HiveException {
  try {
    return owner == null || owner.equals(UserGroupInformation.getCurrentUser());
  } catch (IOException e) {
    throw new HiveException("Error getting current user: " + e.getMessage(), e);
  }
}

/**
 * Compare the subjects to see if they are equal to each other.
 */
@Override
public boolean equals(Object o) {
  if (o == this) {
    return true;
  } else if (o == null || getClass() != o.getClass()) {
    return false;
  } else {
    return subject == ((UserGroupInformation) o).subject;
  }
}

 
{code}
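
The effect of the identity comparison in equals can be shown in isolation. The sketch below deliberately avoids the Hadoop classes: FakeSubject and FakeUgi are hypothetical stand-ins that only copy the equals logic quoted above, so it illustrates the mechanism under that assumption rather than reproducing the bug.

```java
// Hypothetical stand-ins for illustration; these are not Hadoop's classes.
final class SubjectIdentityDemo {

  /** Stand-in for a security Subject; only its object identity matters here. */
  static final class FakeSubject {
    final String user;

    FakeSubject(String user) {
      this.user = user;
    }
  }

  /** Stand-in for a UGI whose equals() compares subjects by identity, as in the code quoted above. */
  static final class FakeUgi {
    final FakeSubject subject;

    FakeUgi(FakeSubject subject) {
      this.subject = subject;
    }

    /** Mimics creating a proxy user: every call builds a fresh subject. */
    static FakeUgi createProxyUser(String user) {
      return new FakeUgi(new FakeSubject(user));
    }

    @Override
    public boolean equals(Object o) {
      if (o == this) {
        return true;
      } else if (o == null || getClass() != o.getClass()) {
        return false;
      } else {
        return subject == ((FakeUgi) o).subject;
      }
    }

    @Override
    public int hashCode() {
      return System.identityHashCode(subject);
    }
  }

  public static void main(String[] args) {
    FakeUgi sessionUgi = FakeUgi.createProxyUser("alice");
    FakeUgi backgroundUgi = FakeUgi.createProxyUser("alice"); // same user name, new subject
    FakeUgi wrapsSameSubject = new FakeUgi(sessionUgi.subject);

    System.out.println(sessionUgi.equals(backgroundUgi));     // false: different subject instances
    System.out.println(sessionUgi.equals(wrapsSameSubject));  // true: same subject instance
  }
}
```

Two proxy users created separately for the same user name never compare equal, while a new wrapper around the same subject does, which is why the owner check fails for the background thread.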
 

Solution:

When we assign *currentUGI* to the background thread, we should call 
UserGroupInformation.getCurrentUser() (see below) instead of the *getUGI* 
method listed above. getUGI creates a new proxy user and subject instance, so 
isCurrentUserOwner always returns false (both the subject and the ugi 
instances differ) and a new MS connection is created.

 
{code:java}
/**
 * Return the current user, including any doAs in the current stack.
 */
public synchronized static UserGroupInformation getCurrentUser() throws IOException {
  AccessControlContext context = AccessController.getContext();
  Subject subject = Subject.getSubject(context);
  if (subject == null || subject.getPrincipals(User.class).isEmpty()) {
    return getLoginUser();
  } else {
    return new UserGroupInformation(subject);
  }
}
{code}
 

 

 

  was:
Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
set.

The connections created are in ESTABLISHED state and never closed

 

*More Details :* 

When a new query is executed for a new session

 

The handler thread, calls line 66 HiveSessionImplwithUGI

(UserGroupInformation.createProxyUser(owner, 
UserGroupInformation.getLoginUser());

 

At *query compile time*, this sessionUgi is used to open MS connection by 
*handler* thread

Later at *query run time*, line 277 of SQLOperation

Runnable work = new BackgroundWork(getCurrentUGI(), 
parentSession.getSessionHive(), SessionState.get(),asyncPrepare);

 getCurrentUGI(); is used to create a new proxy user, which in turn calls 
Utils.getUGI (see below) and passed to the *Background* thread

 public static UserGroupInformation *getUGI*() throws LoginException, 
IOException {

   String doAs = System.getenv("HADOOP_USER_NAME");

   if(doAs != null && doAs.length() > 0)

{  

   /*  * this allows doAs (proxy user) to be passed along across process 
boundary where  * delegation tokens are not

[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699576#comment-16699576
 ] 

Hive QA commented on HIVE-20740:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
6s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
42s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
21s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 608 
unchanged - 0 fixed = 609 total (was 608) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
17s{color} | {color:red} standalone-metastore/metastore-server generated 1 new 
+ 183 unchanged - 2 fixed = 184 total (was 185) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:standalone-metastore/metastore-server |
|  |  
org.apache.hadoop.hive.metastore.PersistenceManagerProvider.updatePmfProperties(Configuration)
 does not release lock on all paths  At PersistenceManagerProvider.java:on all 
paths  At PersistenceManagerProvider.java:[line 152] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15061/dev-support/hive-personality.sh
 |
| git revision | master / 0fee288 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15061/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15061/yetus/new-findbugs-standalone-metastore_metastore-server.html
 |
| modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15061/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>

[jira] [Updated] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set

2018-11-26 Thread Roohi Syeda (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roohi Syeda updated HIVE-20819:
---
Description: 
Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
set.

The connections created are in ESTABLISHED state and never closed

 

*More Details :* 

When a new query is executed for a new session, the handler thread calls line 
66 of HiveSessionImplwithUGI:

UserGroupInformation.createProxyUser(owner, UserGroupInformation.getLoginUser());

 

At *query compile time*, the *handler* thread uses this sessionUgi to open the 
MS connection.

Later, at *query run time*, line 277 of SQLOperation runs:

Runnable work =
  new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), 
SessionState.get(), asyncPrepare);

getCurrentUGI() calls Utils.getUGI (see below), which creates a new proxy 
user; that UGI is passed to the *Background* thread.

  public static UserGroupInformation *getUGI*() throws LoginException, IOException {
    String doAs = System.getenv("HADOOP_USER_NAME");
    if(doAs != null && doAs.length() > 0) {
      /*
       * this allows doAs (proxy user) to be passed along across process boundary where
       * delegation tokens are not supported.  For example, a DDL stmt via WebHCat with
       * a doAs parameter, forks to 'hcat' which needs to start a Session that
       * proxies the end user
       */
      return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser());
    }
    return UserGroupInformation.getCurrentUser();
  }

 

 

currentUGI is a *new* proxy user instance, and this UGI is set on the 
background thread.

When the background thread later tries to get the Hive db, the UGIs are not 
equal (see the equals code below), so it opens a new connection, which is 
never closed.

Line 318 in Hive.java:

  private static Hive getInternal(HiveConf c, boolean needsRefresh, boolean isFastCheck,
      boolean doRegisterAllFns) throws HiveException {
    Hive db = hiveDB.get();
    if (db == null || !db.*isCurrentUserOwner*() || needsRefresh
        || (c != null && !isCompatible(db, c, isFastCheck))) {
      db = create(c, false, db, doRegisterAllFns);
    }
    if (c != null) {
      db.conf = c;
    }
    return db;
  }

  private boolean isCurrentUserOwner() throws HiveException {
    try {
      return owner == null || owner.equals(UserGroupInformation.getCurrentUser());
    } catch(IOException e) {
      throw new HiveException("Error getting current user: " + e.getMessage(), e);
    }
  }

  /**
   * Compare the subjects to see if they are equal to each other.
   */
  @Override
  public boolean *equals*(Object o) {
    if (o == this) {
      return true;
    } else if (o == null || getClass() != o.getClass()) {
      return false;
    } else {
      return subject == ((UserGroupInformation) o).subject;
    }
  }

 

Solution:

When we assign *currentUGI* to the bg thread, we should call 
UserGroupInformation.getCurrentUser() (see below) instead of the *getUGI* 
method listed above. getUGI creates a new proxy user instance with a new 
subject, so isCurrentUserOwner always returns false (both the subject and the 
ugi instances are different) and a new MS connection is created.

 

  /**
   * Return the current user, including any doAs in the current stack.
   */
  public synchronized
  static UserGroupInformation getCurrentUser() throws IOException {
    AccessControlContext context = AccessController.getContext();
    Subject subject = Subject.getSubject(context);
    if (subject == null || subject.getPrincipals(User.class).isEmpty()) {
      return getLoginUser();
    } else {
      return new UserGroupInformation(subject);
    }
  }
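
The identity comparison in the equals method quoted above can be reproduced without Hadoop on the classpath. The following is only a minimal sketch: UgiLike, newProxyUser and wrap are hypothetical stand-ins for UserGroupInformation, createProxyUser and getCurrentUser, and only the equals logic is copied from the description. It assumes, as the description states, that creating a proxy user builds a new subject each time.

```java
import javax.security.auth.Subject;

// Hypothetical stand-in for UserGroupInformation; only the equals() logic
// quoted in the description is reproduced here.
final class UgiLike {
    private final Subject subject;

    UgiLike(Subject subject) {
        this.subject = subject;
    }

    // Mimics createProxyUser: every call builds a brand-new Subject.
    static UgiLike newProxyUser() {
        return new UgiLike(new Subject());
    }

    // Mimics getCurrentUser: wraps a Subject that already exists.
    static UgiLike wrap(Subject existing) {
        return new UgiLike(existing);
    }

    Subject getSubject() {
        return subject;
    }

    @Override
    public boolean equals(Object o) {
        if (o == this) {
            return true;
        } else if (o == null || getClass() != o.getClass()) {
            return false;
        } else {
            // Reference comparison: two subjects with identical contents still differ.
            return subject == ((UgiLike) o).subject;
        }
    }

    @Override
    public int hashCode() {
        return System.identityHashCode(subject);
    }
}

final class UgiEqualityDemo {
    public static void main(String[] args) {
        UgiLike sessionUgi = UgiLike.newProxyUser();
        UgiLike secondProxy = UgiLike.newProxyUser();
        UgiLike sameSubject = UgiLike.wrap(sessionUgi.getSubject());

        System.out.println(sessionUgi.equals(secondProxy)); // false: different Subject instances
        System.out.println(sessionUgi.equals(sameSubject)); // true: same Subject instance
    }
}
```

In this sketch a second proxy user never equals the first one, while a wrapper around the existing subject does, which matches the reasoning behind the proposed fix.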

 

 

  was:
Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
set.

The connections created are in ESTABLISHED state and never closed


> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set
> -
>
> Key: HIVE-20819
> URL: https://issues.apache.org/jira/browse/HIVE-20819
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Roohi Syeda
>Assignee: Roohi Syeda
>Priority: Minor
> Attachments: HIVE-20819.1.patch
>
>
> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set.
> The connections created are in ESTABLISHED state and never closed
>  
> *More Details :* 
> When a new query is executed for a new session
>  
> The handler thread, calls line 66 HiveSessionImplwithUGI
> (UserGroupInformation.createProxyUser(
>   owner,

[jira] [Commented] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699555#comment-16699555
 ] 

Vihang Karajgaonkar commented on HIVE-20971:


LGTM +1 pending tests

> TestJdbcWithDBTokenStore[*] should both use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
> ---
>
> Key: HIVE-20971
> URL: https://issues.apache.org/jira/browse/HIVE-20971
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20971.patch
>
>
> The original intent was to use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set

2018-11-26 Thread Roohi Syeda (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roohi Syeda updated HIVE-20819:
---
Description: 
Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
set.

The connections created are in ESTABLISHED state and never closed

 

*More Details :* 

When a new query is executed for a new session, the handler thread calls line 
66 of HiveSessionImplwithUGI:

UserGroupInformation.createProxyUser(owner, UserGroupInformation.getLoginUser());

 

At *query compile time*, the *handler* thread uses this sessionUgi to open the 
MS connection.

Later, at *query run time*, line 277 of SQLOperation runs:

Runnable work = new BackgroundWork(getCurrentUGI(), 
parentSession.getSessionHive(), SessionState.get(), asyncPrepare);

getCurrentUGI() calls Utils.getUGI (see below), which creates a new proxy 
user; that UGI is passed to the *Background* thread.

  public static UserGroupInformation *getUGI*() throws LoginException, IOException {
    String doAs = System.getenv("HADOOP_USER_NAME");
    if(doAs != null && doAs.length() > 0) {
      /*
       * this allows doAs (proxy user) to be passed along across process boundary where
       * delegation tokens are not supported.  For example, a DDL stmt via WebHCat with
       * a doAs parameter, forks to 'hcat' which needs to start a Session that
       * proxies the end user
       */
      return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser());
    }
    return UserGroupInformation.getCurrentUser();
  }

 

currentUGI is a *new* proxy user instance, and this UGI is set on the 
background thread.

When the background thread later tries to get the Hive db, the UGIs are not 
equal (see the equals code below), so it opens a new connection, which is 
never closed.

Line 318 in Hive.java:

  private static Hive getInternal(HiveConf c, boolean needsRefresh, boolean isFastCheck,
      boolean doRegisterAllFns) throws HiveException {
    Hive db = hiveDB.get();
    if (db == null || !db.*isCurrentUserOwner*() || needsRefresh
        || (c != null && !isCompatible(db, c, isFastCheck))) {
      db = create(c, false, db, doRegisterAllFns);
    }
    if (c != null) {
      db.conf = c;
    }
    return db;
  }

  private boolean isCurrentUserOwner() throws HiveException {
    try {
      return owner == null || owner.equals(UserGroupInformation.getCurrentUser());
    } catch(IOException e) {
      throw new HiveException("Error getting current user: " + e.getMessage(), e);
    }
  }

  /**
   * Compare the subjects to see if they are equal to each other.
   */
  @Override
  public boolean *equals*(Object o) {
    if (o == this) {
      return true;
    } else if (o == null || getClass() != o.getClass()) {
      return false;
    } else {
      return subject == ((UserGroupInformation) o).subject;
    }
  }

 

Solution:

When we assign *currentUGI* to the bg thread, we should call 
UserGroupInformation.getCurrentUser() (see below) instead of the *getUGI* 
method listed above. getUGI creates a new proxy user instance with a new 
subject, so isCurrentUserOwner always returns false (both the subject and the 
ugi instances are different) and a new MS connection is created.

 

  /**
   * Return the current user, including any doAs in the current stack.
   */
  public synchronized
  static UserGroupInformation getCurrentUser() throws IOException {
    AccessControlContext context = AccessController.getContext();
    Subject subject = Subject.getSubject(context);
    if (subject == null || subject.getPrincipals(User.class).isEmpty()) {
      return getLoginUser();
    } else {
      return new UserGroupInformation(subject);
    }
  }

 

 

  was:
Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
set.

The connections created are in ESTABLISHED state and never closed

 

*More Details :* 

When a new query is executed for a new session

 

The handler thread, calls line 66 HiveSessionImplwithUGI

(UserGroupInformation.createProxyUser(

  owner, UserGroupInformation.getLoginUser());

 

At *query compile time*, this sessionUgi is used to open MS connection by 
*handler* thread

Later at *query run time*, line 277 of SQLOperation

Runnable work =

  new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), 
SessionState.get(),

  asyncPrepare);

 

getCurrentUGI(); is used to create a new proxy user, which in turn calls 
Utils.getUGI (see below) and passed to the *Background* thread

 

 public static UserGroupInformation *getUGI*() throws LoginException, 
IOException {

    String doAs = System.getenv("HADOOP_USER_NAME");

    if(doAs != null && doAs.length() > 0) {

 /*

  * this allows doAs (proxy user) to be passed along across process 
boundary where

  * delegation tokens are not

[jira] [Updated] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Antal Sinkovits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-20440:
---
Attachment: HIVE-20440.14.patch.txt

> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, 
> HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, 
> HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, 
> HIVE-20440.12.patch, HIVE-20440.13.patch, HIVE-20440.14.patch.txt
>
>
> Enhance the SmallTableCache, to use guava cache with soft references, so that 
> we evict when there is memory pressure.
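
For readers unfamiliar with soft-valued caches, the eviction idea in the description can be sketched with JDK classes only. This is not the SmallTableCache patch and it does not use Guava; SoftValueCache and its get method are hypothetical names, and java.lang.ref.SoftReference stands in for what a Guava cache built with soft values would provide. Entries stay reachable until the JVM needs the memory, at which point the collector may clear them and the next lookup reloads the value.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// JDK-only stand-in for a soft-valued cache; names are illustrative only.
final class SoftValueCache<K, V> {
    private final ConcurrentMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    // Returns the cached value, reloading it if it was never cached
    // or if the garbage collector has cleared the soft reference.
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }

    // Number of keys currently tracked, including entries whose value was cleared.
    int size() {
        return map.size();
    }
}

final class SoftValueCacheDemo {
    public static void main(String[] args) {
        SoftValueCache<String, int[]> cache = new SoftValueCache<>();
        int[] first = cache.get("small_table", k -> new int[] {1, 2, 3});
        int[] second = cache.get("small_table", k -> new int[] {4, 5, 6});
        // While "first" is strongly referenced, the cached array cannot be cleared.
        System.out.println(first == second);
    }
}
```

The sketch is deliberately simple: two threads asking for the same missing key may both run the loader, and keys whose values were cleared are not removed from the map. A production cache would need to handle both.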



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set

2018-11-26 Thread Roohi Syeda (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699562#comment-16699562
 ] 

Roohi Syeda commented on HIVE-20819:


Logs

 

2018-10-23 10:52:58,516 DEBUG hive.ql.parse.ParseDriver: 
[HiveServer2-Handler-Pool: Thread-54]: Parsing command: drop table empdata

2018-10-23 10:52:58,516 DEBUG hive.ql.parse.ParseDriver: 
[HiveServer2-Handler-Pool: Thread-54]: Parse Completed

2018-10-23 10:52:58,566 INFO  hive.metastore: [HiveServer2-Handler-Pool: 
Thread-54]: Trying to connect to metastore with URI thrift://X:9083

*2018-10-23 10:52:58,567 INFO  hive.metastore: [**HiveServer2-Handler-Pool**: 
Thread-54]: Opened a connection to metastore, current connections: 4*

2018-10-23 10:52:58,569 INFO  hive.metastore: [HiveServer2-Handler-Pool: 
Thread-54]: Connected to metastore.

2018-10-23 10:52:58,698 INFO  org.apache.hadoop.hive.ql.Driver: 
[HiveServer2-Handler-Pool: Thread-54]: Semantic Analysis Completed

2018-10-23 10:52:58,699 INFO  hive.ql.metadata.Hive: [HiveServer2-Handler-Pool: 
Thread-54]: Dumping metastore api call timing information for : compilation 
phase

2018-10-23 10:52:58,699 DEBUG hive.ql.metadata.Hive: [HiveServer2-Handler-Pool: 
Thread-54]: Total time spent in each metastore function (ms): 
\{getTable_(String, String, )=129}

2018-10-23 10:52:58,699 INFO  org.apache.hadoop.hive.ql.Driver: 
[HiveServer2-Handler-Pool: Thread-54]: Completed compiling 
command(queryId=hive_20181023105252_d3247a1c-e343-470b-aa46-a692b5ade414); Time 
taken: 0.183 seconds

2018-10-23 10:52:58,699 DEBUG org.apache.hadoop.security.UserGroupInformation: 
*[HiveServer2-**Handler-Pool**: Thread-54]:* *PrivilegedAction as:hive 
(auth:SIMPLE)* 
*from:org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)*

2018-10-23 10:52:58,699 DEBUG org.apache.hive.service.cli.CLIService: 
[HiveServer2-Handler-Pool: Thread-54]: SessionHandle 
[44f74bd9-1a71-458e-9992-e9d8afc3a958]: executeStatementAsync()

2018-10-23 10:52:58,700 DEBUG org.apache.hadoop.security.UserGroupInformation: 
[HiveServer2-Background-Pool: Thread-56]: PrivilegedAction as:hive (auth:PROXY) 
via hive (auth:SIMPLE) 
from:org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)

2018-10-23 10:52:58,700 DEBUG org.apache.thrift.transport.TSaslTransport: 
[HiveServer2-Handler-Pool: Thread-54]: writing data length: 109

*2018-10-23 10:52:58,715 DEBUG hive.ql.metadata.Hive: 
[HiveServer2-Background-Pool: Thread-56]: Creating new db. 
db.isCurrentUserOwner = false*

*2018-10-23 10:52:58,715 DEBUG hive.ql.metadata.Hive: 
[HiveServer2-Background-Pool: Thread-56]: Closing current thread's connection 
to Hive Metastore.*

*2018-10-23 10:52:58,715 INFO  hive.metastore: [HiveServer2-Background-Pool: 
Thread-56]: Closed a connection to metastore, current connections: 3*

2018-10-23 10:52:58,716 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
[HiveServer2-Background-Pool: Thread-56]: 

2018-10-23 10:52:58,716 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: 
[HiveServer2-Background-Pool: Thread-56]: 

2018-10-23 10:52:58,716 INFO  org.apache.hadoop.hive.ql.Driver: 
[HiveServer2-Background-Pool: Thread-56]: Starting task [Stage-0:DDL] in serial 
mode

2018-10-23 10:52:58,717 INFO  hive.metastore: [HiveServer2-Background-Pool: 
Thread-56]: Trying to connect to metastore with URI thrift://:9083

*2018-10-23 10:52:58,717 INFO  hive.metastore: 
[HiveServer2-**Background-Pool**: Thread-56]: Opened a connection to metastore, 
current connections: 4*

2018-10-23 10:52:58,720 INFO  hive.metastore: [HiveServer2-Background-Pool: 
Thread-56]: Connected to metastore.

> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set
> -
>
> Key: HIVE-20819
> URL: https://issues.apache.org/jira/browse/HIVE-20819
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Roohi Syeda
>Assignee: Roohi Syeda
>Priority: Minor
> Attachments: HIVE-20819.1.patch
>
>
> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is 
> set.
> The connections created are in ESTABLISHED state and never closed
>  
> *More Details :* 
> When a new query is executed for a new session
>  
> The handler thread, calls line 66 HiveSessionImplwithUGI
> (UserGroupInformation.createProxyUser(
>   owner, UserGroupInformation.getLoginUser());
>  
> At *query compile time*, this sessionUgi is used to open MS connection by 
> *handler* thread
> Later at *query run time*, line 277 of SQLOperation
> Runnable work =
>   new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), 
> SessionState.get(),
>   asyncPrepare);
>  
> getCurrentUGI(); is used to create a new proxy user, which

[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699559#comment-16699559
 ] 

Vihang Karajgaonkar commented on HIVE-20740:


The test failures occur only on the precommit job. The logs do not have enough 
information to debug these failures. I will try to observe on the ptest server 
itself while the batch containing the {{TestObjectStore}} test is running.

Batch 230 has {{TestObjectStore}}
{noformat}
2018-11-26 20:25:19,796 DEBUG [TestExecutor] ExecutionPhase.execute:98 PBatch: 
UnitTestBatch 
[name=230_UTBatch_standalone-metastore__metastore-server_20_tests, id=230, 
moduleName=standalone-metastore/metastore-server, batchSize=20, 
isParallel=true, testList=[TestMetaStoreConnectionUrlHook, 
TestSchemaToolForMetastore, TestMetastoreSchemaTool, 
TestMetaStoreSchemaFactory, TestRetryingHMSHandler, TestAdminUser, 
TestJSONMessageDeserializer, TestCatalogNonDefaultSvr, 
TestObjectStoreInitRetry, TestHdfsUtils, TestMetaStoreServerUtils, 
TestHiveMetaStoreSchemaMethods, TestOldSchema, TestCachedStore, 
TestCatalogCaching, TestDeadline, TestMetaStoreListenersError, 
TestMetaStoreEventListenerOnlyOnCommit, TestMetaStoreSchemaInfo, 
TestMarkPartition]]
{noformat}

> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, 
> HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, 
> HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch
>
>
> The ObjectStore#setConf method has a global lock which can block other 
> clients in concurrent workloads.
> {code}
> @Override
>   @SuppressWarnings("nls")
>   public void setConf(Configuration conf) {
> // Although an instance of ObjectStore is accessed by one thread, there 
> may
> // be many threads with ObjectStore instances. So the static variables
> // pmf and prop need to be protected with locks.
> pmfPropLock.lock();
> try {
>   isInitialized = false;
>   this.conf = conf;
>   this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, 
> ConfVars.HIVE_TXN_STATS_ENABLED);
>   configureSSL(conf);
>   Properties propsFromConf = getDataSourceProps(conf);
>   boolean propsChanged = !propsFromConf.equals(prop);
>   if (propsChanged) {
> if (pmf != null){
>   clearOutPmfClassLoaderCache(pmf);
>   if (!forTwoMetastoreTesting) {
> // close the underlying connection pool to avoid leaks
> pmf.close();
>   }
> }
> pmf = null;
> prop = null;
>   }
>   assert(!isActiveTransaction());
>   shutdown();
>   // Always want to re-create pm as we don't know if it were created by 
> the
>   // most recent instance of the pmf
>   pm = null;
>   directSql = null;
>   expressionProxy = null;
>   openTrasactionCalls = 0;
>   currentTransaction = null;
>   transactionStatus = TXN_STATUS.NO_STATE;
>   initialize(propsFromConf);
>   String partitionValidationRegex =
>   MetastoreConf.getVar(this.conf, 
> ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>   if (partitionValidationRegex != null && 
> !partitionValidationRegex.isEmpty()) {
> partitionValidationPattern = 
> Pattern.compile(partitionValidationRegex);
>   } else {
> partitionValidationPattern = null;
>   }
>   // Note, if metrics have not been initialized this will return null, 
> which means we aren't
>   // using metrics.  Thus we should always check whether this is non-null 
> before using.
>   MetricRegistry registry = Metrics.getRegistry();
>   if (registry != null) {
> directSqlErrors = 
> Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS);
>   }
>   this.batchSize = MetastoreConf.getIntVar(conf, 
> ConfVars.RAWSTORE_PARTITION_BATCH_SIZE);
>   if (!isInitialized) {
> throw new RuntimeException(
> "Unable to create persistence manager. Check dss.log for details");
>   } else {
> LOG.debug("Initialized ObjectStore");
>   }
> } finally {
>   pmfPropLock.unlock();
> }
>   }
> {code}
> The {{pmfPropLock}} is a static object and it disallows any other new 
> connection to HMS which is trying to instantiate ObjectStore. We should 
> either remove the lock or reduce the scope of the lock so that it is held for 
> a very small amount of time.
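
One way to read "reduce the scope of the lock" is to keep per-instance work outside the static lock and hold the lock only while the shared state is compared and swapped. The following is a rough sketch under that assumption, not the actual patch: NarrowLockStore, SHARED_LOCK, sharedFactory and factoryBuilds are hypothetical names, and a plain Object stands in for the pmf.

```java
import java.util.Properties;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for ObjectStore; "sharedFactory" plays the role of pmf.
final class NarrowLockStore {
    private static final Lock SHARED_LOCK = new ReentrantLock();
    private static Properties sharedProps;   // guarded by SHARED_LOCK
    private static Object sharedFactory;     // guarded by SHARED_LOCK
    static int factoryBuilds = 0;            // guarded by SHARED_LOCK; demo counter only

    private Object factory;                  // per-instance state, no static lock needed
    private boolean initialized;

    void setConf(Properties conf) {
        // Per-instance reset and the defensive copy happen outside the static lock.
        initialized = false;
        Properties wanted = new Properties();
        wanted.putAll(conf);

        // Only the comparison and swap of the shared state is locked.
        SHARED_LOCK.lock();
        try {
            if (sharedFactory == null || !wanted.equals(sharedProps)) {
                sharedFactory = new Object();   // placeholder for building a real factory
                sharedProps = wanted;
                factoryBuilds++;
            }
            factory = sharedFactory;
        } finally {
            SHARED_LOCK.unlock();               // released on every path, including exceptions
        }

        initialized = true;
    }

    boolean isInitialized() {
        return initialized && factory != null;
    }
}

final class NarrowLockDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("url", "jdbc:example");
        NarrowLockStore first = new NarrowLockStore();
        NarrowLockStore second = new NarrowLockStore();
        first.setConf(props);
        second.setConf(props);
        // Identical properties: the shared factory is built once and reused.
        System.out.println(NarrowLockStore.factoryBuilds);
    }
}
```

The unlock call sits in a finally block so the lock is released on every path, which is also the condition the FindBugs report on this issue checks for.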



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699559#comment-16699559
 ] 

Vihang Karajgaonkar edited comment on HIVE-20740 at 11/26/18 8:41 PM:
--

The test failures occur only on the precommit job. The logs do not have enough 
information to debug these failures. I will try to observe on the ptest server 
itself while the batch containing the {{TestObjectStore}} test is running.

Batch 230 has {{TestObjectStore}}
2018-11-26 20:25:19,796 DEBUG [TestExecutor] ExecutionPhase.execute:98 PBatch: 
UnitTestBatch 
[name=230_UTBatch_standalone-metastore__metastore-server_20_tests, id=230, 
moduleName=standalone-metastore/metastore-server, batchSize=20, 
isParallel=true, testList=[TestMetaStoreConnectionUrlHook, 
TestSchemaToolForMetastore, TestMetastoreSchemaTool, 
TestMetaStoreSchemaFactory, TestRetryingHMSHandler, TestAdminUser, 
TestJSONMessageDeserializer, TestCatalogNonDefaultSvr, 
TestObjectStoreInitRetry, TestHdfsUtils, TestMetaStoreServerUtils, 
TestHiveMetaStoreSchemaMethods, TestOldSchema, TestCachedStore, 
TestCatalogCaching, TestDeadline, TestMetaStoreListenersError, 
TestMetaStoreEventListenerOnlyOnCommit, TestMetaStoreSchemaInfo, 
TestMarkPartition]]


was (Author: vihangk1):
The test failures occur only on the precommit job. The logs do not have enough 
information to debug these failures. I will try to observe on the ptest server 
itself while the batch containing the {{TestObjectStore}} test is running.

Batch 230 has {{TestObjectStore}}
{noformat}
2018-11-26 20:25:19,796 DEBUG [TestExecutor] ExecutionPhase.execute:98 PBatch: 
UnitTestBatch 
[name=230_UTBatch_standalone-metastore__metastore-server_20_tests, id=230, 
moduleName=standalone-metastore/metastore-server, batchSize=20, 
isParallel=true, testList=[TestMetaStoreConnectionUrlHook, 
TestSchemaToolForMetastore, TestMetastoreSchemaTool, 
TestMetaStoreSchemaFactory, TestRetryingHMSHandler, TestAdminUser, 
TestJSONMessageDeserializer, TestCatalogNonDefaultSvr, 
TestObjectStoreInitRetry, TestHdfsUtils, TestMetaStoreServerUtils, 
TestHiveMetaStoreSchemaMethods, TestOldSchema, TestCachedStore, 
TestCatalogCaching, TestDeadline, TestMetaStoreListenersError, 
TestMetaStoreEventListenerOnlyOnCommit, TestMetaStoreSchemaInfo, 
TestMarkPartition]]
{noformat}

> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, 
> HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, 
> HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch
>
>
> The ObjectStore#setConf method has a global lock which can block other 
> clients in concurrent workloads.
> {code}
> @Override
>   @SuppressWarnings("nls")
>   public void setConf(Configuration conf) {
> // Although an instance of ObjectStore is accessed by one thread, there 
> may
> // be many threads with ObjectStore instances. So the static variables
> // pmf and prop need to be protected with locks.
> pmfPropLock.lock();
> try {
>   isInitialized = false;
>   this.conf = conf;
>   this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, 
> ConfVars.HIVE_TXN_STATS_ENABLED);
>   configureSSL(conf);
>   Properties propsFromConf = getDataSourceProps(conf);
>   boolean propsChanged = !propsFromConf.equals(prop);
>   if (propsChanged) {
> if (pmf != null){
>   clearOutPmfClassLoaderCache(pmf);
>   if (!forTwoMetastoreTesting) {
> // close the underlying connection pool to avoid leaks
> pmf.close();
>   }
> }
> pmf = null;
> prop = null;
>   }
>   assert(!isActiveTransaction());
>   shutdown();
>   // Always want to re-create pm as we don't know if it were created by 
> the
>   // most recent instance of the pmf
>   pm = null;
>   directSql = null;
>   expressionProxy = null;
>   openTrasactionCalls = 0;
>   currentTransaction = null;
>   transactionStatus = TXN_STATUS.NO_STATE;
>   initialize(propsFromConf);
>   String partitionValidationRegex =
>   MetastoreConf.getVar(this.conf, 
> ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>   if (partitionValidationRegex != null && 
> !partitionValidationRegex.isEmpty()) {
> partitionValidationPattern = 
> Pattern.compile(partitionValidationRegex);
>   } else {
> partitionValidationPattern = null;
>   }
>   // Note, if metrics have not been initialized

[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader

2018-11-26 Thread Gopal V (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699537#comment-16699537
 ] 

Gopal V commented on HIVE-20932:


[~bslim]: LGTM - +1 

minor nit: there's a new array list allocation for each loop, which seems 
somewhat of a GC thrash for no good reason.

Making a DruidSerdeRow class extending ArrayList would fix that & make 
it less functional, but more allocation friendly.
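
The allocation point can be illustrated with a small sketch. The patch itself is not shown in this thread, so everything here is an assumption: ReusableRow is a hypothetical stand-in for the suggested DruidSerdeRow, and the record layout (two long columns) is invented for the demo. The row object is allocated once outside the loop and refilled in place on each iteration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reusable row buffer; the name and API are illustrative only.
final class ReusableRow extends ArrayList<Object> {
    ReusableRow(int expectedColumns) {
        super(expectedColumns);
    }

    // Overwrites the previous contents in place instead of allocating a new list.
    ReusableRow fill(Object... values) {
        clear();                     // keeps the backing array, drops old elements
        for (Object v : values) {
            add(v);
        }
        return this;
    }
}

final class RowReuseDemo {
    // Sums the first column of every record while reusing a single row object.
    static long sumFirstColumn(List<long[]> records) {
        ReusableRow row = new ReusableRow(2);   // allocated once, outside the loop
        long sum = 0;
        for (long[] rec : records) {
            row.fill(rec[0], rec[1]);           // same instance on every iteration
            sum += (Long) row.get(0);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<long[]> records = new ArrayList<>();
        records.add(new long[] {1, 10});
        records.add(new long[] {2, 20});
        System.out.println(sumFirstColumn(records)); // 3
    }
}
```

The trade-off is the one the comment names: because the row is mutated in place, a caller must not keep a reference to it across iterations. The sketch also still boxes the long values and builds a varargs array per call, so it only removes the per-row list allocation.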

> Vectorize Druid Storage Handler Reader
> --
>
> Key: HIVE-20932
> URL: https://issues.apache.org/jira/browse/HIVE-20932
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, 
> HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, 
> HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.patch
>
>
> This patch aims to add support for vectorized reads of data from Druid into 
> Hive.
> [~t3rmin4t0r] suggested that this will improve the performance of the 
> top-level operators that support vectorization.
> As a first cut, I am just adding a wrapper around the existing Record Reader 
> to read up to 1024 rows at a time. 
> Future work will be to avoid going via the old reader and convert the JSON 
> (Smile format) directly to vector primitive types. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb

2018-11-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20971:
--
Status: Patch Available  (was: Open)

[~vihangk1]: Could you please review?

Thanks,

Peter

> TestJdbcWithDBTokenStore[*] should both use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
> ---
>
> Key: HIVE-20971
> URL: https://issues.apache.org/jira/browse/HIVE-20971
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20971.patch
>
>
> The original intent was to use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb

2018-11-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20971:
--
Attachment: HIVE-20971.patch

> TestJdbcWithDBTokenStore[*] should both use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
> ---
>
> Key: HIVE-20971
> URL: https://issues.apache.org/jira/browse/HIVE-20971
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20971.patch
>
>
> The original intent was to use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699519#comment-16699519
 ] 

Hive QA commented on HIVE-20969:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949524/HIVE-20969.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15539 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15060/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15060/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15060/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949524 - PreCommit-HIVE-Build

> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20969.2.patch, HIVE-20969.patch
>
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}
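
The path in the trace above ends in /_spark_session_dir/0/, which suggests that two sessions resolved to the same upload directory. The thread does not show how the session ID is generated or what the attached patch changes, so the following is only a sketch under a stated assumption: a counter-style ID that restarts at 0 in every process. The names SessionDirNames, counterBasedDir and uuidBasedDir are hypothetical and are not Hive code. The sketch contrasts that scheme with a UUID-based directory name, for which collisions between processes are extremely unlikely.

```java
import java.util.UUID;

// Hypothetical helper, not taken from the Hive code base.
final class SessionDirNames {
    // Scratch root copied from the path shown in the stack trace.
    private static final String SCRATCH_ROOT = "/tmp/hive/_spark_session_dir";

    private SessionDirNames() {
    }

    // Assumed problematic scheme: a per-process counter. Two processes that
    // both start counting at 0 produce the same directory and can overwrite
    // or delete each other's uploads.
    static String counterBasedDir(int counter) {
        return SCRATCH_ROOT + "/" + counter;
    }

    // Alternative scheme: a random UUID per session, so directory names do
    // not depend on any per-process state.
    static String uuidBasedDir() {
        return SCRATCH_ROOT + "/" + UUID.randomUUID();
    }
}
```

Calling SessionDirNames.counterBasedDir(0) from two separate processes yields the identical path, while each call to SessionDirNames.uuidBasedDir() yields a fresh one.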



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb

2018-11-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-20971:
-


> TestJdbcWithDBTokenStore[*] should both use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
> ---
>
> Key: HIVE-20971
> URL: https://issues.apache.org/jira/browse/HIVE-20971
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> The original intent was to use 
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases




[jira] [Commented] (HIVE-20955) Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException

2018-11-26 Thread Vineet Garg (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699468#comment-16699468
 ] 

Vineet Garg commented on HIVE-20955:


[~bslim] I tried adding this test to druidmini_expressions, but I am unable to 
run the TestMiniDruidCliDriver tests on my machine. I am running into this error:
{noformat}
[ERROR] 
testCliDriver[druidmini_expressions](org.apache.hadoop.hive.cli.TestMiniDruidCliDriver)
  Time elapsed: 9.077 s  <<< FAILURE!
java.lang.AssertionError: Failed during initFromDatasets processLine with code=2
at org.junit.Assert.fail(Assert.java:88)
at org.apache.hadoop.hive.ql.QTestUtil.initDataset(QTestUtil.java:1110)
at 
org.apache.hadoop.hive.ql.QTestUtil.initDataSetForTest(QTestUtil.java:1091)
at org.apache.hadoop.hive.ql.QTestUtil.cliInit(QTestUtil.java:1148)
at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:182)
at 
org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
at 
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver(TestMiniDruidCliDriver.java:60)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
{noformat}



> Calcite Rule HiveExpandDistinctAggregatesRule seems throwing 
> IndexOutOfBoundsException
> --
>
> Key: HIVE-20955
> URL: https://issues.apache.org/jira/browse/HIVE-20955
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: slim bouguerra
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-20955.1.patch
>
>
>  
> Adding the following query to the Druid test  
> ql/src/test/queries/clientpositive/druidmini_expressions.q
> {code}
> select count(distinct `__time`, cint) from (select * from 
> druid_table_alltypesorc) as src;
> {code}
> leads to the error {code}2018-11-21T07:36:39,449 ERROR [main] QTestUtil: Client 
> execution failed with error code = 4 running "{code}
> with the exception stack:
> {code}
> 2018-11-21T07:36:39,443 ERROR [ecd48683-0286-4cb4-b0ad-e150fab51038 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> java.lang.IndexOutOfBoundsException: index (1) must be less than size (1)
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:293)
>  ~[guava-19.0.jar:?]
>  at 
> com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:41)
>  ~[guava-19.0.jar:?]
>  at 
> org.apache.calcite.rel.metadata.RelMdColumnOrigins.getColumnOrigins(RelMdColumnOrigins.java:77)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins_$(Unknown Source) 
> ~[?:?]
>  at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins(Unknown Source) 
> ~[?:?]
>  at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getColumnOrigins(RelMetadataQuery.java:345)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveExpandDistinctAggregatesRule.onMatch(HiveExpandDistinctAggregatesRule.java:168)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:315)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>  ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) 
> ~[calcite-core-1.17.0.jar:1.17.0]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2363)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2314)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>  at 
>

[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699470#comment-16699470
 ] 

Hive QA commented on HIVE-20969:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
37s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15060/dev-support/hive-personality.sh
 |
| git revision | master / 0fee288 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15060/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20969.2.patch, HIVE-20969.patch
>
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
>

[jira] [Updated] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction

2018-11-26 Thread Jesus Camacho Rodriguez (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20775:
---
Attachment: HIVE-20775.06.patch

> Factor cost of each SJ reduction when costing a follow-up reduction
> ---
>
> Key: HIVE-20775
> URL: https://issues.apache.org/jira/browse/HIVE-20775
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, 
> HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, 
> HIVE-20775.06.patch, HIVE-20775.patch
>
>
> Currently, while costing the SJ in a plan, the stats of a TS that is 
> reduced by a SJ are not adjusted after we have decided to keep a SJ in the 
> tree. Ideally, we could adjust the stats to take into account decisions that 
> have already been made.




[jira] [Updated] (HIVE-20740) Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Vihang Karajgaonkar (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-20740:
---
Attachment: HIVE-20740.13.patch

> Remove global lock in ObjectStore.setConf method
> 
>
> Key: HIVE-20740
> URL: https://issues.apache.org/jira/browse/HIVE-20740
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, 
> HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, 
> HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, 
> HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch
>
>
> The ObjectStore#setConf method has a global lock which can block other 
> clients in concurrent workloads.
> {code}
>   @Override
>   @SuppressWarnings("nls")
>   public void setConf(Configuration conf) {
>     // Although an instance of ObjectStore is accessed by one thread, there may
>     // be many threads with ObjectStore instances. So the static variables
>     // pmf and prop need to be protected with locks.
>     pmfPropLock.lock();
>     try {
>       isInitialized = false;
>       this.conf = conf;
>       this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, ConfVars.HIVE_TXN_STATS_ENABLED);
>       configureSSL(conf);
>       Properties propsFromConf = getDataSourceProps(conf);
>       boolean propsChanged = !propsFromConf.equals(prop);
>       if (propsChanged) {
>         if (pmf != null){
>           clearOutPmfClassLoaderCache(pmf);
>           if (!forTwoMetastoreTesting) {
>             // close the underlying connection pool to avoid leaks
>             pmf.close();
>           }
>         }
>         pmf = null;
>         prop = null;
>       }
>       assert(!isActiveTransaction());
>       shutdown();
>       // Always want to re-create pm as we don't know if it were created by the
>       // most recent instance of the pmf
>       pm = null;
>       directSql = null;
>       expressionProxy = null;
>       openTrasactionCalls = 0;
>       currentTransaction = null;
>       transactionStatus = TXN_STATUS.NO_STATE;
>       initialize(propsFromConf);
>       String partitionValidationRegex =
>           MetastoreConf.getVar(this.conf, ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>       if (partitionValidationRegex != null && !partitionValidationRegex.isEmpty()) {
>         partitionValidationPattern = Pattern.compile(partitionValidationRegex);
>       } else {
>         partitionValidationPattern = null;
>       }
>       // Note, if metrics have not been initialized this will return null, which means we aren't
>       // using metrics.  Thus we should always check whether this is non-null before using.
>       MetricRegistry registry = Metrics.getRegistry();
>       if (registry != null) {
>         directSqlErrors = Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS);
>       }
>       this.batchSize = MetastoreConf.getIntVar(conf, ConfVars.RAWSTORE_PARTITION_BATCH_SIZE);
>       if (!isInitialized) {
>         throw new RuntimeException(
>             "Unable to create persistence manager. Check dss.log for details");
>       } else {
>         LOG.debug("Initialized ObjectStore");
>       }
>     } finally {
>       pmfPropLock.unlock();
>     }
>   }
> {code}
> The {{pmfPropLock}} is a static object and it disallows any other new 
> connection to HMS which is trying to instantiate ObjectStore. We should 
> either remove the lock or reduce the scope of the lock so that it is held for 
> a very small amount of time.
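
The second option in the description, reducing the scope of the lock, can be sketched as follows. This is not the attached patch and not the ObjectStore code: NarrowLockStore, getOrCreateFactory and the plain Object standing in for the pmf are hypothetical, and java.util.Properties replaces Hive's Configuration to keep the example self-contained. The idea shown is that only the check-and-replace of the shared static state runs under the lock, while per-instance setup runs outside it.

```java
import java.util.Properties;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a store with shared static state.
final class NarrowLockStore {
    private static final Lock SHARED_LOCK = new ReentrantLock();
    private static Properties sharedProps;   // guarded by SHARED_LOCK
    private static Object sharedFactory;     // guarded by SHARED_LOCK; stands in for pmf
    private static int factoryCreations;     // guarded by SHARED_LOCK

    private Properties conf;                 // per-instance, single-threaded access
    private Object factory;                  // per-instance view of the shared factory

    void setConf(Properties newConf) {
        // Per-instance work: no lock is needed because each instance is
        // used by a single thread.
        Properties snapshot = (Properties) newConf.clone();
        this.conf = snapshot;
        // Only the access to the shared static state takes the lock.
        this.factory = getOrCreateFactory(snapshot);
        // Any further per-instance initialization would run here, unlocked.
    }

    private static Object getOrCreateFactory(Properties props) {
        SHARED_LOCK.lock();
        try {
            if (sharedFactory == null || !props.equals(sharedProps)) {
                sharedProps = props;
                sharedFactory = new Object();
                factoryCreations++;
            }
            return sharedFactory;
        } finally {
            SHARED_LOCK.unlock();
        }
    }

    static int factoryCreations() {
        SHARED_LOCK.lock();
        try {
            return factoryCreations;
        } finally {
            SHARED_LOCK.unlock();
        }
    }

    Object factory() {
        return factory;
    }
}
```

With this shape, two instances configured with equal properties share one factory, and a new client is blocked only for the duration of the short comparison and assignment rather than for the whole initialization.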




[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699448#comment-16699448
 ] 

Hive QA commented on HIVE-20440:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949511/HIVE-20440.13.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15543 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=195)

[druidmini_test_ts.q,druidmini_expressions.q,druid_timestamptz2.q,druidmini_test_alter.q,druidkafkamini_csv.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15059/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15059/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15059/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949511 - PreCommit-HIVE-Build

> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, 
> HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, 
> HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, 
> HIVE-20440.12.patch, HIVE-20440.13.patch
>
>
> Enhance the SmallTableCache, to use guava cache with soft references, so that 
> we evict when there is memory pressure.
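
The description names a Guava cache with soft references; in Guava that is typically configured through CacheBuilder.newBuilder().softValues(). The sketch below is not the patch and does not use Guava. It uses only java.lang.ref.SoftReference from the JDK to illustrate the same eviction idea, and the class name SoftValueCache and its get method are hypothetical. Values are held through soft references, so the garbage collector may reclaim them under memory pressure, and a reclaimed entry is simply reloaded on the next access.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical minimal cache whose values can be reclaimed by the GC.
final class SoftValueCache<K, V> {
    private final ConcurrentMap<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    // Returns the cached value, or loads and caches it if the entry is
    // missing or its referent has been cleared by the garbage collector.
    // Two threads racing on the same key may both run the loader; the
    // sketch accepts that in exchange for simplicity.
    V get(K key, Function<? super K, ? extends V> loader) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    // Number of keys currently tracked, including entries whose referent
    // may already have been cleared.
    int size() {
        return entries.size();
    }
}
```

A caller would use it as cache.get(tableName, name -> loadTable(name)), where loadTable is whatever expensive load the caller already has. As long as the caller holds a strong reference to the returned value, the entry cannot be cleared.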




[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Sahil Takiar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699412#comment-16699412
 ] 

Sahil Takiar commented on HIVE-20969:
-

+1 LGTM pending tests.

> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20969.2.patch, HIVE-20969.patch
>
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}




[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699399#comment-16699399
 ] 

Hive QA commented on HIVE-20440:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
50s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} ql: The patch generated 0 new + 54 unchanged - 2 
fixed = 54 total (was 56) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
50s{color} | {color:green} ql generated 0 new + 2311 unchanged - 1 fixed = 2311 
total (was 2312) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} hive-unit in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15059/dev-support/hive-personality.sh
 |
| git revision | master / 0fee288 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15059/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| modules | C: ql itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15059/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch,

[jira] [Updated] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20969:
--
Status: Patch Available  (was: In Progress)

Fixed test case

> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20969.2.patch, HIVE-20969.patch
>
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}




[jira] [Updated] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20969:
--
Attachment: HIVE-20969.2.patch

> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20969.2.patch, HIVE-20969.patch
>
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}




[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Peter Vary (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699373#comment-16699373
 ] 

Peter Vary commented on HIVE-20969:
---

[~stakiar]: Thanks! Exactly my thoughts. I arrived at a similar conclusion after 
some code digging. See the attached proposed patch.

What do you think?

> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20969.patch
>
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}




[jira] [Work started] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-20969 started by Peter Vary.
-
> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}




[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699361#comment-16699361
 ] 

Hive QA commented on HIVE-20954:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949508/HIVE-20954.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 42 failed/errored test(s), 15542 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=197)
[druidmini_masking.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testACIDwithSchemaEvolutionAndCompaction
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAcidOrcWritePreservesFieldNames
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAcidWithSchemaEvolution
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAlterTable
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketCodec
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketizedInputFormat
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testCleanerForTxnToWriteId
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testCompactWithDelete
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDeleteIn
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge2
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testETLSplitStrategyForACID
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testEmptyInTblproperties
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testFailHeartbeater
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testFileSystemUnCaching
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInitiatorWithMultipleFailedCompactions
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwrite1
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwrite2
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwriteWithSelfJoin
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge2
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge3
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMergeWithPredicate
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMmTableCompaction
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsert
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsertStatement
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNoHistory
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidInsert
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion02
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion1
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion2
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion3
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOpenTxnsCounter
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOrcNoPPD
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOrcPPD
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOriginalFileReaderWhenNonAcidConvertedToAcid
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testUpdateMixedCase
 (batchId=320)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testValidTxnsBookkeeping
 (batchId=320)

[jira] [Updated] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20969:
--
Attachment: HIVE-20969.patch

> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20969.patch
>
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}




[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Sahil Takiar (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699343#comment-16699343
 ] 

Sahil Takiar commented on HIVE-20969:
-

The intention of HIVE-19008 was to simplify the session id logic in HoS. Before 
HIVE-19008, the HoS session id was a UUID that was completely independent of 
the Hive session id. After HIVE-19008, the HoS session id is a counter that is 
incremented for each new Spark session created for a given Hive session.

{quote} I would assume that it would be good to connect the spark session to 
the hive session in every log message so it would be good if the sparkSessionId 
would contain the hive session id too. {quote}

Adding the hive session id into the spark session id sounds like a reasonable 
idea to me. Logically, that is what HIVE-19008 already does. After HIVE-19008, 
any spark session id is globally identifiable by the Hive session id + Spark 
session id. Again, prior to HIVE-19008 the sparkSessionId was a UUID that was 
independent of the hive session id.
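
A minimal sketch of the idea discussed above, not the actual Hive code or the attached patch: a per-Hive-session counter yields ids that are unique only within one Hive session, while prefixing the Hive session id makes them globally distinguishable. The class name, method names, and the "_" separator are hypothetical; the directory root is taken from the stack trace in the issue description.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper for illustration; not Hive's SparkSession implementation. */
final class SparkSessionIds {
    private final String hiveSessionId;
    // One counter per Hive session, as in the post-HIVE-19008 scheme described above.
    private final AtomicInteger counter = new AtomicInteger();

    SparkSessionIds(String hiveSessionId) {
        this.hiveSessionId = hiveSessionId;
    }

    /** Counter-only id: unique within this Hive session, but "0" repeats across sessions. */
    String nextLocalId() {
        return Integer.toString(counter.getAndIncrement());
    }

    /** Qualified id: Hive session id plus counter (the "_" separator is an assumption). */
    String nextQualifiedId() {
        return hiveSessionId + "_" + counter.getAndIncrement();
    }

    /** Builds the per-session upload directory under the given root. */
    static String sessionDir(String root, String sparkSessionId) {
        return root + "/" + sparkSessionId;
    }
}
```

With counter-only ids, two Hive sessions would both resolve to a directory ending in `/0`, which matches the path seen in the observed exception; the qualified form keeps their directories apart.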

> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}




[jira] [Commented] (HIVE-20963) Handle C-Style comments in hive query

2018-11-26 Thread Alan Gates (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699341#comment-16699341
 ] 

Alan Gates commented on HIVE-20963:
---

Zoltan is correct, // is not standard SQL.  And Hive does support /* */ style, 
as can be seen from some of the unit tests that use them, e.g. comment.q

> Handle C-Style comments in hive query
> -
>
> Key: HIVE-20963
> URL: https://issues.apache.org/jira/browse/HIVE-20963
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Shubhangi Pardeshi
>Priority: Major
>
> h3. Problem
> Currently only the standard SQL comment style, i.e. "--", can be used in a query. 
> Requesting support for C-style single-line as well as multiline 
> comments. 
> 1. /*  */
> 2. /* 
>  */
> 3. //  
>  




[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699304#comment-16699304
 ] 

Hive QA commented on HIVE-20954:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
46s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
36s{color} | {color:red} ql: The patch generated 9 new + 22 unchanged - 1 fixed 
= 31 total (was 23) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15058/dev-support/hive-personality.sh
 |
| git revision | master / 0fee288 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15058/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15058/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, 
> HIVE-20954.3.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data

[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699254#comment-16699254
 ] 

Hive QA commented on HIVE-20794:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949506/HIVE-20794.06

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15629 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15057/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15057/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15057/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949506 - PreCommit-HIVE-Build

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, 
> HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06
>
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper.
>  # startZookeeperClient
>  # addServerInstanceToZooKeeper
>  # removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's list of Hiveserver2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed naming convention used for other parameters.
>  ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
>  ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
>  ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT 
> (hive.metastore.zookeeper.connection.timeout)
>  ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
>  ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
>  # Additional configuration THRIFT_BIND_HOST is used to specify the host 
> address to bind Metastore service to. Right now Metastore binds to *, i.e all 
> addresses. Metastore doesn't then know which of those addresses it should add 
> to the ZooKeeper. THRIFT_BIND_HOST solves that problem. When this 
> configuration is specified the metastore server binds to that address and 
> also adds it to the ZooKeeper if dynamic service discovery mode is ZooKeeper.
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
>  # HIVE_ZOOKEEPER_SESSION_TIMEOUT
>  # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
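
The discovery-mode switch described under "HiveMetaStore conf changes" can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the patch: the class and method names are hypothetical, and the ZooKeeper lookup is passed in as a plain function so that no real ZooKeeper API is implied.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical resolver; names are illustrative and do not come from the patch. */
final class MetastoreUriResolver {
    private MetastoreUriResolver() {}

    /**
     * @param uris          value of hive.metastore.uris
     * @param discoveryMode value of hive.metastore.service.discovery.mode
     * @param zkLookup      assumed callback that returns the metastore URIs
     *                      registered under the given ZooKeeper quorum
     */
    static List<String> resolve(String uris, String discoveryMode,
                                Function<String, List<String>> zkLookup) {
        if ("zookeeper".equals(discoveryMode)) {
            // In zookeeper mode the configured URIs are the ZooKeeper quorum.
            return zkLookup.apply(uris);
        }
        // Otherwise the URIs locate the metastore instances directly.
        return Arrays.stream(uris.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }
}
```

Either way the caller ends up with a list of metastore URIs, so the existing logic for choosing a server to connect to can stay unchanged.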
> h3. HiveMetaStore class changes
>  # startMetaStore should also register the instance with Zookeeper, when 
> configured.
>  # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
>  # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS i.e. use the existing mechanisms to choose a metastore server to 
> connect to and

[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699233#comment-16699233
 ] 

Hive QA commented on HIVE-20794:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
54s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
31s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
13s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
2s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
38s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} itests/util in master has 48 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
21s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 8s{color} | {color:green} The patch standalone-metastore passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} The patch metastore-common passed checkstyle {color} 
|
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} The patch common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} The patch metastore-server passed checkstyle {color} 
|
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} ql: The patch generated 0 new + 17 unchanged - 4 
fixed = 17 total (was 21) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} The patch service passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} The patch util passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
44s{color} | {color:red} service generated 1 new + 48 unchanged - 0 fixed = 49 
total (was 48) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
20s{color} | {color:green} the patch passed {color} |
|| || || ||

[jira] [Updated] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Antal Sinkovits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-20440:
---
Attachment: HIVE-20440.13.patch

> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, 
> HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, 
> HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, 
> HIVE-20440.12.patch, HIVE-20440.13.patch
>
>
> Enhance the SmallTableCache, to use guava cache with soft references, so that 
> we evict when there is memory pressure.




[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Antal Sinkovits (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699154#comment-16699154
 ] 

Antal Sinkovits commented on HIVE-20440:


Test failure not related. Uploading again.

> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, 
> HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, 
> HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, 
> HIVE-20440.12.patch, HIVE-20440.13.patch
>
>
> Enhance the SmallTableCache, to use guava cache with soft references, so that 
> we evict when there is memory pressure.




[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699140#comment-16699140
 ] 

Hive QA commented on HIVE-20440:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949494/HIVE-20440.12.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15548 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] 
(batchId=61)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15056/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15056/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15056/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949494 - PreCommit-HIVE-Build

> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, 
> HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, 
> HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, 
> HIVE-20440.12.patch
>
>
> Enhance the SmallTableCache, to use guava cache with soft references, so that 
> we evict when there is memory pressure.




[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-26 Thread Teddy Choi (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699066#comment-16699066
 ] 

Teddy Choi commented on HIVE-20954:
---

I can't reproduce it on my laptop. So I'm uploading it again to trigger a build.

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkObjectHashOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | partitionColumnNums: [16] |
> | valueColumnNums: [14]  |
> ++
> |  Explain   |
> ++
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkLongOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | valueColumnNums: [14]  |
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> | Execution mode: vectorized, llap   |
> {code}




[jira] [Updated] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95

2018-11-26 Thread Teddy Choi (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20954:
--
Attachment: HIVE-20954.3.patch

> Vector RS operator is not using uniform hash function for TPC-DS query 95
> -
>
> Key: HIVE-20954
> URL: https://issues.apache.org/jira/browse/HIVE-20954
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, 
> HIVE-20954.3.patch
>
>
> Distribution of rows is skewed in DHJ causing slowdown.
> Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator 
> and VectorReduceSinkLongOperator.
> {code}
> | Select Operator|
> |   expressions: ws_warehouse_sk (type: bigint), 
> ws_order_number (type: bigint) |
> |   outputColumnNames: _col0, _col1 |
> |   Select Vectorization:|
> |   className: VectorSelectOperator |
> |   native: true |
> |   projectedOutputColumnNums: [14, 16] |
> |   Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkObjectHashOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | partitionColumnNums: [16] |
> | valueColumnNums: [14]  |
> ++
> |  Explain   |
> ++
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> |   Reduce Output Operator   |
> | key expressions: _col1 (type: bigint) |
> | sort order: +  |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Reduce Sink Vectorization: |
> | className: VectorReduceSinkLongOperator |
> | keyColumnNums: [16]|
> | native: true   |
> | nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true |
> | valueColumnNums: [14]  |
> | Statistics: Num rows: 7199963324 Data size: 
> 115185006696 Basic stats: COMPLETE Column stats: COMPLETE |
> | value expressions: _col0 (type: bigint) |
> | Execution mode: vectorized, llap   |
> {code}




[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Ashutosh Bapat (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Attachment: HIVE-20794.06
Status: Patch Available  (was: In Progress)

Patch fixing checkstyle and findbugs errors from the previous runs.

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, 
> HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06
>
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper.
>  # startZookeeperClient
>  # addServerInstanceToZooKeeper
>  # removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's list of Hiveserver2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed naming convention used for other parameters.
>  ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
>  ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
>  ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT 
> (hive.metastore.zookeeper.connection.timeout)
>  ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
>  ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
>  # Additional configuration THRIFT_BIND_HOST is used to specify the host 
> address to bind Metastore service to. Right now Metastore binds to *, i.e all 
> addresses. Metastore doesn't then know which of those addresses it should add 
> to the ZooKeeper. THRIFT_BIND_HOST solves that problem. When this 
> configuration is specified the metastore server binds to that address and 
> also adds it to the ZooKeeper if dynamic service discovery mode is ZooKeeper.
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
>  # HIVE_ZOOKEEPER_SESSION_TIMEOUT
>  # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
>  # startMetaStore should also register the instance with Zookeeper, when 
> configured.
>  # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
>  # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS i.e. use the existing mechanisms to choose a metastore server to 
> connect to and establish a connection.
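
The client-side rule described above (URIs used as a ZooKeeper quorum when the discovery mode is "zookeeper", and as direct metastore addresses when it is empty) could be sketched roughly as follows. This is an illustration under assumptions, not code from the patch: `MetastoreUriResolver`, `resolve`, and the `zkLookup` callback (standing in for the actual ZooKeeper query) are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper; names are illustrative and not taken from the patch. */
final class MetastoreUriResolver {
  /**
   * @param uris          value of hive.metastore.uris (comma-separated)
   * @param discoveryMode value of hive.metastore.service.discovery.mode
   * @param zkLookup      stand-in for a ZooKeeper query that maps a quorum
   *                      string to the metastore URIs registered there
   */
  static List<String> resolve(String uris, String discoveryMode,
                              Function<String, List<String>> zkLookup) {
    if ("zookeeper".equals(discoveryMode)) {
      // The configured URIs name the ZooKeeper quorum, not the metastores.
      return zkLookup.apply(uris);
    }
    // Empty mode: the URIs point at metastore instances directly.
    return Arrays.asList(uris.split(","));
  }
}
```

Either way the result is a plain list of metastore URIs, so the existing logic for choosing a server and connecting would not need to know where the list came from.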




[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Ashutosh Bapat (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20794:
--
Status: In Progress  (was: Patch Available)

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, 
> HIVE-20794.03, HIVE-20794.04, HIVE-20794.05
>
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. Following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper.
>  # startZookeeperClient
>  # addServerInstanceToZooKeeper
>  # removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's list of Hiveserver2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed naming convention used for other parameters.
>  ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
>  ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
>  ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT 
> (hive.metastore.zookeeper.connection.timeout)
>  ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
>  ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
>  # Additional configuration THRIFT_BIND_HOST is used to specify the host 
> address to bind Metastore service to. Right now Metastore binds to *, i.e all 
> addresses. Metastore doesn't then know which of those addresses it should add 
> to the ZooKeeper. THRIFT_BIND_HOST solves that problem. When this 
> configuration is specified the metastore server binds to that address and 
> also adds it to the ZooKeeper if dynamic service discovery mode is ZooKeeper.
> Following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
>  # HIVE_ZOOKEEPER_SESSION_TIMEOUT
>  # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
> h3. HiveMetaStore class changes
>  # startMetaStore should also register the instance with Zookeeper, when 
> configured.
>  # When shutting a metastore server down it should deregister itself from 
> Zookeeper, when configured.
>  # These changes use the refactored code described above.
> h3. HiveMetaStoreClient class changes
> When service discovery mode is zookeeper, we fetch the metastore URIs from 
> the specified ZooKeeper and treat those as if they were specified in 
> THRIFT_URIS i.e. use the existing mechanisms to choose a metastore server to 
> connect to and establish a connection.




[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699043#comment-16699043
 ] 

Hive QA commented on HIVE-20440:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
43s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} ql: The patch generated 0 new + 54 unchanged - 2 
fixed = 54 total (was 56) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
50s{color} | {color:green} ql generated 0 new + 2311 unchanged - 1 fixed = 2311 
total (was 2312) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} hive-unit in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15056/dev-support/hive-personality.sh
 |
| git revision | master / 0fee288 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15056/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| modules | C: ql itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15056/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch,

[jira] [Updated] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Antal Sinkovits (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-20440:
---
Attachment: HIVE-20440.12.patch

> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, 
> HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, 
> HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, 
> HIVE-20440.12.patch
>
>
> Enhance the SmallTableCache, to use guava cache with soft references, so that 
> we evict when there is memory pressure.




[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache

2018-11-26 Thread Antal Sinkovits (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698914#comment-16698914
 ] 

Antal Sinkovits commented on HIVE-20440:


Rebase

> Create better cache eviction policy for SmallTableCache
> ---
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, 
> HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, 
> HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, 
> HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, 
> HIVE-20440.12.patch
>
>
> Enhance the SmallTableCache, to use guava cache with soft references, so that 
> we evict when there is memory pressure.




[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Peter Vary (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698807#comment-16698807
 ] 

Peter Vary commented on HIVE-20969:
---

My current theory is that HIVE-19008 changed sparkSessionId generation which 
affected scratchDir creation.

[~stakiar]: Could you help me out here? What was the original intention? I 
would assume that it would be good to connect the Spark session to the Hive 
session in every log message, so the sparkSessionId should ideally contain the 
Hive session id too. Otherwise, when we have multiple HoS queries running on 
the same HS2 instance, we will have a hard time differentiating between the 
multiple Spark sessions with id="1".

[~ngangam]: Any thoughts on this?
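
A rough sketch of the suggestion above, assuming the problem is that different Hive sessions can end up with the same numeric Spark session id (the stack trace in the description shows a session directory named `0`). `SparkSessionIds`, `withHiveSession`, and `sessionDir` are hypothetical names, and the `_spark_session_dir` layout is only inferred from the path in the exception. This is not the actual Hive code.

```java
/** Hypothetical sketch of the suggestion; not the actual Hive session code. */
final class SparkSessionIds {
  /** Builds a Spark session id that embeds the owning Hive session id. */
  static String withHiveSession(String hiveSessionId, int sparkSessionSeq) {
    return hiveSessionId + "_" + sparkSessionSeq;
  }

  /** Directory for a Spark session, mirroring the path seen in the trace. */
  static String sessionDir(String scratchRoot, String sparkSessionId) {
    return scratchRoot + "/_spark_session_dir/" + sparkSessionId;
  }
}
```

With ids built this way, two Hive sessions that both start their Spark session counter at 0 would still upload to different directories, and log lines carrying the id could be traced back to the Hive session.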

 

> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}




[jira] [Commented] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698778#comment-16698778
 ] 

Hive QA commented on HIVE-20760:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949463/HIVE-20760.8.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 15500 tests 
executed
*Failed tests:*
{noformat}
TestCompactor - did not produce a TEST-*.xml file (likely timed out) 
(batchId=244)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] 
(batchId=182)
org.apache.hadoop.hive.ql.TestTxnCommandsForMmTable.testOperationsOnCompletedTxnComponentsForMmTable
 (batchId=284)
org.apache.hadoop.hive.ql.TestTxnCommandsForOrcMmTable.testOperationsOnCompletedTxnComponentsForMmTable
 (batchId=306)
org.apache.hadoop.hive.ql.TestTxnConcatenate.testConcatenateMM (batchId=293)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15055/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15055/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15055/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949463 - PreCommit-HIVE-Build

> Reducing memory overhead due to multiple HiveConfs
> --
>
> Key: HIVE-20760
> URL: https://issues.apache.org/jira/browse/HIVE-20760
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Barnabas Maidics
>Assignee: Barnabas Maidics
>Priority: Major
> Attachments: HIVE-20760-1.patch, HIVE-20760-2.patch, 
> HIVE-20760-3.patch, HIVE-20760.4.patch, HIVE-20760.5.patch, 
> HIVE-20760.6.patch, HIVE-20760.7.patch, HIVE-20760.8.patch, HIVE-20760.patch, 
> hiveconf_interned.html, hiveconf_original.html
>
>
> The issue is that every Hive task has to load its own version of 
> {{HiveConf}}. When running with a large number of cores per executor (HoS), 
> there is a significant (~10%) amount of memory wasted due to this 
> duplication. 
> I looked into the problem and found a way to reduce the overhead caused by 
> the multiple HiveConf objects.
> I've created an implementation of Properties, somewhat similar to 
> CopyOnFirstWriteProperties. CopyOnFirstWriteProperties can't be used to solve 
> this problem, because it drops the interned Properties right after we add a 
> new property.
> So my implementation looks like this:
>  * When we create a new HiveConf from an existing one (copy constructor), we 
> change the properties object stored by HiveConf to the new Properties 
> implementation (HiveConfProperties). We have two possible ways to do this. 
> Either we change the visibility of the properties field in the ancestor class 
> (Configuration which comes from hadoop) to protected, or a simpler way is to 
> just change the type using reflection.
>  * HiveConfProperties immediately interns the given properties. After this, 
> every time we add a new property to HiveConf, we add it to an additional 
> Properties object. This way if we create multiple HiveConf with the same base 
> properties, they will use the same Properties object but each session/task 
> can add its own unique properties.
>  * Getting a property from HiveConfProperties would look like this (the 
> non-interned properties are stored in the superclass):
>      String property = super.getProperty(key);
>      if (property == null) property = interned.getProperty(key);
>      return property;
> Running some tests showed that the interning works (with 50 connections to 
> HiveServer2, heapdumps created after sessions are created for queries): 
> Overall memory:
>          original: 34,599K              interned: 20,582K
> Retained memory of HiveConfs:
>         original: 16,366K               interned: 10,804K
> I attach the JXray reports about the heapdumps.
> What are your thoughts about this solution? 
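
As a self-contained illustration of the lookup described above, the following sketch layers per-instance properties over a shared base. It is not the attached patch: `OverlayProperties` is a hypothetical name standing in for HiveConfProperties, and the "interning" step is reduced to sharing one base `Properties` object between instances.

```java
import java.util.Properties;

/** Hypothetical sketch of the overlay idea; not the attached patch. */
final class OverlayProperties extends Properties {
  /** Shared base properties, treated as read-only by this class. */
  private final Properties interned;

  OverlayProperties(Properties interned) {
    this.interned = interned;
  }

  @Override
  public String getProperty(String key) {
    // Properties set on this instance live in the superclass and win.
    String property = super.getProperty(key);
    if (property == null) {
      // Fall back to the shared base for everything not overridden.
      property = interned.getProperty(key);
    }
    return property;
  }
}
```

Calling `setProperty` on one instance writes only to that instance, so many configurations can share one base object while each session or task keeps its own overrides.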




[jira] [Assigned] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS

2018-11-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-20969:
-


> HoS sessionId generation can cause race conditions when uploading files to 
> HDFS
> ---
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Affects Versions: 4.0.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Priority: Major
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) 
> [Lease.  Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}




[jira] [Commented] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698693#comment-16698693
 ] 

Hive QA commented on HIVE-20760:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
30s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} common: The patch generated 3 new + 426 unchanged - 0 
fixed = 429 total (was 426) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
38s{color} | {color:red} common generated 3 new + 65 unchanged - 0 fixed = 68 
total (was 65) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:common |
|  |  org.apache.hadoop.hive.common.HiveConfProperties.clone() does not call 
super.clone()  At HiveConfProperties.java:[line 260] |
|  |  Inconsistent synchronization of 
org.apache.hadoop.hive.common.HiveConfProperties.interned; locked 70% of time  
Unsynchronized access at HiveConfProperties.java:[line 108] |
|  |  org.apache.hadoop.hive.common.HiveConfProperties.getProperty(String, 
String) is unsynchronized, 
org.apache.hadoop.hive.common.HiveConfProperties.setProperty(String, String) is 
synchronized  At HiveConfProperties.java:[lines 123-130] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15055/dev-support/hive-personality.sh
 |
| git revision | master / 0fee288 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15055/yetus/diff-checkstyle-common.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15055/yetus/new-findbugs-common.html
 |
| modules | C: common U: common |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15055/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Reducing memory overhead due to multiple HiveConfs
> --
>
> Key: HIVE-20760
> URL: https://issues.apache.org/jira/browse/HIVE-20760
> Project: Hive
> Issue Type: Improvement
> Components: Configuration
> Reporter: Barnabas Maidics
> Assignee: Barnabas Maidics
> Priority: Major
> Attachments: HIVE-20760-1.patch, HIVE-20760-2.patch, 
> HIVE-20760-3.patch, HIVE-20760.4.patch, HIVE-20760.5.patch, 
> HIVE-20760.6.patch, HIVE-20760.7.patch, HIVE-20760.8.patch, HIVE-20760.patch, 
>

[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698682#comment-16698682
 ] 

Hive QA commented on HIVE-20794:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949455/HIVE-20794.05

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15624 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=195)

[druidmini_test_ts.q,druidmini_expressions.q,druid_timestamptz2.q,druidmini_test_alter.q,druidkafkamini_csv.q]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit]
 (batchId=171)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15054/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15054/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15054/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949455 - PreCommit-HIVE-Build

> Use Zookeeper for metastore service discovery
> -
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
> Issue Type: Improvement
> Reporter: Ashutosh Bapat
> Assignee: Ashutosh Bapat
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, 
> HIVE-20794.03, HIVE-20794.04, HIVE-20794.05
>
>
> Right now, multiple metastore services can be specified in 
> hive.metastore.uris configuration, but that list is static and can not be 
> modified dynamically. Use Zookeeper for dynamic service discovery of 
> metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The Zookeeper related code (for service discovery) accesses Zookeeper 
> parameters directly from HiveConf. The class is changed so that it could be 
> used for both HiveServer2 and Metastore server and works with both the 
> configurations. The following methods from HiveServer2 are now moved into 
> ZooKeeperHiveHelper.
>  # startZookeeperClient
>  # addServerInstanceToZooKeeper
>  # removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper 
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE 
> (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are 
> used as ZooKeeper quorum. When it's set to be empty, the URIs are used to 
> locate the metastore directly.
>  # Here's a list of HiveServer2's parameters and their proposed metastore conf 
> counterparts. It looks odd that the Metastore related configurations do not 
> have their macros start with METASTORE, but start with THRIFT. I have just 
> followed the naming convention used for other parameters.
>  ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE 
> (hive.metastore.zookeeper.namespace)
>  ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT 
> (hive.metastore.zookeeper.client.port)
>  ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT 
> (hive.metastore.zookeeper.connection.timeout)
>  ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - 
> THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES 
> (hive.metastore.zookeeper.connection.max.retries)
>  ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - 
> THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME 
> (hive.metastore.zookeeper.connection.basesleeptime)
>  # An additional configuration, THRIFT_BIND_HOST, is used to specify the host 
> address to bind the Metastore service to. Right now Metastore binds to *, i.e. 
> all addresses. Metastore then doesn't know which of those addresses it should 
> add to ZooKeeper. THRIFT_BIND_HOST solves that problem. When this 
> configuration is specified, the metastore server binds to that address and 
> also adds it to ZooKeeper if the dynamic service discovery mode is ZooKeeper.
> The following Hive ZK configurations seem to be related to managing locks and 
> seem irrelevant for MS ZK.
>  # HIVE_ZOOKEEPER_SESSION_TIMEOUT
>  # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, 
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
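
The dual meaning of hive.metastore.uris described above can be sketched roughly as follows. This is only an illustration of the rule stated in the description, not code from the attached patches: the class, enum, and method names are hypothetical, a plain Map stands in for the real configuration object, and the case-insensitive comparison of the mode value is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the HIVE-20794 patches.
class MetastoreUriModeSketch {
    // How the value of hive.metastore.uris should be read.
    enum UriMeaning { ZOOKEEPER_QUORUM, METASTORE_ENDPOINTS }

    // Property name taken from the issue description.
    static final String DISCOVERY_MODE = "hive.metastore.service.discovery.mode";

    // "zookeeper" means the URIs name a ZooKeeper quorum; an empty or missing
    // mode means the URIs locate the metastore directly.
    // Assumption: the mode is compared case-insensitively after trimming.
    static UriMeaning interpret(Map<String, String> conf) {
        String mode = conf.getOrDefault(DISCOVERY_MODE, "");
        return "zookeeper".equalsIgnoreCase(mode.trim())
            ? UriMeaning.ZOOKEEPER_QUORUM
            : UriMeaning.METASTORE_ENDPOINTS;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(interpret(conf)); // mode unset

        conf.put(DISCOVERY_MODE, "zookeeper");
        System.out.println(interpret(conf)); // mode set to zookeeper
    }
}
```

With the mode unset the sketch reports METASTORE_ENDPOINTS, and with the mode set to "zookeeper" it reports ZOOKEEPER_QUORUM.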
> h3. HiveMetaStore class changes
>  # startMetaStore should also register the instance with Zookeeper, when 
> configured.
>  # When shutting a metastore server

[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery

2018-11-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698670#comment-16698670
 ] 

Hive QA commented on HIVE-20794:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
47s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
29s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
15s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
5s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
42s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} itests/util in master has 48 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
14s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 8s{color} | {color:green} The patch standalone-metastore passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} The patch metastore-common passed checkstyle {color} 
|
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} The patch common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} The patch metastore-server passed checkstyle {color} 
|
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} ql: The patch generated 0 new + 17 unchanged - 4 
fixed = 17 total (was 21) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} service: The patch generated 3 new + 35 unchanged - 0 
fixed = 38 total (was 35) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} The patch util passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
43s{color} | {color:red} service generated 1 new + 48 unchanged - 0 fixed = 49 
total (was 48) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m  
9s{color} | {color:green} the patch

[jira] [Updated] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs

2018-11-26 Thread Barnabas Maidics (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20760:

Status: Open  (was: Patch Available)

> Reducing memory overhead due to multiple HiveConfs
> --
>
> Key: HIVE-20760
> URL: https://issues.apache.org/jira/browse/HIVE-20760
> Project: Hive
> Issue Type: Improvement
> Components: Configuration
> Reporter: Barnabas Maidics
> Assignee: Barnabas Maidics
> Priority: Major
> Attachments: HIVE-20760-1.patch, HIVE-20760-2.patch, 
> HIVE-20760-3.patch, HIVE-20760.4.patch, HIVE-20760.5.patch, 
> HIVE-20760.6.patch, HIVE-20760.7.patch, HIVE-20760.8.patch, HIVE-20760.patch, 
> hiveconf_interned.html, hiveconf_original.html
>
>
> The issue is that every Hive task has to load its own version of 
> {{HiveConf}}. When running with a large number of cores per executor (HoS), 
> there is a significant (~10%) amount of memory wasted due to this 
> duplication. 
> I looked into the problem and found a way to reduce the overhead caused by 
> the multiple HiveConf objects.
> I've created an implementation of Properties, somewhat similar to 
> CopyOnFirstWriteProperties. CopyOnFirstWriteProperties can't be used to solve 
> this problem, because it drops the interned Properties right after we add a 
> new property.
> So my implementation looks like this:
>  * When we create a new HiveConf from an existing one (copy constructor), we 
> change the properties object stored by HiveConf to the new Properties 
> implementation (HiveConfProperties). We have 2 possible ways to do this. 
> Either we change the visibility of the properties field in the ancestor class 
> (Configuration which comes from hadoop) to protected, or a simpler way is to 
> just change the type using reflection.
>  * HiveConfProperties instantly interns the given properties. After this, 
> every time we add a new property to HiveConf, we add it to an additional 
> Properties object. This way if we create multiple HiveConf with the same base 
> properties, they will use the same Properties object but each session/task 
> can add its own unique properties.
>  * Getting a property from HiveConfProperties would look like this: (I stored 
> the non-interned properties in super class)
>                 String property=super.getProperty(key);
>                 if (property == null) property= interned.getProperty(key);
>                 return property;
> Running some tests showed that the interning works (with 50 connections to 
> HiveServer2, heapdumps created after sessions are created for queries): 
> Overall memory:
>          original: 34,599K              interned: 20,582K
> Retained memory of HiveConfs:
>         original: 16,366K               interned: 10,804K
> I attach the JXray reports about the heapdumps.
> What are your thoughts about this solution? 
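
The lookup order described in the issue (per-instance properties first, then the shared interned base) can be illustrated with a minimal sketch. It is not the HiveConfProperties class from the attached patches: LayeredProperties is a hypothetical name, the shared base is assumed to be treated as read-only once shared, and only the single-argument getProperty is covered. Methods such as keys() or size() would also need to consult the shared base in a complete implementation.

```java
import java.util.Properties;

// Illustrative sketch only; LayeredProperties is a hypothetical name and not
// the HiveConfProperties class from the HIVE-20760 patches.
class LayeredProperties extends Properties {
    // Shared base properties; assumed to be read-only once shared.
    private final Properties interned;

    LayeredProperties(Properties interned) {
        this.interned = interned;
    }

    // Entries added to this instance live in the superclass; if the key is
    // not found there, fall back to the shared base.
    @Override
    public String getProperty(String key) {
        String property = super.getProperty(key);
        if (property == null) {
            property = interned.getProperty(key);
        }
        return property;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("hive.execution.engine", "spark");

        // Two "sessions" share one base object instead of copying it.
        LayeredProperties sessionA = new LayeredProperties(base);
        LayeredProperties sessionB = new LayeredProperties(base);
        sessionA.setProperty("hive.execution.engine", "tez"); // local override

        System.out.println(sessionA.getProperty("hive.execution.engine"));
        System.out.println(sessionB.getProperty("hive.execution.engine"));
    }
}
```

In this sketch sessionA sees its own override while sessionB still reads the shared value, so the base entries are stored once no matter how many instances are created on top of them.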




[jira] [Updated] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs

2018-11-26 Thread Barnabas Maidics (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20760:

Attachment: HIVE-20760.8.patch
Status: Patch Available  (was: Open)

> Reducing memory overhead due to multiple HiveConfs
> --
>
> Key: HIVE-20760
> URL: https://issues.apache.org/jira/browse/HIVE-20760
> Project: Hive
> Issue Type: Improvement
> Components: Configuration
> Reporter: Barnabas Maidics
> Assignee: Barnabas Maidics
> Priority: Major
> Attachments: HIVE-20760-1.patch, HIVE-20760-2.patch, 
> HIVE-20760-3.patch, HIVE-20760.4.patch, HIVE-20760.5.patch, 
> HIVE-20760.6.patch, HIVE-20760.7.patch, HIVE-20760.8.patch, HIVE-20760.patch, 
> hiveconf_interned.html, hiveconf_original.html
>
>
> The issue is that every Hive task has to load its own version of 
> {{HiveConf}}. When running with a large number of cores per executor (HoS), 
> there is a significant (~10%) amount of memory wasted due to this 
> duplication. 
> I looked into the problem and found a way to reduce the overhead caused by 
> the multiple HiveConf objects.
> I've created an implementation of Properties, somewhat similar to 
> CopyOnFirstWriteProperties. CopyOnFirstWriteProperties can't be used to solve 
> this problem, because it drops the interned Properties right after we add a 
> new property.
> So my implementation looks like this:
>  * When we create a new HiveConf from an existing one (copy constructor), we 
> change the properties object stored by HiveConf to the new Properties 
> implementation (HiveConfProperties). We have 2 possible ways to do this. 
> Either we change the visibility of the properties field in the ancestor class 
> (Configuration which comes from hadoop) to protected, or a simpler way is to 
> just change the type using reflection.
>  * HiveConfProperties instantly interns the given properties. After this, 
> every time we add a new property to HiveConf, we add it to an additional 
> Properties object. This way if we create multiple HiveConf with the same base 
> properties, they will use the same Properties object but each session/task 
> can add its own unique properties.
>  * Getting a property from HiveConfProperties would look like this: (I stored 
> the non-interned properties in super class)
>                 String property=super.getProperty(key);
>                 if (property == null) property= interned.getProperty(key);
>                 return property;
> Running some tests showed that the interning works (with 50 connections to 
> HiveServer2, heapdumps created after sessions are created for queries): 
> Overall memory:
>          original: 34,599K              interned: 20,582K
> Retained memory of HiveConfs:
>         original: 16,366K               interned: 10,804K
> I attach the JXray reports about the heapdumps.
> What are your thoughts about this solution? 




[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs

2018-11-26 Thread Adam Szita (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-20330:
--
Status: In Progress  (was: Patch Available)

> HCatLoader cannot handle multiple InputJobInfo objects for a job with 
> multiple inputs
> -
>
> Key: HIVE-20330
> URL: https://issues.apache.org/jira/browse/HIVE-20330
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
> Attachments: HIVE-20330.0.patch, HIVE-20330.1.patch, 
> HIVE-20330.2.patch
>
>
> While running performance tests on Pig (0.12 and 0.17) we've observed a huge 
> performance drop in a workload that has multiple inputs from HCatLoader.
> The reason is that for a particular MR job with multiple Hive tables as 
> input, Pig calls {{setLocation}} on each {{LoaderFunc (HCatLoader)}} instance 
> but only one table's information (InputJobInfo instance) gets tracked in the 
> JobConf. (This is under config key {{HCatConstants.HCAT_KEY_JOB_INFO}}).
> Any such call overwrites preexisting values, and thus only the last table's 
> information will be considered when Pig calls {{getStatistics}} to calculate 
> and estimate required reducer count.
> In cases when there are 2 input tables, 256GB and 1MB in size respectively, 
> Pig will query the size information from HCat for both of them, but it will 
> either see 1MB+1MB=2MB or 256GB+256GB=0.5TB depending on input order in the 
> execution plan's DAG.
> It should of course see 256.00097GB in total and use 257 reducers by default 
> accordingly.
> In unlucky cases this will be seen as 2MB and 1 reducer will have to struggle 
> with the actual 256.00097GB...
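
The arithmetic in the description can be reproduced with a small sketch. It does not use any Pig or HCatalog API: the class and method names are hypothetical, and the 1 GB (2^30 bytes) of input per reducer is an assumption chosen because it matches the 257 reducers quoted above.

```java
// Hypothetical sketch of the estimate described in HIVE-20330; no Pig or
// HCatalog classes are used.
class ReducerEstimateSketch {
    static final long MB = 1L << 20;
    static final long GB = 1L << 30;
    // Assumption: one reducer per 1 GB of input, matching the figure of 257.
    static final long BYTES_PER_REDUCER = GB;

    // Ceiling division, with at least one reducer.
    static long reducersFor(long totalBytes) {
        return Math.max(1, (totalBytes + BYTES_PER_REDUCER - 1) / BYTES_PER_REDUCER);
    }

    // Models the bug: a single slot is overwritten on every setLocation call,
    // so each loader later reports the size of the last table registered.
    static long estimateWithOverwrite(long[] sizesInCallOrder) {
        long lastSeen = 0;
        for (long size : sizesInCallOrder) {
            lastSeen = size; // earlier tables are lost
        }
        return reducersFor(lastSeen * sizesInCallOrder.length);
    }

    // Models the intended behaviour: every table's size is tracked and summed.
    static long estimateWithPerTableTracking(long[] sizes) {
        long total = 0;
        for (long size : sizes) {
            total += size;
        }
        return reducersFor(total);
    }

    public static void main(String[] args) {
        long[] bigThenSmall = {256 * GB, 1 * MB};
        long[] smallThenBig = {1 * MB, 256 * GB};

        System.out.println(estimateWithOverwrite(bigThenSmall));        // 2MB seen
        System.out.println(estimateWithOverwrite(smallThenBig));        // 0.5TB seen
        System.out.println(estimateWithPerTableTracking(bigThenSmall)); // full input seen
    }
}
```

Under these assumptions the overwrite model yields 1 reducer or 512 reducers depending on call order, while per-table tracking yields 257.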




[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs

2018-11-26 Thread Adam Szita (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-20330:
--
Status: Patch Available  (was: In Progress)

> HCatLoader cannot handle multiple InputJobInfo objects for a job with 
> multiple inputs
> -
>
> Key: HIVE-20330
> URL: https://issues.apache.org/jira/browse/HIVE-20330
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
> Attachments: HIVE-20330.0.patch, HIVE-20330.1.patch, 
> HIVE-20330.2.patch
>
>
> While running performance tests on Pig (0.12 and 0.17) we've observed a huge 
> performance drop in a workload that has multiple inputs from HCatLoader.
> The reason is that for a particular MR job with multiple Hive tables as 
> input, Pig calls {{setLocation}} on each {{LoaderFunc (HCatLoader)}} instance 
> but only one table's information (InputJobInfo instance) gets tracked in the 
> JobConf. (This is under config key {{HCatConstants.HCAT_KEY_JOB_INFO}}).
> Any such call overwrites preexisting values, and thus only the last table's 
> information will be considered when Pig calls {{getStatistics}} to calculate 
> and estimate required reducer count.
> In cases when there are 2 input tables, 256GB and 1MB in size respectively, 
> Pig will query the size information from HCat for both of them, but it will 
> either see 1MB+1MB=2MB or 256GB+256GB=0.5TB depending on input order in the 
> execution plan's DAG.
> It should of course see 256.00097GB in total and use 257 reducers by default 
> accordingly.
> In unlucky cases this will be seen as 2MB and 1 reducer will have to struggle 
> with the actual 256.00097GB...




[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs

2018-11-26 Thread Adam Szita (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-20330:
--
Status: In Progress  (was: Patch Available)

> HCatLoader cannot handle multiple InputJobInfo objects for a job with 
> multiple inputs
> -
>
> Key: HIVE-20330
> URL: https://issues.apache.org/jira/browse/HIVE-20330
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
> Attachments: HIVE-20330.0.patch, HIVE-20330.1.patch, 
> HIVE-20330.2.patch
>
>
> While running performance tests on Pig (0.12 and 0.17) we've observed a huge 
> performance drop in a workload that has multiple inputs from HCatLoader.
> The reason is that for a particular MR job with multiple Hive tables as 
> input, Pig calls {{setLocation}} on each {{LoaderFunc (HCatLoader)}} instance 
> but only one table's information (InputJobInfo instance) gets tracked in the 
> JobConf. (This is under config key {{HCatConstants.HCAT_KEY_JOB_INFO}}).
> Any such call overwrites preexisting values, and thus only the last table's 
> information will be considered when Pig calls {{getStatistics}} to calculate 
> and estimate required reducer count.
> In cases when there are 2 input tables, 256GB and 1MB in size respectively, 
> Pig will query the size information from HCat for both of them, but it will 
> either see 1MB+1MB=2MB or 256GB+256GB=0.5TB depending on input order in the 
> execution plan's DAG.
> It should of course see 256.00097GB in total and use 257 reducers by default 
> accordingly.
> In unlucky cases this will be seen as 2MB and 1 reducer will have to struggle 
> with the actual 256.00097GB...




[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs

2018-11-26 Thread Adam Szita (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-20330:
--
Attachment: (was: HIVE-20330.2.patch)

> HCatLoader cannot handle multiple InputJobInfo objects for a job with 
> multiple inputs
> -
>
> Key: HIVE-20330
> URL: https://issues.apache.org/jira/browse/HIVE-20330
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
> Attachments: HIVE-20330.0.patch, HIVE-20330.1.patch
>
>
> While running performance tests on Pig (0.12 and 0.17) we've observed a huge 
> performance drop in a workload that has multiple inputs from HCatLoader.
> The reason is that for a particular MR job with multiple Hive tables as 
> input, Pig calls {{setLocation}} on each {{LoaderFunc (HCatLoader)}} instance 
> but only one table's information (InputJobInfo instance) gets tracked in the 
> JobConf. (This is under config key {{HCatConstants.HCAT_KEY_JOB_INFO}}).
> Any such call overwrites preexisting values, and thus only the last table's 
> information will be considered when Pig calls {{getStatistics}} to calculate 
> and estimate required reducer count.
> In cases when there are 2 input tables, 256GB and 1MB in size respectively, 
> Pig will query the size information from HCat for both of them, but it will 
> either see 1MB+1MB=2MB or 256GB+256GB=0.5TB depending on input order in the 
> execution plan's DAG.
> It should of course see 256.00097GB in total and use 257 reducers by default 
> accordingly.
> In unlucky cases this will be seen as 2MB and 1 reducer will have to struggle 
> with the actual 256.00097GB...




[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs

2018-11-26 Thread Adam Szita (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-20330:
--
Attachment: HIVE-20330.2.patch

> HCatLoader cannot handle multiple InputJobInfo objects for a job with 
> multiple inputs
> -
>
> Key: HIVE-20330
> URL: https://issues.apache.org/jira/browse/HIVE-20330
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Reporter: Adam Szita
> Assignee: Adam Szita
> Priority: Major
> Attachments: HIVE-20330.0.patch, HIVE-20330.1.patch, 
> HIVE-20330.2.patch
>
>
> While running performance tests on Pig (0.12 and 0.17) we've observed a huge 
> performance drop in a workload that has multiple inputs from HCatLoader.
> The reason is that for a particular MR job with multiple Hive tables as 
> input, Pig calls {{setLocation}} on each {{LoaderFunc (HCatLoader)}} instance 
> but only one table's information (InputJobInfo instance) gets tracked in the 
> JobConf. (This is under config key {{HCatConstants.HCAT_KEY_JOB_INFO}}).
> Any such call overwrites preexisting values, and thus only the last table's 
> information will be considered when Pig calls {{getStatistics}} to calculate 
> and estimate required reducer count.
> In cases when there are 2 input tables, 256GB and 1MB in size respectively, 
> Pig will query the size information from HCat for both of them, but it will 
> either see 1MB+1MB=2MB or 256GB+256GB=0.5TB depending on input order in the 
> execution plan's DAG.
> It should of course see 256.00097GB in total and use 257 reducers by default 
> accordingly.
> In unlucky cases this will be seen as 2MB and 1 reducer will have to struggle 
> with the actual 256.00097GB...



