[jira] [Updated] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
[ https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Vary updated HIVE-20971:
------------------------------
    Attachment: HIVE-20971.2.patch

> TestJdbcWithDBTokenStore[*] should both use
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
> ---------------------------------------------------
>
>                 Key: HIVE-20971
>                 URL: https://issues.apache.org/jira/browse/HIVE-20971
>             Project: Hive
>          Issue Type: Bug
>          Components: Test
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>         Attachments: HIVE-20971.2.patch, HIVE-20971.patch
>
> The original intent was to use
> MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700013#comment-16700013 ]

Vihang Karajgaonkar commented on HIVE-20740:
--------------------------------------------

Finally a green run. [~asherman] Updated the RB with the latest patch. There is no real code change since you last reviewed on the RB, except that I rebased and was juggling many unrelated precommit failures.

> Remove global lock in ObjectStore.setConf method
> ------------------------------------------------
>
>                 Key: HIVE-20740
>                 URL: https://issues.apache.org/jira/browse/HIVE-20740
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>         Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch, HIVE-20740.14.patch
>
> The ObjectStore#setConf method has a global lock which can block other
> clients in concurrent workloads.
> {code}
> @Override
> @SuppressWarnings("nls")
> public void setConf(Configuration conf) {
>   // Although an instance of ObjectStore is accessed by one thread, there may
>   // be many threads with ObjectStore instances. So the static variables
>   // pmf and prop need to be protected with locks.
>   pmfPropLock.lock();
>   try {
>     isInitialized = false;
>     this.conf = conf;
>     this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, ConfVars.HIVE_TXN_STATS_ENABLED);
>     configureSSL(conf);
>     Properties propsFromConf = getDataSourceProps(conf);
>     boolean propsChanged = !propsFromConf.equals(prop);
>     if (propsChanged) {
>       if (pmf != null) {
>         clearOutPmfClassLoaderCache(pmf);
>         if (!forTwoMetastoreTesting) {
>           // close the underlying connection pool to avoid leaks
>           pmf.close();
>         }
>       }
>       pmf = null;
>       prop = null;
>     }
>     assert(!isActiveTransaction());
>     shutdown();
>     // Always want to re-create pm as we don't know if it were created by the
>     // most recent instance of the pmf
>     pm = null;
>     directSql = null;
>     expressionProxy = null;
>     openTrasactionCalls = 0;
>     currentTransaction = null;
>     transactionStatus = TXN_STATUS.NO_STATE;
>     initialize(propsFromConf);
>     String partitionValidationRegex =
>         MetastoreConf.getVar(this.conf, ConfVars.PARTITION_NAME_WHITELIST_PATTERN);
>     if (partitionValidationRegex != null && !partitionValidationRegex.isEmpty()) {
>       partitionValidationPattern = Pattern.compile(partitionValidationRegex);
>     } else {
>       partitionValidationPattern = null;
>     }
>     // Note, if metrics have not been initialized this will return null, which means we aren't
>     // using metrics. Thus we should always check whether this is non-null before using.
>     MetricRegistry registry = Metrics.getRegistry();
>     if (registry != null) {
>       directSqlErrors = Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS);
>     }
>     this.batchSize = MetastoreConf.getIntVar(conf, ConfVars.RAWSTORE_PARTITION_BATCH_SIZE);
>     if (!isInitialized) {
>       throw new RuntimeException("Unable to create persistence manager. Check dss.log for details");
>     } else {
>       LOG.debug("Initialized ObjectStore");
>     }
>   } finally {
>     pmfPropLock.unlock();
>   }
> }
> {code}
> The {{pmfPropLock}} is a static object, and it blocks any other new
> connection to HMS that is trying to instantiate an ObjectStore. We should
> either remove the lock or reduce its scope so that it is held for only a
> very small amount of time.
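The "reduce the scope of the lock" direction can be sketched in isolation. The class below is a hypothetical stand-in, not the actual HIVE-20740 patch: it holds the static lock only while touching the shared static `pmf`/`prop` state, and does all per-instance setup outside the lock. The names (`NarrowLockStore`, the `Object` standing in for the PersistenceManagerFactory) are illustrative assumptions.

```java
import java.util.Properties;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of a narrowed lock scope; names loosely mirror
// ObjectStore, but this is not the actual Hive code.
public class NarrowLockStore {
    // Shared static state: the only thing the static lock must protect.
    private static final ReentrantLock pmfPropLock = new ReentrantLock();
    private static Properties prop;
    private static Object pmf; // stands in for the PersistenceManagerFactory

    // Per-instance state: safe to mutate without any global lock.
    private Properties conf;
    private boolean initialized;

    public void setConf(Properties newConf) {
        // Per-instance work runs unlocked, so concurrent ObjectStore
        // instances no longer serialize on each other here.
        this.conf = newConf;

        // The static lock is held only around the shared pmf/prop mutation.
        pmfPropLock.lock();
        try {
            if (prop == null || !prop.equals(newConf)) {
                prop = new Properties();
                prop.putAll(newConf);
                pmf = new Object(); // re-create the factory on config change
            }
        } finally {
            pmfPropLock.unlock();
        }
        this.initialized = true;
    }

    public boolean isInitialized() {
        return initialized;
    }

    public static void main(String[] args) {
        NarrowLockStore store = new NarrowLockStore();
        Properties p = new Properties();
        p.setProperty("javax.jdo.option.ConnectionURL", "jdbc:derby:;databaseName=metastore_db");
        store.setConf(p);
        System.out.println(store.isInitialized()); // prints true
    }
}
```

With this shape, the expensive per-instance initialization never contends on the static lock; only a config change pays for the factory re-creation.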
[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Bapat updated HIVE-20794:
----------------------------------
    Attachment: HIVE-20794.07
        Status: Patch Available  (was: In Progress)

Of the two failures reported, the first (TestCliDriver.testCliDriver[vector_groupby_reduce]) has been failing in previous runs as well; the diff there is caused by a floating-point rounding error and is unrelated to the changes in this patch. I am running the test for the second failure locally and it has not failed for me. Hence re-attaching the patch to kick ptest again.

> Use Zookeeper for metastore service discovery
> ---------------------------------------------
>
>                 Key: HIVE-20794
>                 URL: https://issues.apache.org/jira/browse/HIVE-20794
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06, HIVE-20794.07, HIVE-20794.07
>
> Right now, multiple metastore services can be specified in the
> hive.metastore.uris configuration, but that list is static and cannot be
> modified dynamically. Use ZooKeeper for dynamic service discovery of the
> metastore.
>
> h3. Improve ZooKeeperHiveHelper class (suggestions for the name welcome)
> The ZooKeeper-related code (for service discovery) accesses ZooKeeper
> parameters directly from HiveConf. The class is changed so that it can be
> used by both HiveServer2 and the metastore server and works with both
> configurations. The following methods are moved from HiveServer2 into
> ZooKeeperHiveHelper:
> # startZookeeperClient
> # addServerInstanceToZooKeeper
> # removeServerInstanceFromZooKeeper
>
> h3. HiveMetaStore conf changes
> # THRIFT_URIS (hive.metastore.uris) can also be used to specify a ZooKeeper
> quorum. When THRIFT_SERVICE_DISCOVERY_MODE
> (hive.metastore.service.discovery.mode) is set to "zookeeper", the URIs are
> used as the ZooKeeper quorum. When it is empty, the URIs are used to locate
> the metastore directly.
> # Here is a list of HiveServer2's parameters and their proposed metastore
> conf counterparts. It looks odd that the metastore-related configurations do
> not have their macros start with METASTORE but with THRIFT; I have just
> followed the naming convention used for other parameters.
> ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE (hive.metastore.zookeeper.namespace)
> ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT (hive.metastore.zookeeper.client.port)
> ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT (hive.metastore.zookeeper.connection.timeout)
> ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES (hive.metastore.zookeeper.connection.max.retries)
> ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME (hive.metastore.zookeeper.connection.basesleeptime)
> # An additional configuration, THRIFT_BIND_HOST, is used to specify the host
> address to bind the metastore service to. Right now the metastore binds to *,
> i.e. all addresses, and then does not know which of those addresses it should
> add to ZooKeeper. THRIFT_BIND_HOST solves that problem: when this
> configuration is specified, the metastore server binds to that address and
> also adds it to ZooKeeper if the dynamic service discovery mode is ZooKeeper.
>
> The following Hive ZK configurations seem to be related to managing locks
> and irrelevant for the metastore ZK:
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT
> # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published,
> HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
>
> h3. HiveMetaStore class changes
> # startMetaStore should also register the instance with ZooKeeper, when
> configured.
> # When shutting a metastore server down, it should deregister itself from
> ZooKeeper, when configured.
> # These changes use the refactored code described above.
>
> h3. HiveMetaStoreClient class changes
> When the service discovery mode is zookeeper, we fetch the metastore URIs
> from the specified ZooKeeper and treat them as if they were specified in
> THRIFT_URIS, i.e. use the existing mechanisms to choose a metastore server
> to connect to and establish a connection.
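The HiveMetaStoreClient behaviour described above, treating URIs fetched from ZooKeeper exactly like a static THRIFT_URIS list, can be sketched as follows. This is a minimal illustration, not the patch's actual code; in particular the znode naming scheme (`serverUri=host:port;sequence=N`) and the class/method names are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: znode names fetched from the ZooKeeper namespace are
// parsed back into thrift:// metastore URIs, which the client can then feed
// into the same failover logic it already uses for hive.metastore.uris.
public class MetastoreUriResolver {
    static List<String> urisFromZnodes(List<String> znodes) {
        List<String> uris = new ArrayList<>();
        for (String node : znodes) {
            // Each znode name carries key=value fields separated by ';'.
            for (String part : node.split(";")) {
                if (part.startsWith("serverUri=")) {
                    uris.add("thrift://" + part.substring("serverUri=".length()));
                }
            }
        }
        return uris;
    }

    public static void main(String[] args) {
        List<String> znodes = List.of(
            "serverUri=hms1.example.com:9083;sequence=0000000007",
            "serverUri=hms2.example.com:9083;sequence=0000000009");
        // prints [thrift://hms1.example.com:9083, thrift://hms2.example.com:9083]
        System.out.println(urisFromZnodes(znodes));
    }
}
```

The point of this shape is that everything downstream of the URI list (server selection, retry, connection setup) stays untouched; only the source of the list changes.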
[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Bapat updated HIVE-20794:
----------------------------------
        Status: In Progress  (was: Patch Available)

> Use Zookeeper for metastore service discovery
> ---------------------------------------------
>
>                 Key: HIVE-20794
>                 URL: https://issues.apache.org/jira/browse/HIVE-20794
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06, HIVE-20794.07
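The THRIFT_BIND_HOST rationale from the HIVE-20794 description (a wildcard bind leaves the server with no single address to publish, while an explicit bind host does) can be demonstrated with plain sockets. `ServerSocket` here is only an illustrative stand-in for the metastore's Thrift server socket, not the actual implementation.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Demonstrates why binding to "*" leaves the advertised address ambiguous
// while an explicit bind host yields exactly one address to register.
public class BindHostDemo {
    public static void main(String[] args) throws IOException {
        // Wildcard bind (the current metastore behaviour): accepts on all
        // interfaces, but the bound address is the any-local address, so
        // there is no single host to publish in ZooKeeper.
        try (ServerSocket wildcard = new ServerSocket()) {
            wildcard.bind(new InetSocketAddress(0)); // port 0 = ephemeral
            System.out.println(wildcard.getInetAddress().isAnyLocalAddress()); // prints true
        }

        // Explicit bind host (the proposed THRIFT_BIND_HOST): the bound
        // address is unambiguous and can be registered as-is.
        try (ServerSocket explicit = new ServerSocket()) {
            explicit.bind(new InetSocketAddress("127.0.0.1", 0));
            System.out.println(explicit.getInetAddress().getHostAddress()); // prints 127.0.0.1
        }
    }
}
```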
[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699958#comment-16699958 ]

Hive QA commented on HIVE-20794:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949603/HIVE-20794.07

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 15627 tests executed

*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=196)
	[druidmini_dynamic_partition.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_test_insert.q,druidkafkamini_delimited.q]
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=197)
	[druidmini_masking.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=61)
org.apache.hive.jdbc.TestActivePassiveHA.testConnectionActivePassiveHAServiceDiscovery (batchId=259)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15068/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15068/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15068/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12949603 - PreCommit-HIVE-Build

> Use Zookeeper for metastore service discovery
> ---------------------------------------------
>
>                 Key: HIVE-20794
>                 URL: https://issues.apache.org/jira/browse/HIVE-20794
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06, HIVE-20794.07
[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699950#comment-16699950 ]

Hive QA commented on HIVE-20794:
--------------------------------

| (/) *{color:green}+1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 52s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 32s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 15s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 5s{color} | {color:blue} standalone-metastore/metastore-server in master has 185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 43s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 35s{color} | {color:blue} service in master has 48 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 43s{color} | {color:blue} itests/util in master has 48 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 7s{color} | {color:green} The patch standalone-metastore passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 6s{color} | {color:green} The patch metastore-common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} The patch common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 6s{color} | {color:green} The patch metastore-server passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} ql: The patch generated 0 new + 17 unchanged - 4 fixed = 17 total (was 21) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} The patch service passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} The patch util passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 10m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
|
[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Bapat updated HIVE-20794:
----------------------------------
    Attachment: HIVE-20794.07
        Status: Patch Available  (was: In Progress)

Fixed the findbugs notice from the last run and updated the PR.

> Use Zookeeper for metastore service discovery
> ---------------------------------------------
>
>                 Key: HIVE-20794
>                 URL: https://issues.apache.org/jira/browse/HIVE-20794
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06, HIVE-20794.07
[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Bapat updated HIVE-20794:
----------------------------------
        Status: In Progress  (was: Patch Available)

> Use Zookeeper for metastore service discovery
> ---------------------------------------------
>
>                 Key: HIVE-20794
>                 URL: https://issues.apache.org/jira/browse/HIVE-20794
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06
[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699886#comment-16699886 ] Hive QA commented on HIVE-20740: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949566/HIVE-20740.14.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15543 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15066/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15066/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15066/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12949566 - PreCommit-HIVE-Build > Remove global lock in ObjectStore.setConf method > > > Key: HIVE-20740 > URL: https://issues.apache.org/jira/browse/HIVE-20740 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, > HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, > HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, > HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch, > HIVE-20740.14.patch > > > The ObjectStore#setConf method has a global lock which can block other > clients in concurrent workloads. > {code} > @Override > @SuppressWarnings("nls") > public void setConf(Configuration conf) { > // Although an instance of ObjectStore is accessed by one thread, there > may > // be many threads with ObjectStore instances. 
So the static variables > // pmf and prop need to be protected with locks. > pmfPropLock.lock(); > try { > isInitialized = false; > this.conf = conf; > this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, > ConfVars.HIVE_TXN_STATS_ENABLED); > configureSSL(conf); > Properties propsFromConf = getDataSourceProps(conf); > boolean propsChanged = !propsFromConf.equals(prop); > if (propsChanged) { > if (pmf != null){ > clearOutPmfClassLoaderCache(pmf); > if (!forTwoMetastoreTesting) { > // close the underlying connection pool to avoid leaks > pmf.close(); > } > } > pmf = null; > prop = null; > } > assert(!isActiveTransaction()); > shutdown(); > // Always want to re-create pm as we don't know if it were created by > the > // most recent instance of the pmf > pm = null; > directSql = null; > expressionProxy = null; > openTrasactionCalls = 0; > currentTransaction = null; > transactionStatus = TXN_STATUS.NO_STATE; > initialize(propsFromConf); > String partitionValidationRegex = > MetastoreConf.getVar(this.conf, > ConfVars.PARTITION_NAME_WHITELIST_PATTERN); > if (partitionValidationRegex != null && > !partitionValidationRegex.isEmpty()) { > partitionValidationPattern = > Pattern.compile(partitionValidationRegex); > } else { > partitionValidationPattern = null; > } > // Note, if metrics have not been initialized this will return null, > which means we aren't > // using metrics. Thus we should always check whether this is non-null > before using. > MetricRegistry registry = Metrics.getRegistry(); > if (registry != null) { > directSqlErrors = > Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS); > } > this.batchSize = MetastoreConf.getIntVar(conf, > ConfVars.RAWSTORE_PARTITION_BATCH_SIZE); > if (!isInitialized) { > throw new RuntimeException( > "Unable to create persistence manager. 
Check dss.log for details"); > } else { > LOG.debug("Initialized ObjectStore"); > } > } finally { > pmfPropLock.unlock(); > } > } > {code} > The {{pmfPropLock}} is a static object and it disallows any other new > connection to HMS which is trying to instantiate ObjectStore. We should > either remove the lock or reduce the scope of the lock so that it is held for > a very small amount of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
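The description above proposes removing the lock or shrinking its critical section so it is held only briefly. A minimal Java sketch of the narrowed-scope idea (invented class and field names, not the actual HIVE-20740 patch): guard only the shared static factory state, and do per-instance initialization outside the lock.

```java
import java.util.Properties;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical simplification: only the shared static state (the factory
// and the properties it was built from) is guarded; per-instance setup
// happens with no global lock held.
class NarrowLockDemo {
    private static final ReentrantLock FACTORY_LOCK = new ReentrantLock();
    private static Object factory;          // stands in for the shared pmf
    private static Properties factoryProps; // stands in for prop

    // Per-instance state; needs no global lock.
    private Object pm;

    static Object getOrCreateFactory(Properties fromConf) {
        FACTORY_LOCK.lock();
        try {
            // Recreate the shared factory only when the properties changed.
            if (factory == null || !fromConf.equals(factoryProps)) {
                factory = new Object(); // placeholder for the real factory creation
                factoryProps = fromConf;
            }
            return factory;
        } finally {
            FACTORY_LOCK.unlock(); // released on every path
        }
    }

    void setConf(Properties fromConf) {
        Object f = getOrCreateFactory(fromConf); // short critical section
        this.pm = new Object();                  // instance-local, lock-free
    }
}
```

Concurrent callers with unchanged properties now contend only for the brief factory check instead of serializing the whole setConf body.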
[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699869#comment-16699869 ] Hive QA commented on HIVE-20740: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 34s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 1s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 4s{color} | {color:blue} standalone-metastore/metastore-server in master has 185 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 46s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 608 unchanged - 0 fixed = 609 total (was 608) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 15s{color} | {color:red} standalone-metastore/metastore-server generated 1 new + 183 unchanged - 2 fixed = 184 total (was 185) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 32m 47s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:standalone-metastore/metastore-server | | | org.apache.hadoop.hive.metastore.PersistenceManagerProvider.updatePmfProperties(Configuration) does not release lock on all paths At PersistenceManagerProvider.java:on all paths At PersistenceManagerProvider.java:[line 152] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15066/dev-support/hive-personality.sh | | git revision | master / 56625f3 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15066/yetus/diff-checkstyle-itests_hive-unit.txt | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-15066/yetus/new-findbugs-standalone-metastore_metastore-server.html | | modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15066/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Remove global lock in ObjectStore.setConf method > > > Key: HIVE-20740 > URL: https://issues.apache.org/jira/browse/HIVE-20740 > Project: Hive > Issue Type: Improvement >
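The FindBugs hit above ("does not release lock on all paths" in PersistenceManagerProvider.updatePmfProperties) flags a lock that can be left held if the guarded code throws. The standard fix is the lock-in-try/finally idiom; a self-contained sketch:

```java
import java.util.concurrent.locks.ReentrantLock;

// The warning fires on code where unlock() is reachable only on the
// normal path. Placing unlock() in a finally block guarantees release
// even when the guarded work throws.
class LockAllPaths {
    private static final ReentrantLock LOCK = new ReentrantLock();

    static void updateSafely(Runnable work) {
        LOCK.lock();
        try {
            work.run();
        } finally {
            LOCK.unlock(); // runs on both the normal and the exceptional path
        }
    }

    static boolean isLocked() {
        return LOCK.isLocked();
    }
}
```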
[jira] [Updated] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95
[ https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-20954: -- Fix Version/s: 4.0.0 Pushed to master > Vector RS operator is not using uniform hash function for TPC-DS query 95 > - > > Key: HIVE-20954 > URL: https://issues.apache.org/jira/browse/HIVE-20954 > Project: Hive > Issue Type: Improvement >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, > HIVE-20954.3.patch > > > Distribution of rows is skewed in DHJ causing slowdown. > Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator > and VectorReduceSinkLongOperator. > {code} > | Select Operator| > | expressions: ws_warehouse_sk (type: bigint), > ws_order_number (type: bigint) | > | outputColumnNames: _col0, _col1 | > | Select Vectorization:| > | className: VectorSelectOperator | > | native: true | > | projectedOutputColumnNums: [14, 16] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col1 (type: bigint) | > | sort order: + | > | Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkObjectHashOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | partitionColumnNums: [16] | > | valueColumnNums: [14] | > ++ > | Explain | > ++ > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Reduce Output Operator | > | key expressions: _col1 (type: bigint) 
| > | sort order: + | > | Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkLongOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | valueColumnNums: [14] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Execution mode: vectorized, llap | > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95
[ https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-20954: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Vector RS operator is not using uniform hash function for TPC-DS query 95 > - > > Key: HIVE-20954 > URL: https://issues.apache.org/jira/browse/HIVE-20954 > Project: Hive > Issue Type: Improvement >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, > HIVE-20954.3.patch > > > Distribution of rows is skewed in DHJ causing slowdown. > Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator > and VectorReduceSinkLongOperator. > {code} > | Select Operator| > | expressions: ws_warehouse_sk (type: bigint), > ws_order_number (type: bigint) | > | outputColumnNames: _col0, _col1 | > | Select Vectorization:| > | className: VectorSelectOperator | > | native: true | > | projectedOutputColumnNums: [14, 16] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col1 (type: bigint) | > | sort order: + | > | Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkObjectHashOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | partitionColumnNums: [16] | > | valueColumnNums: [14] | > ++ > | Explain | > ++ > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Reduce Output Operator | > | key expressions: 
_col1 (type: bigint) | > | sort order: + | > | Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkLongOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | valueColumnNums: [14] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Execution mode: vectorized, llap | > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
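The skew described in the plans above comes from one reduce-sink branch hashing the long key without bit mixing. An illustrative (non-Hive) demonstration of why a uniform hash matters: when keys are all multiples of the bucket count, raw modulo sends every row to a single bucket, while a Murmur3-style finalizer spreads them.

```java
// Illustrative (non-Hive) demo: keys sharing a common factor with the
// bucket count collapse onto one bucket under raw modulo, while a
// Murmur3-style bit-mixing finalizer spreads them across buckets.
class HashSpread {

    // Murmur3 64-bit finalizer: every input bit influences every output bit.
    static long mix(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // Histogram of bucket assignments for 1024 skewed keys
    // (all multiples of the bucket count).
    static int[] histogram(int buckets, boolean mixed) {
        int[] counts = new int[buckets];
        for (long key = 0; key < 1024L * buckets; key += buckets) {
            long h = mixed ? mix(key) : key;
            counts[(int) Math.floorMod(h, (long) buckets)]++;
        }
        return counts;
    }
}
```

With `mixed = false` all 1024 rows land in bucket 0, which is the shape of the reducer skew reported for query 95.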
[jira] [Commented] (HIVE-20955) Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699828#comment-16699828 ] slim bouguerra commented on HIVE-20955: --- [~vgarg] +1 works fine, we can add the tests later. Please merge the patch when you can. > Calcite Rule HiveExpandDistinctAggregatesRule seems throwing > IndexOutOfBoundsException > -- > > Key: HIVE-20955 > URL: https://issues.apache.org/jira/browse/HIVE-20955 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: slim bouguerra >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-20955.1.patch > > > > Added the following query to Druid test > ql/src/test/queries/clientpositive/druidmini_expressions.q > {code} > select count(distinct `__time`, cint) from (select * from > druid_table_alltypesorc) as src; > {code} > leads to error \{code} 2018-11-21T07:36:39,449 ERROR [main] QTestUtil: Client > execution failed with error code = 4 running "\{code} > with exception stack > {code} > 2018-11-21T07:36:39,443 ERROR [ecd48683-0286-4cb4-b0ad-e150fab51038 main] > parse.CalcitePlanner: CBO failed, skipping CBO. > java.lang.IndexOutOfBoundsException: index (1) must be less than size (1) > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310) > ~[guava-19.0.jar:?] > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:293) > ~[guava-19.0.jar:?] > at > com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:41) > ~[guava-19.0.jar:?] > at > org.apache.calcite.rel.metadata.RelMdColumnOrigins.getColumnOrigins(RelMdColumnOrigins.java:77) > ~[calcite-core-1.17.0.jar:1.17.0] > at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins_$(Unknown Source) > ~[?:?] > at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins(Unknown Source) > ~[?:?] 
> at > org.apache.calcite.rel.metadata.RelMetadataQuery.getColumnOrigins(RelMetadataQuery.java:345) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveExpandDistinctAggregatesRule.onMatch(HiveExpandDistinctAggregatesRule.java:168) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:315) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2363) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2314) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2031) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1780) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1680) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at 
org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1439) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:478) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12296) >
[jira] [Commented] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it
[ https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699825#comment-16699825 ] Hive QA commented on HIVE-20936: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949565/HIVE-20936.6.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15539 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15065/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15065/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15065/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12949565 - PreCommit-HIVE-Build > Allow the Worker thread in the metastore to run outside of it > - > > Key: HIVE-20936 > URL: https://issues.apache.org/jira/browse/HIVE-20936 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Attachments: HIVE-20936.1.patch, HIVE-20936.2.patch, > HIVE-20936.3.patch, HIVE-20936.4.patch, HIVE-20936.5.patch, HIVE-20936.6.patch > > > Currently the Worker thread in the metastore in bounded to the metastore, > mainly because of the TxnHandler that it has. This thread runs some map > reduce jobs which may not only be an option wherever the metastore is > running. A solution for this can be to run this thread in HS2 depending on a > flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
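The flag-based approach described above (run the Worker thread in HS2 or the metastore depending on configuration) can be sketched as follows. The configuration key and thread body are hypothetical, not Hive's actual names:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: whichever process has the flag enabled starts the
// worker, instead of the thread being hard-wired into the metastore.
class WorkerLauncher {
    // "compactor.worker.in.this.process" is an invented key for illustration.
    static Optional<Thread> maybeStartWorker(Map<String, String> conf) {
        boolean enabled = Boolean.parseBoolean(
                conf.getOrDefault("compactor.worker.in.this.process", "false"));
        if (!enabled) {
            return Optional.empty(); // some other process runs the worker
        }
        Thread worker = new Thread(() -> {
            // poll for compaction work; omitted in this sketch
        }, "compactor-worker");
        worker.setDaemon(true);
        worker.start();
        return Optional.of(worker);
    }
}
```

Both HS2 and the metastore could call this at startup; only the process with the flag set actually spawns the thread.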
[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95
[ https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699819#comment-16699819 ] Teddy Choi commented on HIVE-20954: --- The failures were not reproduced, and the new TestTxnCommands2WithSplitUpdateAndVectorization failures seem unrelated. I will push it to master. Thanks. > Vector RS operator is not using uniform hash function for TPC-DS query 95 > - > > Key: HIVE-20954 > URL: https://issues.apache.org/jira/browse/HIVE-20954 > Project: Hive > Issue Type: Improvement >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, > HIVE-20954.3.patch > > > Distribution of rows is skewed in DHJ causing slowdown. > Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator > and VectorReduceSinkLongOperator. > {code} > | Select Operator| > | expressions: ws_warehouse_sk (type: bigint), > ws_order_number (type: bigint) | > | outputColumnNames: _col0, _col1 | > | Select Vectorization:| > | className: VectorSelectOperator | > | native: true | > | projectedOutputColumnNums: [14, 16] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col1 (type: bigint) | > | sort order: + | > | Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkObjectHashOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | partitionColumnNums: [16] | > | valueColumnNums: [14] | > ++ > | Explain | > ++ > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column 
stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Reduce Output Operator | > | key expressions: _col1 (type: bigint) | > | sort order: + | > | Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkLongOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | valueColumnNums: [14] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Execution mode: vectorized, llap | > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it
[ https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699814#comment-16699814 ] Hive QA commented on HIVE-20936: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 37s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 31s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 50s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 15s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 2s{color} | {color:blue} standalone-metastore/metastore-server in master has 185 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 42s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 36s{color} | {color:blue} service in master has 48 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 26s{color} | {color:blue} hcatalog/streaming in master has 11 extant Findbugs warnings. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 24s{color} | {color:blue} streaming in master has 2 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 4s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 28s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 43s{color} | {color:red} ql: The patch generated 2 new + 639 unchanged - 5 fixed = 641 total (was 644) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 173 unchanged - 0 fixed = 174 total (was 173) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 105 line(s) that end in whitespace. Use git apply --whitespace=fix <>. 
Refer https://git-scm.com/docs/git-apply {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 3m 53s{color} | {color:red} ql generated 3 new + 2311 unchanged - 1 fixed = 2314 total (was 2312) {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 49s{color} | {color:red} standalone-metastore_metastore-common generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 50m 39s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Field MetaStoreCompactorThread.threadId masks field in superclass org.apache.hadoop.hive.ql.txn.compactor.CompactorThread In MetaStoreCompactorThread.java:superclass org.apache.hadoop.hive.ql.txn.compactor.CompactorThread In MetaStoreCompactorThread.java | | | Field MetaStoreCompactorThread.rs masks field in superclass org.apache.hadoop.hive.ql.txn.compactor.CompactorThread In MetaStoreCompactorThread.java:superclass org.apache.hadoop.hive.ql.txn.compactor.CompactorThread In MetaStoreCompactorThread.java | | |
[jira] [Commented] (HIVE-20930) VectorCoalesce in FILTER mode doesn't take effect
[ https://issues.apache.org/jira/browse/HIVE-20930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699810#comment-16699810 ] Ashutosh Chauhan commented on HIVE-20930: - +1 > VectorCoalesce in FILTER mode doesn't take effect > - > > Key: HIVE-20930 > URL: https://issues.apache.org/jira/browse/HIVE-20930 > Project: Hive > Issue Type: Bug >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Attachments: HIVE-20930.1.patch, HIVE-20930.2.patch, > HIVE-20930.3.patch > > > HIVE-20277 fixed vectorized case expressions for FILTER, but VectorCoalesce > is still not fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20955) Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699808#comment-16699808 ] slim bouguerra commented on HIVE-20955: --- [~vgarg] worked for me [https://github.com/b-slim/hive/commit/34ed4421dc674cd0c487cc20eb6c47cd6f629495] {code} cd itests/qtest mvn clean test -Dtest=TestMiniDruidCliDriver -Djava.net.preferIPv4Stack=true -Dtest.output.overwrite=true -s ~/.m2/settings.xml -Dqfile=druidmini_expressions.q {code} > Calcite Rule HiveExpandDistinctAggregatesRule seems throwing > IndexOutOfBoundsException > -- > > Key: HIVE-20955 > URL: https://issues.apache.org/jira/browse/HIVE-20955 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: slim bouguerra >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-20955.1.patch > > > > Added the following query to Druid test > ql/src/test/queries/clientpositive/druidmini_expressions.q > {code} > select count(distinct `__time`, cint) from (select * from > druid_table_alltypesorc) as src; > {code} > leads to error \{code} 2018-11-21T07:36:39,449 ERROR [main] QTestUtil: Client > execution failed with error code = 4 running "\{code} > with exception stack > {code} > 2018-11-21T07:36:39,443 ERROR [ecd48683-0286-4cb4-b0ad-e150fab51038 main] > parse.CalcitePlanner: CBO failed, skipping CBO. > java.lang.IndexOutOfBoundsException: index (1) must be less than size (1) > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310) > ~[guava-19.0.jar:?] > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:293) > ~[guava-19.0.jar:?] > at > com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:41) > ~[guava-19.0.jar:?] > at > org.apache.calcite.rel.metadata.RelMdColumnOrigins.getColumnOrigins(RelMdColumnOrigins.java:77) > ~[calcite-core-1.17.0.jar:1.17.0] > at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins_$(Unknown Source) > ~[?:?] 
> at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins(Unknown Source) > ~[?:?] > at > org.apache.calcite.rel.metadata.RelMetadataQuery.getColumnOrigins(RelMetadataQuery.java:345) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveExpandDistinctAggregatesRule.onMatch(HiveExpandDistinctAggregatesRule.java:168) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:315) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2363) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2314) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2031) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1780) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1680) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1439) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at >
[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699804#comment-16699804 ] Vihang Karajgaonkar commented on HIVE-20860: Patch merged into master. Thanks [~jcamachorodriguez] for the review. > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > HIVE-20860.01.patch, hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is an additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-20860: --- Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Fix For: 4.0.0 > > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > HIVE-20860.01.patch, hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is an additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699791#comment-16699791 ] Hive QA commented on HIVE-20440: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949543/HIVE-20440.14.patch.txt {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15548 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_aggregate] (batchId=162) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15064/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15064/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15064/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12949543 - PreCommit-HIVE-Build > Create better cache eviction policy for SmallTableCache > --- > > Key: HIVE-20440 > URL: https://issues.apache.org/jira/browse/HIVE-20440 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, > HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, > HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, > HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, > HIVE-20440.12.patch, HIVE-20440.13.patch, HIVE-20440.14.patch.txt > > > Enhance the SmallTableCache to use a Guava cache with soft references, so that > we evict when there is memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
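The eviction idea in the issue above (let the JVM reclaim cached small tables under memory pressure) can be illustrated with a minimal, hypothetical sketch. The actual HIVE-20440 patch uses a Guava cache with soft values; this JDK-only version with `java.lang.ref.SoftReference` shows the same principle. The class and method names are illustrative, not Hive's real `SmallTableCache` API.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: values are held through SoftReferences, so the garbage
// collector may reclaim them when the heap is under pressure. The real patch
// achieves this with Guava's CacheBuilder.newBuilder().softValues().build().
class SoftCacheSketch<K, V> {
  private final Map<K, SoftReference<V>> cache = new ConcurrentHashMap<>();

  void put(K key, V value) {
    cache.put(key, new SoftReference<>(value));
  }

  // Returns null if the entry was never cached or was reclaimed by the GC.
  V get(K key) {
    SoftReference<V> ref = cache.get(key);
    if (ref == null) {
      return null;
    }
    V value = ref.get();
    if (value == null) {
      cache.remove(key); // drop the cleared entry so the map does not grow
    }
    return value;
  }
}
```

Soft references are cleared only before an `OutOfMemoryError` would be thrown, which is why they suit a "keep it if there is room" cache better than weak references, which can be cleared on any GC cycle.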
[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699755#comment-16699755 ] Hive QA commented on HIVE-20440: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 45s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 49s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} ql: The patch generated 0 new + 54 unchanged - 2 fixed = 54 total (was 56) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 58s{color} | {color:green} ql generated 0 new + 2311 unchanged - 1 fixed = 2311 total (was 2312) {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 45s{color} | {color:green} hive-unit in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 28m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile xml | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15064/dev-support/hive-personality.sh | | git revision | master / e986fc5 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15064/yetus/diff-checkstyle-itests_hive-unit.txt | | modules | C: ql itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15064/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Create better cache eviction policy for SmallTableCache > --- > > Key: HIVE-20440 > URL: https://issues.apache.org/jira/browse/HIVE-20440 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, > HIVE-20440.03.patch,
[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699744#comment-16699744 ] Vihang Karajgaonkar commented on HIVE-20860: Created HIVE-20972 to enable the test again. > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > HIVE-20860.01.patch, hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is an additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
[ https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699737#comment-16699737 ] Hive QA commented on HIVE-20971: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949541/HIVE-20971.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 15540 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18_multi_distinct] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join7] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cli_print_escape_crlf] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[correlationoptimizer15] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_alter_list_bucketing_table1] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_escape] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_3] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input_part6] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_4] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nonreserved_keywords_insert_into1] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge9] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parallel] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_struct_type_vectorization] (batchId=28) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_condition_remover] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd2] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[schema_evol_par_vec_table_dictionary_encoding] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[set_variable_sub] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoin_mapjoin2] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoinopt8] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_truncate] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_bitwise_not] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_bucket] (batchId=28) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] (batchId=171) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15063/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15063/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15063/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 26 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12949541 - PreCommit-HIVE-Build > TestJdbcWithDBTokenStore[*] should both use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb > --- > > Key: HIVE-20971 > URL: https://issues.apache.org/jira/browse/HIVE-20971 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20971.patch > > > The original intent was to use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction
[ https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20775: --- Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) Pushed to master, thanks [~ashutoshc]! > Factor cost of each SJ reduction when costing a follow-up reduction > --- > > Key: HIVE-20775 > URL: https://issues.apache.org/jira/browse/HIVE-20775 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, > HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, > HIVE-20775.06.patch, HIVE-20775.patch > > > Currently, while costing the SJ in a plan, the stats of a TS that is > reduced by a SJ are not adjusted after we have decided to keep a SJ in the > tree. Ideally, we could adjust the stats to take into account decisions that > have already been made. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699711#comment-16699711 ] Jesus Camacho Rodriguez commented on HIVE-20860: +1 [~vihangk1], can we create the follow-up to enable it again in the future? Thanks > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > HIVE-20860.01.patch, hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is an additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699707#comment-16699707 ] Vihang Karajgaonkar commented on HIVE-20860: [~jcamachorodriguez] [~pvary] Can you please review? > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > HIVE-20860.01.patch, hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is an additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-20860: --- Status: Patch Available (was: Open) > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > HIVE-20860.01.patch, hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is an additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-20860: --- Attachment: HIVE-20860.01.patch > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > HIVE-20860.01.patch, hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is an additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-20740: --- Attachment: HIVE-20740.14.patch > Remove global lock in ObjectStore.setConf method > > > Key: HIVE-20740 > URL: https://issues.apache.org/jira/browse/HIVE-20740 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, > HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, > HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, > HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch, > HIVE-20740.14.patch > > > The ObjectStore#setConf method has a global lock which can block other > clients in concurrent workloads. > {code} > @Override > @SuppressWarnings("nls") > public void setConf(Configuration conf) { > // Although an instance of ObjectStore is accessed by one thread, there > may > // be many threads with ObjectStore instances. So the static variables > // pmf and prop need to be protected with locks. 
> pmfPropLock.lock(); > try { > isInitialized = false; > this.conf = conf; > this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, > ConfVars.HIVE_TXN_STATS_ENABLED); > configureSSL(conf); > Properties propsFromConf = getDataSourceProps(conf); > boolean propsChanged = !propsFromConf.equals(prop); > if (propsChanged) { > if (pmf != null){ > clearOutPmfClassLoaderCache(pmf); > if (!forTwoMetastoreTesting) { > // close the underlying connection pool to avoid leaks > pmf.close(); > } > } > pmf = null; > prop = null; > } > assert(!isActiveTransaction()); > shutdown(); > // Always want to re-create pm as we don't know if it were created by > the > // most recent instance of the pmf > pm = null; > directSql = null; > expressionProxy = null; > openTrasactionCalls = 0; > currentTransaction = null; > transactionStatus = TXN_STATUS.NO_STATE; > initialize(propsFromConf); > String partitionValidationRegex = > MetastoreConf.getVar(this.conf, > ConfVars.PARTITION_NAME_WHITELIST_PATTERN); > if (partitionValidationRegex != null && > !partitionValidationRegex.isEmpty()) { > partitionValidationPattern = > Pattern.compile(partitionValidationRegex); > } else { > partitionValidationPattern = null; > } > // Note, if metrics have not been initialized this will return null, > which means we aren't > // using metrics. Thus we should always check whether this is non-null > before using. > MetricRegistry registry = Metrics.getRegistry(); > if (registry != null) { > directSqlErrors = > Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS); > } > this.batchSize = MetastoreConf.getIntVar(conf, > ConfVars.RAWSTORE_PARTITION_BATCH_SIZE); > if (!isInitialized) { > throw new RuntimeException( > "Unable to create persistence manager. 
Check dss.log for details"); > } else { > LOG.debug("Initialized ObjectStore"); > } > } finally { > pmfPropLock.unlock(); > } > } > {code} > The {{pmfPropLock}} is a static object and it disallows any other new > connection to HMS which is trying to instantiate ObjectStore. We should > either remove the lock or reduce the scope of the lock so that it is held for > a very small amount of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
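The quoted description suggests either removing `pmfPropLock` or shrinking its scope so it is held only briefly. A minimal, hypothetical sketch of the "reduce the scope" option follows; it assumes that only the shared static factory state (`pmf`/`prop` in the real code) needs mutual exclusion, while per-instance initialization can run unlocked. All class, field, and method names here are illustrative stand-ins, not the actual HIVE-20740 patch.

```java
import java.util.Properties;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: instead of one static lock held for all of setConf,
// only the short section that reads or swaps the shared (static) factory
// state is guarded. Per-instance setup happens outside the critical section,
// so concurrent clients creating their own store instances no longer block
// each other for the whole initialization.
class ConfStoreSketch {
  private static final ReentrantLock factoryLock = new ReentrantLock();
  private static Properties sharedProps;  // stands in for the static "prop"
  private static Object sharedFactory;    // stands in for the static "pmf"

  private Properties props;               // per-instance snapshot
  private boolean initialized;

  void setConf(Properties propsFromConf) {
    // Short critical section: only the shared static state is guarded.
    factoryLock.lock();
    try {
      if (!propsFromConf.equals(sharedProps)) {
        sharedFactory = createFactory(propsFromConf); // rebuild shared factory
        sharedProps = propsFromConf;
      }
      this.props = sharedProps;
    } finally {
      factoryLock.unlock();
    }
    // Per-instance initialization runs without blocking other clients.
    initialized = true;
  }

  private static Object createFactory(Properties p) {
    return new Object(); // placeholder for PersistenceManagerFactory creation
  }

  boolean isInitialized() {
    return initialized;
  }
}
```

The trade-off is that anything moved outside the lock must not touch the shared static state, which is exactly the invariant a review of such a change has to verify.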
[jira] [Assigned] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar reassigned HIVE-20860: -- Assignee: Vihang Karajgaonkar > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is an additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it
[ https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaume M updated HIVE-20936: --- Status: Open (was: Patch Available) > Allow the Worker thread in the metastore to run outside of it > - > > Key: HIVE-20936 > URL: https://issues.apache.org/jira/browse/HIVE-20936 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Attachments: HIVE-20936.1.patch, HIVE-20936.2.patch, > HIVE-20936.3.patch, HIVE-20936.4.patch, HIVE-20936.5.patch, HIVE-20936.6.patch > > > Currently the Worker thread in the metastore is bound to the metastore, > mainly because of the TxnHandler that it has. This thread runs some map > reduce jobs, which may not be an option wherever the metastore is > running. A solution for this can be to run this thread in HS2 depending on a > flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699685#comment-16699685 ] Vihang Karajgaonkar commented on HIVE-20860: If you look at the history of this test https://builds.apache.org/job/PreCommit-HIVE-Build/15061/testReport/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_cbo_limit_/history/ this test fails frequently every few builds. I am not really sure what causes the issue. I tried to reproduce it in at least two different environments but it worked both times. I also tried to run the test on the exact node of the precommit servers to see if there is something with that environment. But it worked there too. > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Priority: Minor > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is an additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20936) Allow the Worker thread in the metastore to run outside of it
[ https://issues.apache.org/jira/browse/HIVE-20936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaume M updated HIVE-20936: --- Attachment: HIVE-20936.6.patch Status: Patch Available (was: Open) > Allow the Worker thread in the metastore to run outside of it > - > > Key: HIVE-20936 > URL: https://issues.apache.org/jira/browse/HIVE-20936 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Attachments: HIVE-20936.1.patch, HIVE-20936.2.patch, > HIVE-20936.3.patch, HIVE-20936.4.patch, HIVE-20936.5.patch, HIVE-20936.6.patch > > > Currently the Worker thread in the metastore is bound to the metastore, > mainly because of the TxnHandler that it has. This thread runs some map > reduce jobs, which may not be an option wherever the metastore is > running. A solution for this can be to run this thread in HS2 depending on a > flag. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
[ https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699677#comment-16699677 ] Hive QA commented on HIVE-20971: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 40s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 0s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 10m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15063/dev-support/hive-personality.sh | | git revision | master / 0fee288 | | Default Java | 1.8.0_111 | | modules | C: itests/hive-minikdc U: itests/hive-minikdc | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15063/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > TestJdbcWithDBTokenStore[*] should both use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb > --- > > Key: HIVE-20971 > URL: https://issues.apache.org/jira/browse/HIVE-20971 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20971.patch > > > The original intent was to use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction
[ https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699662#comment-16699662 ] Hive QA commented on HIVE-20775: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949535/HIVE-20775.06.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15539 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15062/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15062/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15062/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12949535 - PreCommit-HIVE-Build > Factor cost of each SJ reduction when costing a follow-up reduction > --- > > Key: HIVE-20775 > URL: https://issues.apache.org/jira/browse/HIVE-20775 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, > HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, > HIVE-20775.06.patch, HIVE-20775.patch > > > Currently, while costing the SJ in a plan, the stats of a TS that is > reduced by a SJ are not adjusted after we have decided to keep a SJ in the > tree. Ideally, we could adjust the stats to take into account decisions that > have already been made. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699624#comment-16699624 ] Vihang Karajgaonkar commented on HIVE-20740: cbo_limit.q failure is unrelated to this patch. The failure message is weird, could be extra space somewhere in the q.out [INFO] Running org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver [ERROR] Tests run: 30, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 405.018 s <<< FAILURE! - in org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver [ERROR] testCliDriver[cbo_limit](org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver) Time elapsed: 9.876 s <<< FAILURE! java.lang.AssertionError: Client Execution succeeded but contained differences (error code = 1) after executing cbo_limit.q 11c11 < 14 2 --- > 14 2 > Remove global lock in ObjectStore.setConf method > > > Key: HIVE-20740 > URL: https://issues.apache.org/jira/browse/HIVE-20740 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, > HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, > HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, > HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch > > > The ObjectStore#setConf method has a global lock which can block other > clients in concurrent workloads. > {code} > @Override > @SuppressWarnings("nls") > public void setConf(Configuration conf) { > // Although an instance of ObjectStore is accessed by one thread, there > may > // be many threads with ObjectStore instances. So the static variables > // pmf and prop need to be protected with locks. 
> pmfPropLock.lock(); > try { > isInitialized = false; > this.conf = conf; > this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, > ConfVars.HIVE_TXN_STATS_ENABLED); > configureSSL(conf); > Properties propsFromConf = getDataSourceProps(conf); > boolean propsChanged = !propsFromConf.equals(prop); > if (propsChanged) { > if (pmf != null){ > clearOutPmfClassLoaderCache(pmf); > if (!forTwoMetastoreTesting) { > // close the underlying connection pool to avoid leaks > pmf.close(); > } > } > pmf = null; > prop = null; > } > assert(!isActiveTransaction()); > shutdown(); > // Always want to re-create pm as we don't know if it were created by > the > // most recent instance of the pmf > pm = null; > directSql = null; > expressionProxy = null; > openTrasactionCalls = 0; > currentTransaction = null; > transactionStatus = TXN_STATUS.NO_STATE; > initialize(propsFromConf); > String partitionValidationRegex = > MetastoreConf.getVar(this.conf, > ConfVars.PARTITION_NAME_WHITELIST_PATTERN); > if (partitionValidationRegex != null && > !partitionValidationRegex.isEmpty()) { > partitionValidationPattern = > Pattern.compile(partitionValidationRegex); > } else { > partitionValidationPattern = null; > } > // Note, if metrics have not been initialized this will return null, > which means we aren't > // using metrics. Thus we should always check whether this is non-null > before using. > MetricRegistry registry = Metrics.getRegistry(); > if (registry != null) { > directSqlErrors = > Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS); > } > this.batchSize = MetastoreConf.getIntVar(conf, > ConfVars.RAWSTORE_PARTITION_BATCH_SIZE); > if (!isInitialized) { > throw new RuntimeException( > "Unable to create persistence manager. 
Check dss.log for details"); > } else { > LOG.debug("Initialized ObjectStore"); > } > } finally { > pmfPropLock.unlock(); > } > } > {code} > The {{pmfPropLock}} is a static object and it disallows any other new > connection to HMS which is trying to instantiate ObjectStore. We should > either remove the lock or reduce the scope of the lock so that it is held for > a very small amount of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
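The {{pmfPropLock}} description above suggests either removing the lock or shrinking its critical section. A minimal sketch of the second option, with illustrative names only (this is NOT the actual HIVE-20740 patch, and {{NarrowLockSketch}}, its placeholder factory objects, and the boolean return value are all hypothetical): hold the static lock only while reading or swapping the shared {{pmf}}/{{prop}} state, and run the expensive per-instance initialization outside it.

```java
import java.util.Properties;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of a narrowed critical section for setConf-style
// reinitialization. Placeholder Objects stand in for the shared
// PersistenceManagerFactory and the per-instance PersistenceManager.
public class NarrowLockSketch {
  private static final ReentrantLock PMF_PROP_LOCK = new ReentrantLock();
  private static Properties prop; // shared, guarded by PMF_PROP_LOCK
  private static Object pmf;      // shared factory stand-in, guarded by PMF_PROP_LOCK

  private Object pm;              // per-instance state; no global lock needed

  /** Returns true when the shared factory had to be rebuilt. */
  public boolean setConf(Properties propsFromConf) {
    Object factory;
    boolean rebuilt = false;
    PMF_PROP_LOCK.lock();
    try {
      // The critical section covers only the shared static state.
      if (!propsFromConf.equals(prop)) {
        pmf = new Object(); // placeholder for rebuilding the factory
        prop = new Properties();
        prop.putAll(propsFromConf);
        rebuilt = true;
      }
      factory = pmf;
    } finally {
      PMF_PROP_LOCK.unlock();
    }
    // The expensive per-instance initialization now runs outside the lock,
    // so concurrent clients are no longer serialized on it.
    pm = initPersistenceManager(factory);
    return rebuilt;
  }

  private Object initPersistenceManager(Object factory) {
    return new Object(); // placeholder for pmf.getPersistenceManager()
  }
}
```

With this shape, two clients configured with identical properties contend only briefly on the equality check instead of serializing their entire initialization.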
[jira] [Commented] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
[ https://issues.apache.org/jira/browse/HIVE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699629#comment-16699629 ] Vihang Karajgaonkar commented on HIVE-20860: Failed again here https://builds.apache.org/job/PreCommit-HIVE-Build/15061/testReport/junit/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_cbo_limit_/ > Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] > -- > > Key: HIVE-20860 > URL: https://issues.apache.org/jira/browse/HIVE-20860 > Project: Hive > Issue Type: Test >Reporter: Vihang Karajgaonkar >Priority: Minor > Attachments: > 182-TestMiniLlapLocalCliDriver-vector_udf_adaptor_1.q-schema_evol_text_vec_part_llap_io.q-join_is_not_distinct_from.q-and-27-more.txt, > hive.log.gz, maven-test.txt > > > Test failed in one of the precommit jobs. Looks like there is some case where > there is additional space in the diff > {noformat} > Error Message > Client Execution succeeded but contained differences (error code = 1) after > executing cbo_limit.q > 11c11 > < 1 4 2 > --- > > 1 4 2 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
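The diff quoted above shows two rows that print identically; they differ only by invisible trailing whitespace, which is why a plain string comparison fails. A tiny illustration of that failure mode ({{sameIgnoringTrailing}} is a hypothetical helper, not Hive's actual q.out comparison):

```java
// Illustration of the cbo_limit.q failure mode: the expected and actual rows
// differ only by a trailing space that is invisible in the log output, so a
// plain equals() reports a difference.
public class WhitespaceDiff {
  // Hypothetical tolerant comparison that strips trailing whitespace first.
  public static boolean sameIgnoringTrailing(String expected, String actual) {
    return expected.replaceAll("\\s+$", "")
        .equals(actual.replaceAll("\\s+$", ""));
  }

  public static void main(String[] args) {
    String expected = "1\t4\t2";
    String actual = "1\t4\t2 "; // extra trailing space, invisible in the log
    System.out.println(expected.equals(actual));                // false
    System.out.println(sameIgnoringTrailing(expected, actual)); // true
  }
}
```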
[jira] [Commented] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction
[ https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699620#comment-16699620 ] Hive QA commented on HIVE-20775: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 45s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 38s{color} | {color:red} ql: The patch generated 3 new + 123 unchanged - 2 fixed = 126 total (was 125) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 1s{color} | {color:red} ql generated 2 new + 2310 unchanged - 2 fixed = 2312 total (was 2312) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 23m 31s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:ql | | | Dead store to tsRowSize in org.apache.hadoop.hive.ql.parse.TezCompiler.getBloomFilterBenefit(SelectOperator, ExprNodeDesc, Statistics, ExprNodeDesc) At TezCompiler.java:org.apache.hadoop.hive.ql.parse.TezCompiler.getBloomFilterBenefit(SelectOperator, ExprNodeDesc, Statistics, ExprNodeDesc) At TezCompiler.java:[line 1456] | | | Should org.apache.hadoop.hive.ql.parse.TezCompiler$SemijoinOperatorInfo be a _static_ inner class? At TezCompiler.java:inner class? 
At TezCompiler.java:[lines 1661-1675] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15062/dev-support/hive-personality.sh | | git revision | master / 0fee288 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15062/yetus/diff-checkstyle-ql.txt | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-15062/yetus/new-findbugs-ql.html | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15062/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Factor cost of each SJ reduction when costing a follow-up reduction > --- > > Key: HIVE-20775 > URL: https://issues.apache.org/jira/browse/HIVE-20775 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, > HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, > HIVE-20775.06.patch, HIVE-20775.patch > > > Currently, while costing the SJ in a plan, the stats of the a TS that is > reduced by a SJ are not adjusted after we have decided to keep a SJ in the > tree. Ideally, we could adjust the stats to take
[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699601#comment-16699601 ] Hive QA commented on HIVE-20740: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949531/HIVE-20740.13.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15540 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] (batchId=182) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15061/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15061/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15061/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12949531 - PreCommit-HIVE-Build > Remove global lock in ObjectStore.setConf method > > > Key: HIVE-20740 > URL: https://issues.apache.org/jira/browse/HIVE-20740 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, > HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, > HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, > HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch > > > The ObjectStore#setConf method has a global lock which can block other > clients in concurrent workloads. 
> {code} > @Override > @SuppressWarnings("nls") > public void setConf(Configuration conf) { > // Although an instance of ObjectStore is accessed by one thread, there > may > // be many threads with ObjectStore instances. So the static variables > // pmf and prop need to be protected with locks. > pmfPropLock.lock(); > try { > isInitialized = false; > this.conf = conf; > this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, > ConfVars.HIVE_TXN_STATS_ENABLED); > configureSSL(conf); > Properties propsFromConf = getDataSourceProps(conf); > boolean propsChanged = !propsFromConf.equals(prop); > if (propsChanged) { > if (pmf != null){ > clearOutPmfClassLoaderCache(pmf); > if (!forTwoMetastoreTesting) { > // close the underlying connection pool to avoid leaks > pmf.close(); > } > } > pmf = null; > prop = null; > } > assert(!isActiveTransaction()); > shutdown(); > // Always want to re-create pm as we don't know if it were created by > the > // most recent instance of the pmf > pm = null; > directSql = null; > expressionProxy = null; > openTrasactionCalls = 0; > currentTransaction = null; > transactionStatus = TXN_STATUS.NO_STATE; > initialize(propsFromConf); > String partitionValidationRegex = > MetastoreConf.getVar(this.conf, > ConfVars.PARTITION_NAME_WHITELIST_PATTERN); > if (partitionValidationRegex != null && > !partitionValidationRegex.isEmpty()) { > partitionValidationPattern = > Pattern.compile(partitionValidationRegex); > } else { > partitionValidationPattern = null; > } > // Note, if metrics have not been initialized this will return null, > which means we aren't > // using metrics. Thus we should always check whether this is non-null > before using. 
> MetricRegistry registry = Metrics.getRegistry(); > if (registry != null) { > directSqlErrors = > Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS); > } > this.batchSize = MetastoreConf.getIntVar(conf, > ConfVars.RAWSTORE_PARTITION_BATCH_SIZE); > if (!isInitialized) { > throw new RuntimeException( > "Unable to create persistence manager. Check dss.log for details"); > } else { > LOG.debug("Initialized ObjectStore"); > } > } finally { > pmfPropLock.unlock(); > } > } > {code} > The {{pmfPropLock}} is a static object and it disallows any other new > connection to HMS which is trying to instantiate ObjectStore. We should > either remove the lock or reduce the scope of the lock so that it is held for > a very small amount of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set
[ https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roohi Syeda updated HIVE-20819: --- Description: Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set. The connections created are in ESTABLISHED state and never closed *More Details :* When a new query is executed for a new session The handler thread, calls line 66 HiveSessionImplwithUGI (UserGroupInformation.createProxyUser(owner, UserGroupInformation.getLoginUser()); At *query compile time*, this sessionUgi is used to open MS connection by *handler* thread Later at *query run time*, line 277 of SQLOperation Runnable work = new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), SessionState.get(),asyncPrepare); getCurrentUGI(); is used to create a new proxy user, which in turn calls Utils.getUGI (see below) and passed to the *Background* thread {code:java} public static UserGroupInformation getUGI() throws LoginException, IOException { String doAs = System.getenv("HADOOP_USER_NAME"); if(doAs != null && doAs.length() > 0) { /* * this allows doAs (proxy user) to be passed along across process boundary where * delegation tokens are not supported. For example, a DDL stmt via WebHCat with * a doAs parameter, forks to 'hcat' which needs to start a Session that * proxies the end user */ return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser()); } return UserGroupInformation.getCurrentUser(); } {code} currentUGI creates a *new* proxyuser instance. This ugi is being set on the background thread And when it is trying to get the Hive db in subsequent calls, we see that since the ugi’s are not equal (See the equals code below), a new connection is opened, which is never closed, by background thread. 
Line 318 in Hive.java {code:java} private static Hive getInternal(HiveConf c, boolean needsRefresh, boolean isFastCheck, boolean doRegisterAllFns) throws HiveException { Hive db = hiveDB.get(); if (db == null || !db.isCurrentUserOwner() || needsRefresh || (c != null && !isCompatible(db, c, isFastCheck))) { db = create(c, false, db, doRegisterAllFns); } if (c != null) { db.conf = c; } return db; } private boolean isCurrentUserOwner() throws HiveException { try { return owner == null || owner.equals(UserGroupInformation.getCurrentUser()); } catch(IOException e) { throw new HiveException("Error getting current user: " + e.getMessage(), e); } } /** * Compare the subjects to see if they are equal to each other. */ @Override public boolean equals(Object o) { if (o == this) { return true; } else if (o == null || getClass() != o.getClass()) { return false; } else { return subject == ((UserGroupInformation) o).subject; } } {code} Solution: When we assign *currentUGI* to the bg thread, we should call UserGroupInformation.getCurrentUser() (see below) instead of calling *getUGI* method listed above (which creates a new instance of proxy user and subject and will always return isCurrentUserOwner as false, since both subject and ugi instances are different and hence creates a new MS connection) {code:java} /** * Return the current user, including any doAs in the current stack. */ public synchronized static UserGroupInformation getCurrentUser() throws IOException { AccessControlContext context = AccessController.getContext(); Subject subject = Subject.getSubject(context); if (subject == null || subject.getPrincipals(User.class).isEmpty()) { return getLoginUser(); } else { return new UserGroupInformation(subject); } } {code} was: Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set. 
The connections created are in ESTABLISHED state and never closed *More Details :* When a new query is executed for a new session The handler thread, calls line 66 HiveSessionImplwithUGI (UserGroupInformation.createProxyUser(owner, UserGroupInformation.getLoginUser()); At *query compile time*, this sessionUgi is used to open MS connection by *handler* thread Later at *query run time*, line 277 of SQLOperation Runnable work = new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), SessionState.get(),asyncPrepare); getCurrentUGI(); is used to create a new proxy user, which in turn calls Utils.getUGI (see below) and passed to the *Background* thread public static UserGroupInformation *getUGI*() throws LoginException, IOException { String doAs = System.getenv("HADOOP_USER_NAME"); if(doAs != null && doAs.length() > 0) { /* * this allows doAs (proxy user) to be passed along across process boundary where * delegation tokens are not
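The {{equals}} method quoted in the description compares {{Subject}} instances by reference, which is the crux of the leak. A small stand-in model (this class only mimics that one aspect of Hadoop's {{UserGroupInformation}}; it is not the real implementation) shows why a UGI built around a freshly created Subject can never match the cached owner, forcing a new metastore connection:

```java
import javax.security.auth.Subject;

// Stand-in model for UserGroupInformation's identity-based equals(): two
// wrappers are equal only when they hold the very same Subject reference.
// Utils.getUGI's createProxyUser path creates a new Subject per call, so the
// resulting UGI never equals the cached owner and Hive.getInternal builds a
// fresh connection each time; getCurrentUser() re-wraps the existing Subject.
public class UgiEqualitySketch {
  private final Subject subject;

  public UgiEqualitySketch(Subject subject) {
    this.subject = subject;
  }

  @Override
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    } else if (o == null || getClass() != o.getClass()) {
      return false;
    }
    // Reference comparison, mirroring the quoted UGI equals().
    return subject == ((UgiEqualitySketch) o).subject;
  }

  @Override
  public int hashCode() {
    return System.identityHashCode(subject);
  }

  public static void main(String[] args) {
    Subject shared = new Subject();
    // Re-wrapping the same Subject (the getCurrentUser() fix) stays equal:
    System.out.println(new UgiEqualitySketch(shared)
        .equals(new UgiEqualitySketch(shared)));        // true
    // A fresh Subject per call (the createProxyUser path) never matches:
    System.out.println(new UgiEqualitySketch(shared)
        .equals(new UgiEqualitySketch(new Subject()))); // false
  }
}
```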
[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699576#comment-16699576 ] Hive QA commented on HIVE-20740: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 35s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 59s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 3s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 6s{color} | {color:blue} standalone-metastore/metastore-server in master has 185 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 42s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 608 unchanged - 0 fixed = 609 total (was 608) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 17s{color} | {color:red} standalone-metastore/metastore-server generated 1 new + 183 unchanged - 2 fixed = 184 total (was 185) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 33m 1s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:standalone-metastore/metastore-server | | | org.apache.hadoop.hive.metastore.PersistenceManagerProvider.updatePmfProperties(Configuration) does not release lock on all paths At PersistenceManagerProvider.java:on all paths At PersistenceManagerProvider.java:[line 152] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15061/dev-support/hive-personality.sh | | git revision | master / 0fee288 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15061/yetus/diff-checkstyle-itests_hive-unit.txt | | findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-15061/yetus/new-findbugs-standalone-metastore_metastore-server.html | | modules | C: standalone-metastore/metastore-server ql itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15061/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Remove global lock in ObjectStore.setConf method > > > Key: HIVE-20740 > URL: https://issues.apache.org/jira/browse/HIVE-20740 > Project: Hive > Issue Type: Improvement >
[jira] [Updated] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set
[ https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roohi Syeda updated HIVE-20819: --- Description: Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set. The connections created are in ESTABLISHED state and never closed *More Details :* When a new query is executed for a new session The handler thread, calls line 66 HiveSessionImplwithUGI (UserGroupInformation.createProxyUser( owner, UserGroupInformation.getLoginUser()); At *query compile time*, this sessionUgi is used to open MS connection by *handler* thread Later at *query run time*, line 277 of SQLOperation Runnable work = new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), SessionState.get(), asyncPrepare); getCurrentUGI(); is used to create a new proxy user, which in turn calls Utils.getUGI (see below) and passed to the *Background* thread public static UserGroupInformation *getUGI*() throws LoginException, IOException { String doAs = System.getenv("HADOOP_USER_NAME"); if(doAs != null && doAs.length() > 0) { /* * this allows doAs (proxy user) to be passed along across process boundary where * delegation tokens are not supported. For example, a DDL stmt via WebHCat with * a doAs parameter, forks to 'hcat' which needs to start a Session that * proxies the end user */ return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser()); } return UserGroupInformation.getCurrentUser(); } currentUGI creates a *new* proxyuser instance. This ugi is being set on the background thread And when it is trying to get the Hive db in subsequent calls, we see that since the ugi’s are not equal (See the equals code below), a new connection is opened, which is never closed, by background thread. 
Line 318 in Hive.java private static Hive getInternal(HiveConf c, boolean needsRefresh, boolean isFastCheck, boolean doRegisterAllFns) throws HiveException { Hive db = hiveDB.get(); if (db == null || !db.*isCurrentUserOwner*() || needsRefresh || (c != null && !isCompatible(db, c, isFastCheck))) { db = create(c, false, db, doRegisterAllFns); } if (c != null) { db.conf = c; } return db; } private boolean isCurrentUserOwner() throws HiveException { try { return owner == null || owner.equals(UserGroupInformation.getCurrentUser()); } catch(IOException e) { throw new HiveException("Error getting current user: " + e.getMessage(), e); } } /** * Compare the subjects to see if they are equal to each other. */ @Override public boolean *equals*(Object o) { if (o == this) { return true; } else if (o == null || getClass() != o.getClass()) { return false; } else { return subject == ((UserGroupInformation) o).subject; } } Solution: When we assign *currentUGI* to the bg thread, we should call UserGroupInformation.getCurrentUser() (see below) instead of calling *getUGI* method listed above (which creates a new instance of proxy user and subject and will always return isCurrentUserOwner as false, since both subject and ugi instances are different and hence creates a new MS connection) /** * Return the current user, including any doAs in the current stack. */ public synchronized static UserGroupInformation getCurrentUser() throws IOException { AccessControlContext context = AccessController.getContext(); Subject subject = Subject.getSubject(context); if (subject == null || subject.getPrincipals(User.class).isEmpty()) { return getLoginUser(); } else { return new UserGroupInformation(subject); } } was: Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set. 
The connections created are in ESTABLISHED state and never closed > Leaking Metastore connections when HADOOP_USER_NAME environmental variable is > set > - > > Key: HIVE-20819 > URL: https://issues.apache.org/jira/browse/HIVE-20819 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Roohi Syeda >Assignee: Roohi Syeda >Priority: Minor > Attachments: HIVE-20819.1.patch > > > Leaking Metastore connections when HADOOP_USER_NAME environmental variable is > set. > The connections created are in ESTABLISHED state and never closed > > *More Details :* > When a new query is executed for a new session > > The handler thread, calls line 66 HiveSessionImplwithUGI > (UserGroupInformation.createProxyUser( > owner,
[jira] [Commented] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
[ https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699555#comment-16699555 ] Vihang Karajgaonkar commented on HIVE-20971: LGTM +1 pending tests > TestJdbcWithDBTokenStore[*] should both use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb > --- > > Key: HIVE-20971 > URL: https://issues.apache.org/jira/browse/HIVE-20971 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20971.patch > > > The original intent was to use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set
[ https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roohi Syeda updated HIVE-20819: --- Description: Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set. The connections created are in ESTABLISHED state and never closed *More Details :* When a new query is executed for a new session The handler thread, calls line 66 HiveSessionImplwithUGI (UserGroupInformation.createProxyUser(owner, UserGroupInformation.getLoginUser()); At *query compile time*, this sessionUgi is used to open MS connection by *handler* thread Later at *query run time*, line 277 of SQLOperation Runnable work = new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), SessionState.get(),asyncPrepare); getCurrentUGI(); is used to create a new proxy user, which in turn calls Utils.getUGI (see below) and passed to the *Background* thread public static UserGroupInformation *getUGI*() throws LoginException, IOException { String doAs = System.getenv("HADOOP_USER_NAME"); if(doAs != null && doAs.length() > 0) { /* * this allows doAs (proxy user) to be passed along across process boundary where * delegation tokens are not supported. For example, a DDL stmt via WebHCat with * a doAs parameter, forks to 'hcat' which needs to start a Session that * proxies the end user */ return UserGroupInformation.createProxyUser(doAs, UserGroupInformation.getLoginUser()); } return UserGroupInformation.getCurrentUser(); } currentUGI creates a *new* proxyuser instance. This ugi is being set on the background thread And when it is trying to get the Hive db in subsequent calls, we see that since the ugi’s are not equal (See the equals code below), a new connection is opened, which is never closed, by background thread. 
Line 318 in Hive.java private static Hive getInternal(HiveConf c, boolean needsRefresh, boolean isFastCheck, boolean doRegisterAllFns) throws HiveException { Hive db = hiveDB.get(); if (db == null || !db.*isCurrentUserOwner*() || needsRefresh || (c != null && !isCompatible(db, c, isFastCheck))) { db = create(c, false, db, doRegisterAllFns); } if (c != null) { db.conf = c; } return db; } private boolean isCurrentUserOwner() throws HiveException { try { return owner == null || owner.equals(UserGroupInformation.getCurrentUser()); } catch(IOException e) { throw new HiveException("Error getting current user: " + e.getMessage(), e); } } /** * Compare the subjects to see if they are equal to each other. */ @Override public boolean *equals*(Object o) { if (o == this) { return true; } else if (o == null || getClass() != o.getClass()) { return false; } else { return subject == ((UserGroupInformation) o).subject; } } Solution: When we assign *currentUGI* to the bg thread, we should call UserGroupInformation.getCurrentUser() (see below) instead of calling *getUGI* method listed above (which creates a new instance of proxy user and subject and will always return isCurrentUserOwner as false, since both subject and ugi instances are different and hence creates a new MS connection) /** * Return the current user, including any doAs in the current stack. */ public synchronized static UserGroupInformation getCurrentUser() throws IOException { AccessControlContext context = AccessController.getContext(); Subject subject = Subject.getSubject(context); if (subject == null || subject.getPrincipals(User.class).isEmpty()) { return getLoginUser(); } else { return new UserGroupInformation(subject); } } was: Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set. 
The connections created are in ESTABLISHED state and never closed *More Details :* When a new query is executed for a new session The handler thread, calls line 66 HiveSessionImplwithUGI (UserGroupInformation.createProxyUser( owner, UserGroupInformation.getLoginUser()); At *query compile time*, this sessionUgi is used to open MS connection by *handler* thread Later at *query run time*, line 277 of SQLOperation Runnable work = new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), SessionState.get(), asyncPrepare); getCurrentUGI(); is used to create a new proxy user, which in turn calls Utils.getUGI (see below) and passed to the *Background* thread public static UserGroupInformation *getUGI*() throws LoginException, IOException { String doAs = System.getenv("HADOOP_USER_NAME"); if(doAs != null && doAs.length() > 0) { /* * this allows doAs (proxy user) to be passed along across process boundary where * delegation tokens are not
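The subject-identity {{equals}} quoted in the HIVE-20819 description above is the crux of the leak, and it can be reproduced without Hadoop. In this sketch, {{FakeUgi}} and the nested {{Subject}} are illustrative stand-ins for {{UserGroupInformation}} and {{javax.security.auth.Subject}}, not the real classes:

```java
// Hadoop-free sketch of the identity-based equals() described above.
// FakeUgi / Subject are illustrative stand-ins, not the real Hadoop API.
public class UgiEqualsSketch {
    static final class Subject {}   // stands in for javax.security.auth.Subject

    static final class FakeUgi {
        final Subject subject;
        FakeUgi(Subject subject) { this.subject = subject; }

        // Mirrors UserGroupInformation.equals(): reference identity on the
        // Subject, so two wrappers around *different* Subjects never match.
        @Override public boolean equals(Object o) {
            if (o == this) { return true; }
            if (o == null || getClass() != o.getClass()) { return false; }
            return subject == ((FakeUgi) o).subject;
        }
        @Override public int hashCode() { return System.identityHashCode(subject); }
    }

    public static void main(String[] args) {
        // getUGI() builds a brand-new proxy user (new Subject) on every call,
        // so the background thread's UGI never equals the session owner's ...
        FakeUgi sessionUgi = new FakeUgi(new Subject());
        FakeUgi backgroundUgi = new FakeUgi(new Subject());
        System.out.println(sessionUgi.equals(backgroundUgi));   // false -> new MS connection

        // ... whereas reusing the Subject already on the stack (the
        // getCurrentUser() path) makes the owner check pass.
        Subject shared = new Subject();
        System.out.println(new FakeUgi(shared).equals(new FakeUgi(shared)));  // true
    }
}
```

Under this model, the proposed fix (handing {{UserGroupInformation.getCurrentUser()}} to the background thread) makes {{isCurrentUserOwner}} return true, so the cached metastore client is reused instead of a fresh connection being opened and leaked.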
[jira] [Updated] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Sinkovits updated HIVE-20440: --- Attachment: HIVE-20440.14.patch.txt > Create better cache eviction policy for SmallTableCache > --- > > Key: HIVE-20440 > URL: https://issues.apache.org/jira/browse/HIVE-20440 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, > HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, > HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, > HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, > HIVE-20440.12.patch, HIVE-20440.13.patch, HIVE-20440.14.patch.txt > > > Enhance the SmallTableCache, to use guava cache with soft references, so that > we evict when there is memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20819) Leaking Metastore connections when HADOOP_USER_NAME environmental variable is set
[ https://issues.apache.org/jira/browse/HIVE-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699562#comment-16699562 ] Roohi Syeda commented on HIVE-20819: Logs 2018-10-23 10:52:58,516 DEBUG hive.ql.parse.ParseDriver: [HiveServer2-Handler-Pool: Thread-54]: Parsing command: drop table empdata 2018-10-23 10:52:58,516 DEBUG hive.ql.parse.ParseDriver: [HiveServer2-Handler-Pool: Thread-54]: Parse Completed 2018-10-23 10:52:58,566 INFO hive.metastore: [HiveServer2-Handler-Pool: Thread-54]: Trying to connect to metastore with URI thrift://X:9083 *2018-10-23 10:52:58,567 INFO hive.metastore: [**HiveServer2-Handler-Pool**: Thread-54]: Opened a connection to metastore, current connections: 4* 2018-10-23 10:52:58,569 INFO hive.metastore: [HiveServer2-Handler-Pool: Thread-54]: Connected to metastore. 2018-10-23 10:52:58,698 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Handler-Pool: Thread-54]: Semantic Analysis Completed 2018-10-23 10:52:58,699 INFO hive.ql.metadata.Hive: [HiveServer2-Handler-Pool: Thread-54]: Dumping metastore api call timing information for : compilation phase 2018-10-23 10:52:58,699 DEBUG hive.ql.metadata.Hive: [HiveServer2-Handler-Pool: Thread-54]: Total time spent in each metastore function (ms): \{getTable_(String, String, )=129} 2018-10-23 10:52:58,699 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Handler-Pool: Thread-54]: Completed compiling command(queryId=hive_20181023105252_d3247a1c-e343-470b-aa46-a692b5ade414); Time taken: 0.183 seconds 2018-10-23 10:52:58,699 DEBUG org.apache.hadoop.security.UserGroupInformation: *[HiveServer2-**Handler-Pool**: Thread-54]:* *PrivilegedAction as:hive (auth:SIMPLE)* *from:org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)* 2018-10-23 10:52:58,699 DEBUG org.apache.hive.service.cli.CLIService: [HiveServer2-Handler-Pool: Thread-54]: SessionHandle [44f74bd9-1a71-458e-9992-e9d8afc3a958]: executeStatementAsync() 2018-10-23 10:52:58,700 DEBUG 
org.apache.hadoop.security.UserGroupInformation: [HiveServer2-Background-Pool: Thread-56]: PrivilegedAction as:hive (auth:PROXY) via hive (auth:SIMPLE) from:org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314) 2018-10-23 10:52:58,700 DEBUG org.apache.thrift.transport.TSaslTransport: [HiveServer2-Handler-Pool: Thread-54]: writing data length: 109 *2018-10-23 10:52:58,715 DEBUG hive.ql.metadata.Hive: [HiveServer2-Background-Pool: Thread-56]: Creating new db. db.isCurrentUserOwner = false* *2018-10-23 10:52:58,715 DEBUG hive.ql.metadata.Hive: [HiveServer2-Background-Pool: Thread-56]: Closing current thread's connection to Hive Metastore.* *2018-10-23 10:52:58,715 INFO hive.metastore: [HiveServer2-Background-Pool: Thread-56]: Closed a connection to metastore, current connections: 3* 2018-10-23 10:52:58,716 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-56]: 2018-10-23 10:52:58,716 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-56]: 2018-10-23 10:52:58,716 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-56]: Starting task [Stage-0:DDL] in serial mode 2018-10-23 10:52:58,717 INFO hive.metastore: [HiveServer2-Background-Pool: Thread-56]: Trying to connect to metastore with URI thrift://:9083 *2018-10-23 10:52:58,717 INFO hive.metastore: [HiveServer2-**Background-Pool**: Thread-56]: Opened a connection to metastore, current connections: 4* 2018-10-23 10:52:58,720 INFO hive.metastore: [HiveServer2-Background-Pool: Thread-56]: Connected to metastore. 
> Leaking Metastore connections when HADOOP_USER_NAME environmental variable is > set > - > > Key: HIVE-20819 > URL: https://issues.apache.org/jira/browse/HIVE-20819 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Roohi Syeda >Assignee: Roohi Syeda >Priority: Minor > Attachments: HIVE-20819.1.patch > > > Leaking Metastore connections when HADOOP_USER_NAME environmental variable is > set. > The connections created are in ESTABLISHED state and never closed > > *More Details :* > When a new query is executed for a new session > > The handler thread, calls line 66 HiveSessionImplwithUGI > (UserGroupInformation.createProxyUser( > owner, UserGroupInformation.getLoginUser()); > > At *query compile time*, this sessionUgi is used to open MS connection by > *handler* thread > Later at *query run time*, line 277 of SQLOperation > Runnable work = > new BackgroundWork(getCurrentUGI(), parentSession.getSessionHive(), > SessionState.get(), > asyncPrepare); > > getCurrentUGI(); is used to create a new proxy user, which
[jira] [Commented] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699559#comment-16699559 ] Vihang Karajgaonkar commented on HIVE-20740: The test failures occur only on the precommit job. The logs do not have enough information to debug these failures. I will try to observe on the ptest server itself while the batch containing the {{TestObjectStore}} test is running. Batch 230 has {{TestObjectStore}} {noformat} 2018-11-26 20:25:19,796 DEBUG [TestExecutor] ExecutionPhase.execute:98 PBatch: UnitTestBatch [name=230_UTBatch_standalone-metastore__metastore-server_20_tests, id=230, moduleName=standalone-metastore/metastore-server, batchSize=20, isParallel=true, testList=[TestMetaStoreConnectionUrlHook, TestSchemaToolForMetastore, TestMetastoreSchemaTool, TestMetaStoreSchemaFactory, TestRetryingHMSHandler, TestAdminUser, TestJSONMessageDeserializer, TestCatalogNonDefaultSvr, TestObjectStoreInitRetry, TestHdfsUtils, TestMetaStoreServerUtils, TestHiveMetaStoreSchemaMethods, TestOldSchema, TestCachedStore, TestCatalogCaching, TestDeadline, TestMetaStoreListenersError, TestMetaStoreEventListenerOnlyOnCommit, TestMetaStoreSchemaInfo, TestMarkPartition]] {noformat} > Remove global lock in ObjectStore.setConf method > > > Key: HIVE-20740 > URL: https://issues.apache.org/jira/browse/HIVE-20740 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, > HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, > HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, > HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch > > > The ObjectStore#setConf method has a global lock which can block other > clients in concurrent workloads. 
> {code} > @Override > @SuppressWarnings("nls") > public void setConf(Configuration conf) { > // Although an instance of ObjectStore is accessed by one thread, there > may > // be many threads with ObjectStore instances. So the static variables > // pmf and prop need to be protected with locks. > pmfPropLock.lock(); > try { > isInitialized = false; > this.conf = conf; > this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, > ConfVars.HIVE_TXN_STATS_ENABLED); > configureSSL(conf); > Properties propsFromConf = getDataSourceProps(conf); > boolean propsChanged = !propsFromConf.equals(prop); > if (propsChanged) { > if (pmf != null){ > clearOutPmfClassLoaderCache(pmf); > if (!forTwoMetastoreTesting) { > // close the underlying connection pool to avoid leaks > pmf.close(); > } > } > pmf = null; > prop = null; > } > assert(!isActiveTransaction()); > shutdown(); > // Always want to re-create pm as we don't know if it were created by > the > // most recent instance of the pmf > pm = null; > directSql = null; > expressionProxy = null; > openTrasactionCalls = 0; > currentTransaction = null; > transactionStatus = TXN_STATUS.NO_STATE; > initialize(propsFromConf); > String partitionValidationRegex = > MetastoreConf.getVar(this.conf, > ConfVars.PARTITION_NAME_WHITELIST_PATTERN); > if (partitionValidationRegex != null && > !partitionValidationRegex.isEmpty()) { > partitionValidationPattern = > Pattern.compile(partitionValidationRegex); > } else { > partitionValidationPattern = null; > } > // Note, if metrics have not been initialized this will return null, > which means we aren't > // using metrics. Thus we should always check whether this is non-null > before using. 
> MetricRegistry registry = Metrics.getRegistry(); > if (registry != null) { > directSqlErrors = > Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS); > } > this.batchSize = MetastoreConf.getIntVar(conf, > ConfVars.RAWSTORE_PARTITION_BATCH_SIZE); > if (!isInitialized) { > throw new RuntimeException( > "Unable to create persistence manager. Check dss.log for details"); > } else { > LOG.debug("Initialized ObjectStore"); > } > } finally { > pmfPropLock.unlock(); > } > } > {code} > The {{pmfPropLock}} is a static object and it disallows any other new > connection to HMS which is trying to instantiate ObjectStore. We should > either remove the lock or reduce the scope of the lock so that it is held for > a very small amount of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
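One way to "reduce the scope of the lock", as the HIVE-20740 description above suggests, is to hold the static lock only for the shared {{pmf}}/{{prop}} handoff and let the slow per-instance initialization proceed unlocked. A minimal sketch of that shape (plain {{Object}} placeholders stand in for the DataNucleus factory; this is not the actual Hive patch):

```java
import java.util.Properties;
import java.util.concurrent.locks.ReentrantLock;

public class ScopedLockSketch {
    // Shared across all instances, guarded by FACTORY_LOCK
    // (stands in for ObjectStore's static pmf/prop pair).
    private static final ReentrantLock FACTORY_LOCK = new ReentrantLock();
    private static Properties prop;
    private static Object pmf;

    // Per-instance state needs no global lock at all.
    private Object pm;

    public void setConf(Properties propsFromConf) {
        Object factory;
        // Hold the global lock only for the shared-factory handoff ...
        FACTORY_LOCK.lock();
        try {
            if (pmf == null || !propsFromConf.equals(prop)) {
                pmf = new Object();   // stands in for re-creating the factory
                prop = (Properties) propsFromConf.clone();
            }
            factory = pmf;
        } finally {
            FACTORY_LOCK.unlock();
        }
        // ... so the per-instance initialization runs without blocking
        // other clients that are constructing their own ObjectStore.
        pm = "pm-from-" + factory;
    }

    public Object getPm() { return pm; }
}
```

The critical section shrinks from the whole of {{setConf}} (including {{initialize}}, SSL setup, and regex compilation) to a few reference assignments, which is what lets concurrent clients proceed.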
[jira] [Comment Edited] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699559#comment-16699559 ] Vihang Karajgaonkar edited comment on HIVE-20740 at 11/26/18 8:41 PM: -- The test failures occur only on the precommit job. The logs do not have enough information to debug these failures. I will try to observe on the ptest server itself while the batch containing the {{TestObjectStore}} test is running. Batch 230 has {{TestObjectStore}} 2018-11-26 20:25:19,796 DEBUG [TestExecutor] ExecutionPhase.execute:98 PBatch: UnitTestBatch [name=230_UTBatch_standalone-metastore__metastore-server_20_tests, id=230, moduleName=standalone-metastore/metastore-server, batchSize=20, isParallel=true, testList=[TestMetaStoreConnectionUrlHook, TestSchemaToolForMetastore, TestMetastoreSchemaTool, TestMetaStoreSchemaFactory, TestRetryingHMSHandler, TestAdminUser, TestJSONMessageDeserializer, TestCatalogNonDefaultSvr, TestObjectStoreInitRetry, TestHdfsUtils, TestMetaStoreServerUtils, TestHiveMetaStoreSchemaMethods, TestOldSchema, TestCachedStore, TestCatalogCaching, TestDeadline, TestMetaStoreListenersError, TestMetaStoreEventListenerOnlyOnCommit, TestMetaStoreSchemaInfo, TestMarkPartition]] was (Author: vihangk1): The test failures occur only on the precommit job. The logs do not have enough information to debug these failures. I will try to observe on the ptest server itself while the batch containing the {{TestObjectStore}} test is running. 
Batch 230 has {{TestObjectStore}} {noformat} 2018-11-26 20:25:19,796 DEBUG [TestExecutor] ExecutionPhase.execute:98 PBatch: UnitTestBatch [name=230_UTBatch_standalone-metastore__metastore-server_20_tests, id=230, moduleName=standalone-metastore/metastore-server, batchSize=20, isParallel=true, testList=[TestMetaStoreConnectionUrlHook, TestSchemaToolForMetastore, TestMetastoreSchemaTool, TestMetaStoreSchemaFactory, TestRetryingHMSHandler, TestAdminUser, TestJSONMessageDeserializer, TestCatalogNonDefaultSvr, TestObjectStoreInitRetry, TestHdfsUtils, TestMetaStoreServerUtils, TestHiveMetaStoreSchemaMethods, TestOldSchema, TestCachedStore, TestCatalogCaching, TestDeadline, TestMetaStoreListenersError, TestMetaStoreEventListenerOnlyOnCommit, TestMetaStoreSchemaInfo, TestMarkPartition]] {noformat} > Remove global lock in ObjectStore.setConf method > > > Key: HIVE-20740 > URL: https://issues.apache.org/jira/browse/HIVE-20740 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, > HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, > HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, > HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch > > > The ObjectStore#setConf method has a global lock which can block other > clients in concurrent workloads. > {code} > @Override > @SuppressWarnings("nls") > public void setConf(Configuration conf) { > // Although an instance of ObjectStore is accessed by one thread, there > may > // be many threads with ObjectStore instances. So the static variables > // pmf and prop need to be protected with locks. 
> pmfPropLock.lock(); > try { > isInitialized = false; > this.conf = conf; > this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, > ConfVars.HIVE_TXN_STATS_ENABLED); > configureSSL(conf); > Properties propsFromConf = getDataSourceProps(conf); > boolean propsChanged = !propsFromConf.equals(prop); > if (propsChanged) { > if (pmf != null){ > clearOutPmfClassLoaderCache(pmf); > if (!forTwoMetastoreTesting) { > // close the underlying connection pool to avoid leaks > pmf.close(); > } > } > pmf = null; > prop = null; > } > assert(!isActiveTransaction()); > shutdown(); > // Always want to re-create pm as we don't know if it were created by > the > // most recent instance of the pmf > pm = null; > directSql = null; > expressionProxy = null; > openTrasactionCalls = 0; > currentTransaction = null; > transactionStatus = TXN_STATUS.NO_STATE; > initialize(propsFromConf); > String partitionValidationRegex = > MetastoreConf.getVar(this.conf, > ConfVars.PARTITION_NAME_WHITELIST_PATTERN); > if (partitionValidationRegex != null && > !partitionValidationRegex.isEmpty()) { > partitionValidationPattern = > Pattern.compile(partitionValidationRegex); > } else { > partitionValidationPattern = null; > } > // Note, if metrics have not been initialized
[jira] [Commented] (HIVE-20932) Vectorize Druid Storage Handler Reader
[ https://issues.apache.org/jira/browse/HIVE-20932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699537#comment-16699537 ] Gopal V commented on HIVE-20932: [~bslim]: LGTM - +1 minor nit: there's a new array list allocation for each loop, which seems somewhat of a GC thrash for no good reason. Making a DruidSerdeRow class extending ArrayList would fix that & make it less functional, but more allocation friendly. > Vectorize Druid Storage Handler Reader > -- > > Key: HIVE-20932 > URL: https://issues.apache.org/jira/browse/HIVE-20932 > Project: Hive > Issue Type: Improvement > Components: Druid integration >Reporter: slim bouguerra >Assignee: slim bouguerra >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20932.3.patch, HIVE-20932.4.patch, > HIVE-20932.5.patch, HIVE-20932.6.patch, HIVE-20932.7.patch, > HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.8.patch, HIVE-20932.patch > > > This patch aims at adding support for vectorize read of data from Druid to > Hive. > [~t3rmin4t0r] suggested that this will improve the performance of the top > level operators that supports vectorization. > As a first cut am just adding a wrapper around the existing Record Reader to > read up to 1024 row at a time. > Future work will be to avoid going via old reader and convert straight the > Json (smile format) to Vector primitive types. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
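The reviewer's nit above (a new array list allocated on every loop iteration) can be sketched as follows. {{DruidSerdeRow}} here is only the hypothetical class name from the comment; the actual reader code under review is not reproduced in this thread:

```java
import java.util.ArrayList;

public class RowReuseSketch {
    // Hypothetical row type from the review comment: a reusable ArrayList
    // subclass the reader can clear and refill, so each row does not
    // create fresh garbage for the collector.
    static final class DruidSerdeRow extends ArrayList<Object> {}

    public static void main(String[] args) {
        DruidSerdeRow row = new DruidSerdeRow();   // allocated once per reader
        for (int rowNum = 0; rowNum < 1024; rowNum++) {
            row.clear();                           // reuse instead of `new ArrayList<>()`
            for (int col = 0; col < 4; col++) {
                row.add(rowNum * 10 + col);        // fill the current row's columns
            }
            // ... hand `row` to the consumer here; it must not retain it.
        }
        System.out.println(row.size());            // 4 (the last row's columns)
    }
}
```

The trade-off is the one the comment names: the consumer must copy values out before the next iteration (less functional style), in exchange for one allocation per reader instead of one per row.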
[jira] [Updated] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
[ https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-20971: -- Status: Patch Available (was: Open) [~vihangk1]: Could you please review? Thanks, Peter > TestJdbcWithDBTokenStore[*] should both use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb > --- > > Key: HIVE-20971 > URL: https://issues.apache.org/jira/browse/HIVE-20971 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20971.patch > > > The original intent was to use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
[ https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-20971: -- Attachment: HIVE-20971.patch > TestJdbcWithDBTokenStore[*] should both use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb > --- > > Key: HIVE-20971 > URL: https://issues.apache.org/jira/browse/HIVE-20971 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20971.patch > > > The original intent was to use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699519#comment-16699519 ] Hive QA commented on HIVE-20969: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949524/HIVE-20969.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15539 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15060/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15060/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15060/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12949524 - PreCommit-HIVE-Build > HoS sessionId generation can cause race conditions when uploading files to > HDFS > --- > > Key: HIVE-20969 > URL: https://issues.apache.org/jira/browse/HIVE-20969 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20969.2.patch, HIVE-20969.patch > > > The observed exception is: > {code} > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) > [Lease. 
Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1] > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
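The {{/tmp/hive/_spark_session_dir/0/...}} path in the trace above hints at the race: a sessionId that restarts at 0 in every HiveServer2 process points two uploaders at the same HDFS directory, so one can clobber the other's in-flight file. A hedged sketch of the collision, with a UUID shown as one collision-free alternative (the path layout is taken from the exception; the actual HIVE-20969 patch is not reproduced here):

```java
import java.util.UUID;

public class SessionIdSketch {
    // Counter-based ids restart at 0 in each process, so two HiveServer2
    // instances can compute the same upload directory.
    static String counterBasedDir(int counter) {
        return "/tmp/hive/_spark_session_dir/" + counter;
    }

    // A UUID-based id is unique across processes, so uploads never share
    // a directory.
    static String uuidBasedDir() {
        return "/tmp/hive/_spark_session_dir/" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Two independent processes that both start counting at 0:
        System.out.println(counterBasedDir(0).equals(counterBasedDir(0))); // true  -> shared path, race
        System.out.println(uuidBasedDir().equals(uuidBasedDir()));         // false -> disjoint paths
    }
}
```

With disjoint per-session directories, one session deleting or rewriting its jars can no longer invalidate the lease another session holds, which is the failure mode the {{FileNotFoundException}} above shows.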
[jira] [Assigned] (HIVE-20971) TestJdbcWithDBTokenStore[*] should both use MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb
[ https://issues.apache.org/jira/browse/HIVE-20971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary reassigned HIVE-20971: - > TestJdbcWithDBTokenStore[*] should both use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb > --- > > Key: HIVE-20971 > URL: https://issues.apache.org/jira/browse/HIVE-20971 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > > The original intent was to use > MiniHiveKdc.getMiniHS2WithKerbWithRemoteHMSWithKerb in both cases -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20955) Calcite Rule HiveExpandDistinctAggregatesRule seems throwing IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-20955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699468#comment-16699468 ] Vineet Garg commented on HIVE-20955: [~bslim] I tried adding this test to druidmini_expressions but I am unable to run TestMiniDruidCliDriver tests on my machine. I am running into the following error: {noformat} [ERROR] testCliDriver[druidmini_expressions](org.apache.hadoop.hive.cli.TestMiniDruidCliDriver) Time elapsed: 9.077 s <<< FAILURE! java.lang.AssertionError: Failed during initFromDatasets processLine with code=2 at org.junit.Assert.fail(Assert.java:88) at org.apache.hadoop.hive.ql.QTestUtil.initDataset(QTestUtil.java:1110) at org.apache.hadoop.hive.ql.QTestUtil.initDataSetForTest(QTestUtil.java:1091) at org.apache.hadoop.hive.ql.QTestUtil.cliInit(QTestUtil.java:1148) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:182) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) at org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver(TestMiniDruidCliDriver.java:60) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) {noformat} > Calcite Rule HiveExpandDistinctAggregatesRule seems throwing > IndexOutOfBoundsException > -- > > Key: HIVE-20955 > URL: https://issues.apache.org/jira/browse/HIVE-20955 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: slim bouguerra >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-20955.1.patch > > > > Added the following query to Druid test > ql/src/test/queries/clientpositive/druidmini_expressions.q > {code} > select count(distinct `__time`, cint) from (select * from > druid_table_alltypesorc) 
as src; > {code} > leads to error \{code} 2018-11-21T07:36:39,449 ERROR [main] QTestUtil: Client > execution failed with error code = 4 running "\{code} > with exception stack > {code} > 2018-11-21T07:36:39,443 ERROR [ecd48683-0286-4cb4-b0ad-e150fab51038 main] > parse.CalcitePlanner: CBO failed, skipping CBO. > java.lang.IndexOutOfBoundsException: index (1) must be less than size (1) > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:310) > ~[guava-19.0.jar:?] > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:293) > ~[guava-19.0.jar:?] > at > com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:41) > ~[guava-19.0.jar:?] > at > org.apache.calcite.rel.metadata.RelMdColumnOrigins.getColumnOrigins(RelMdColumnOrigins.java:77) > ~[calcite-core-1.17.0.jar:1.17.0] > at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins_$(Unknown Source) > ~[?:?] > at GeneratedMetadataHandler_ColumnOrigin.getColumnOrigins(Unknown Source) > ~[?:?] 
> at > org.apache.calcite.rel.metadata.RelMetadataQuery.getColumnOrigins(RelMetadataQuery.java:345) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveExpandDistinctAggregatesRule.onMatch(HiveExpandDistinctAggregatesRule.java:168) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:315) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) > ~[calcite-core-1.17.0.jar:1.17.0] > at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) > ~[calcite-core-1.17.0.jar:1.17.0] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2363) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2314) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at >
[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699470#comment-16699470 ] Hive QA commented on HIVE-20969: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 46s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 37s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15060/dev-support/hive-personality.sh | | git revision | master / 0fee288 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15060/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > HoS sessionId generation can cause race conditions when uploading files to > HDFS > --- > > Key: HIVE-20969 > URL: https://issues.apache.org/jira/browse/HIVE-20969 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20969.2.patch, HIVE-20969.patch > > > The observed exception is: > {code} > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) > [Lease. 
Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1] > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at >
[jira] [Updated] (HIVE-20775) Factor cost of each SJ reduction when costing a follow-up reduction
[ https://issues.apache.org/jira/browse/HIVE-20775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20775: --- Attachment: HIVE-20775.06.patch > Factor cost of each SJ reduction when costing a follow-up reduction > --- > > Key: HIVE-20775 > URL: https://issues.apache.org/jira/browse/HIVE-20775 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20775.01.patch, HIVE-20775.02.patch, > HIVE-20775.03.patch, HIVE-20775.04.patch, HIVE-20775.05.patch, > HIVE-20775.06.patch, HIVE-20775.patch > > > Currently, while costing the SJ in a plan, the stats of a TS that is > reduced by a SJ are not adjusted after we have decided to keep a SJ in the > tree. Ideally, we could adjust the stats to take into account decisions that > have already been made. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
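The costing idea in the HIVE-20775 description above can be sketched as follows. This is an illustrative toy, not Hive's optimizer code: the class and method names are hypothetical, and the point is only that each follow-up semijoin (SJ) reduction should be costed against the row count already shrunk by the SJs kept so far, rather than against the original table-scan (TS) stats.

```java
// Illustrative sketch only: once an SJ is kept in the tree, the input row
// count seen by the next candidate SJ on the same table scan is the already
// reduced count, not the raw TS estimate.
final class SemijoinCostSketch {

    /** Rows remaining after applying the kept SJ selectivities in order. */
    static double adjustedRowCount(double tableScanRows, double[] keptSjSelectivities) {
        double rows = tableScanRows;
        for (double selectivity : keptSjSelectivities) {
            rows *= selectivity;   // each kept SJ shrinks the input of the next one
        }
        return rows;
    }
}
```

For example, a second SJ with selectivity 0.5 over a 1,000,000-row scan would be costed against the 100,000 rows left after a first SJ with selectivity 0.1, not against the full million.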
[jira] [Updated] (HIVE-20740) Remove global lock in ObjectStore.setConf method
[ https://issues.apache.org/jira/browse/HIVE-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-20740: --- Attachment: HIVE-20740.13.patch > Remove global lock in ObjectStore.setConf method > > > Key: HIVE-20740 > URL: https://issues.apache.org/jira/browse/HIVE-20740 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-20740.01.patch, HIVE-20740.02.patch, > HIVE-20740.04.patch, HIVE-20740.05.patch, HIVE-20740.06.patch, > HIVE-20740.08.patch, HIVE-20740.09.patch, HIVE-20740.10.patch, > HIVE-20740.11.patch, HIVE-20740.12.patch, HIVE-20740.13.patch > > > The ObjectStore#setConf method has a global lock which can block other > clients in concurrent workloads. > {code} > @Override > @SuppressWarnings("nls") > public void setConf(Configuration conf) { > // Although an instance of ObjectStore is accessed by one thread, there > may > // be many threads with ObjectStore instances. So the static variables > // pmf and prop need to be protected with locks. 
> pmfPropLock.lock(); > try { > isInitialized = false; > this.conf = conf; > this.areTxnStatsSupported = MetastoreConf.getBoolVar(conf, > ConfVars.HIVE_TXN_STATS_ENABLED); > configureSSL(conf); > Properties propsFromConf = getDataSourceProps(conf); > boolean propsChanged = !propsFromConf.equals(prop); > if (propsChanged) { > if (pmf != null){ > clearOutPmfClassLoaderCache(pmf); > if (!forTwoMetastoreTesting) { > // close the underlying connection pool to avoid leaks > pmf.close(); > } > } > pmf = null; > prop = null; > } > assert(!isActiveTransaction()); > shutdown(); > // Always want to re-create pm as we don't know if it were created by > the > // most recent instance of the pmf > pm = null; > directSql = null; > expressionProxy = null; > openTrasactionCalls = 0; > currentTransaction = null; > transactionStatus = TXN_STATUS.NO_STATE; > initialize(propsFromConf); > String partitionValidationRegex = > MetastoreConf.getVar(this.conf, > ConfVars.PARTITION_NAME_WHITELIST_PATTERN); > if (partitionValidationRegex != null && > !partitionValidationRegex.isEmpty()) { > partitionValidationPattern = > Pattern.compile(partitionValidationRegex); > } else { > partitionValidationPattern = null; > } > // Note, if metrics have not been initialized this will return null, > which means we aren't > // using metrics. Thus we should always check whether this is non-null > before using. > MetricRegistry registry = Metrics.getRegistry(); > if (registry != null) { > directSqlErrors = > Metrics.getOrCreateCounter(MetricsConstants.DIRECTSQL_ERRORS); > } > this.batchSize = MetastoreConf.getIntVar(conf, > ConfVars.RAWSTORE_PARTITION_BATCH_SIZE); > if (!isInitialized) { > throw new RuntimeException( > "Unable to create persistence manager. 
Check dss.log for details"); > } else { > LOG.debug("Initialized ObjectStore"); > } > } finally { > pmfPropLock.unlock(); > } > } > {code} > The {{pmfPropLock}} is a static object and it disallows any other new > connection to HMS which is trying to instantiate ObjectStore. We should > either remove the lock or reduce the scope of the lock so that it is held for > a very small amount of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
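The direction suggested at the end of the HIVE-20740 description, reducing the scope of the lock rather than serializing all of setConf, can be sketched roughly as below. This is not the actual patch: the names (PmfHolder, getOrCreate, createPmf) are hypothetical, and the real ObjectStore guards a shared static PersistenceManagerFactory. The sketch shows the pattern of locking only around (re)creation of the shared object, with a lock-free fast path when the configuration has not changed.

```java
import java.util.Properties;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch: only the shared factory needs lock protection;
// per-instance state can be initialized outside the critical section.
final class PmfHolder {
    private static final ReentrantLock PMF_LOCK = new ReentrantLock();
    private static volatile Object pmf;            // stand-in for PersistenceManagerFactory
    private static volatile Properties pmfProps;   // props the current pmf was built from

    /** Return a factory for the given props, rebuilding the shared one only if they changed. */
    static Object getOrCreate(Properties props) {
        Object local = pmf;
        if (local != null && props.equals(pmfProps)) {
            return local;                          // fast path: no lock when nothing changed
        }
        PMF_LOCK.lock();                           // slow path: hold the lock only to rebuild
        try {
            if (pmf == null || !props.equals(pmfProps)) {
                pmf = createPmf(props);
                pmfProps = (Properties) props.clone();
            }
            return pmf;
        } finally {
            PMF_LOCK.unlock();
        }
    }

    private static Object createPmf(Properties props) {
        // Placeholder for something like JDOHelper.getPersistenceManagerFactory(props).
        return new Object();
    }
}
```

With this shape, concurrent clients whose datasource properties match the current factory never contend on the lock; only a genuine configuration change pays the serialization cost.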
[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699448#comment-16699448 ] Hive QA commented on HIVE-20440: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949511/HIVE-20440.13.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15543 tests executed *Failed tests:* {noformat} TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=195) [druidmini_test_ts.q,druidmini_expressions.q,druid_timestamptz2.q,druidmini_test_alter.q,druidkafkamini_csv.q] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15059/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15059/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15059/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12949511 - PreCommit-HIVE-Build > Create better cache eviction policy for SmallTableCache > --- > > Key: HIVE-20440 > URL: https://issues.apache.org/jira/browse/HIVE-20440 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, > HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, > HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, > HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, > HIVE-20440.12.patch, HIVE-20440.13.patch > > > Enhance the SmallTableCache, to use guava cache with soft references, so that > we evict when there is memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
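The eviction idea in the HIVE-20440 description above (the actual patch uses a Guava cache with soft references, e.g. CacheBuilder's softValues mode) boils down to holding cached values through SoftReferences, so the JVM can reclaim entries under memory pressure instead of the cache pinning them until an explicit clear. A minimal standard-library sketch of that principle, with hypothetical names:

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch, not the SmallTableCache patch: values are softly
// reachable, so the GC may drop them when heap is tight; get() then
// observes a cleared reference and removes the stale map entry.
final class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    /** Returns null if the key is absent or its value was collected by the GC. */
    V get(K key) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (ref != null && value == null) {
            map.remove(key, ref);   // drop the stale entry once the referent is gone
        }
        return value;
    }
}
```

A real implementation would also want size bounds and expiry, which is what the Guava cache in the patch provides on top of the soft-reference behavior sketched here.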
[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699412#comment-16699412 ] Sahil Takiar commented on HIVE-20969: - +1 LGTM pending tests. > HoS sessionId generation can cause race conditions when uploading files to > HDFS > --- > > Key: HIVE-20969 > URL: https://issues.apache.org/jira/browse/HIVE-20969 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20969.2.patch, HIVE-20969.patch > > > The observed exception is: > {code} > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) > [Lease. Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1] > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699399#comment-16699399 ] Hive QA commented on HIVE-20440: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 31s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 31s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 50s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} ql: The patch generated 0 new + 54 unchanged - 2 fixed = 54 total (was 56) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 50s{color} | {color:green} ql generated 0 new + 2311 unchanged - 1 fixed = 2311 total (was 2312) {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 44s{color} | {color:green} hive-unit in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 51s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile xml | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15059/dev-support/hive-personality.sh | | git revision | master / 0fee288 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15059/yetus/diff-checkstyle-itests_hive-unit.txt | | modules | C: ql itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15059/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Create better cache eviction policy for SmallTableCache > --- > > Key: HIVE-20440 > URL: https://issues.apache.org/jira/browse/HIVE-20440 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, > HIVE-20440.03.patch,
[jira] [Updated] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-20969: -- Status: Patch Available (was: In Progress) Fixed test case > HoS sessionId generation can cause race conditions when uploading files to > HDFS > --- > > Key: HIVE-20969 > URL: https://issues.apache.org/jira/browse/HIVE-20969 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20969.2.patch, HIVE-20969.patch > > > The observed exception is: > {code} > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) > [Lease. Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1] > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-20969: -- Attachment: HIVE-20969.2.patch > HoS sessionId generation can cause race conditions when uploading files to > HDFS > --- > > Key: HIVE-20969 > URL: https://issues.apache.org/jira/browse/HIVE-20969 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20969.2.patch, HIVE-20969.patch > > > The observed exception is: > {code} > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) > [Lease. Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1] > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699373#comment-16699373 ] Peter Vary commented on HIVE-20969: --- [~stakiar]: Thanks! Exactly my thoughts. I arrived at a similar conclusion after some code digging. See the attached proposed patch. What do you think? > HoS sessionId generation can cause race conditions when uploading files to > HDFS > --- > > Key: HIVE-20969 > URL: https://issues.apache.org/jira/browse/HIVE-20969 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20969.patch > > > The observed exception is: > {code} > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) > [Lease. Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1] > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-20969 started by Peter Vary. - > HoS sessionId generation can cause race conditions when uploading files to > HDFS > --- > > Key: HIVE-20969 > URL: https://issues.apache.org/jira/browse/HIVE-20969 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > > The observed exception is: > {code} > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) > [Lease. Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1] > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95
[ https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699361#comment-16699361 ] Hive QA commented on HIVE-20954: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949508/HIVE-20954.3.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 42 failed/errored test(s), 15542 tests executed *Failed tests:* {noformat} TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=197) [druidmini_masking.q,druidmini_joins.q,druid_timestamptz.q] org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testACIDwithSchemaEvolutionAndCompaction (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAcidOrcWritePreservesFieldNames (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAcidWithSchemaEvolution (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testAlterTable (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketCodec (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testBucketizedInputFormat (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testCleanerForTxnToWriteId (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testCompactWithDelete (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDeleteIn (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testDynamicPartitionsMerge2 (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testETLSplitStrategyForACID 
(batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testEmptyInTblproperties (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testFailHeartbeater (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testFileSystemUnCaching (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInitiatorWithMultipleFailedCompactions (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwrite1 (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwrite2 (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInsertOverwriteWithSelfJoin (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge2 (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge3 (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMergeWithPredicate (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMmTableCompaction (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsert (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMultiInsertStatement (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNoHistory (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidInsert (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion02 (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion1 (batchId=320) 
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion2 (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testNonAcidToAcidConversion3 (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOpenTxnsCounter (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOrcNoPPD (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOrcPPD (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testOriginalFileReaderWhenNonAcidConvertedToAcid (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testUpdateMixedCase (batchId=320) org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testValidTxnsBookkeeping (batchId=320)
[jira] [Updated] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-20969: -- Attachment: HIVE-20969.patch > HoS sessionId generation can cause race conditions when uploading files to > HDFS > --- > > Key: HIVE-20969 > URL: https://issues.apache.org/jira/browse/HIVE-20969 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-20969.patch > > > The observed exception is: > {code} > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) > [Lease. Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1] > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699343#comment-16699343 ] Sahil Takiar commented on HIVE-20969: - The intention of HIVE-19008 was to simplify the session id logic in HoS. Before HIVE-19008, the HoS session id was a UUID that was completely independent of the Hive session id. After HIVE-19008, the HoS session id is a counter that is incremented for each new Spark session created for a given Hive session. {quote} I would assume that it would be good to connect the spark session to the hive session in every log message so it would be good if the sparkSessionId would contain the hive session id too. {quote} Adding the hive session id into the spark session id sounds like a reasonable idea to me. Logically, that is what HIVE-19008 already does. After HIVE-19008, any spark session id is globally identifiable by the Hive session id + Spark session id. Again, prior to HIVE-19008 the sparkSessionId was a UUID that was independent of the hive session id. > HoS sessionId generation can cause race conditions when uploading files to > HDFS > --- > > Key: HIVE-20969 > URL: https://issues.apache.org/jira/browse/HIVE-20969 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > > The observed exception is: > {code} > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) > [Lease. 
Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1] > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
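For illustration, the counter-plus-Hive-session-id scheme discussed in the comment above can be sketched as follows. This is a hypothetical sketch, not Hive's actual session-management code (the class and method names here are invented): combining the Hive session id with a per-session counter makes each Spark session's upload directory globally unique, instead of every Hive session racing on the same `/tmp/hive/_spark_session_dir/0/` path seen in the stack trace.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one generator per Hive session. Each new Spark
// session gets "<hiveSessionId>_<counter>", so ids from different Hive
// sessions can never collide on an HDFS upload directory.
class SparkSessionIdGenerator {
    private final String hiveSessionId;
    private final AtomicInteger counter = new AtomicInteger(0);

    SparkSessionIdGenerator(String hiveSessionId) {
        this.hiveSessionId = hiveSessionId;
    }

    String nextSparkSessionId() {
        // getAndIncrement is atomic, so concurrent Spark session creation
        // within one Hive session is also safe.
        return hiveSessionId + "_" + counter.getAndIncrement();
    }
}
```

Under this scheme the session id alone also carries the Hive/Spark linkage that the log messages need.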
[jira] [Commented] (HIVE-20963) Handle C-Style comments in hive query
[ https://issues.apache.org/jira/browse/HIVE-20963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699341#comment-16699341 ] Alan Gates commented on HIVE-20963: --- Zoltan is correct, // is not standard SQL. And Hive does support the /* */ style, as can be seen from some of the unit tests that use it, e.g. comment.q > Handle C-Style comments in hive query > - > > Key: HIVE-20963 > URL: https://issues.apache.org/jira/browse/HIVE-20963 > Project: Hive > Issue Type: Bug > Components: Parser >Reporter: Shubhangi Pardeshi >Priority: Major > > h3. Problem > Currently only the standard SQL style comment, i.e. "--", can be used in a query. > Requesting to add support for C-Style single line as well as multiline > comments. > 1. /* */ > 2. /* > */ > 3. // > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
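To make the three requested styles concrete, here is a deliberately naive, hypothetical comment stripper; it is not how Hive's parser works (Hive handles comments in the lexer), and it would mis-handle comment markers inside string literals or URLs such as http://.

```java
// Hypothetical illustration only: shows the three comment styles from the
// issue (/* ... */ block comments, possibly multi-line, and // line comments)
// plus the already-supported standard SQL "--" style.
class CommentStripper {
    static String strip(String sql) {
        // Block comments first ((?s) lets "." cross newlines), then the two
        // single-line styles. Naive: ignores quoting, so "--" or "//" inside
        // a string literal would be stripped too.
        return sql.replaceAll("(?s)/\\*.*?\\*/", " ")
                  .replaceAll("//[^\n]*", " ")
                  .replaceAll("--[^\n]*", " ");
    }
}
```

A real implementation belongs in the grammar, where string literals are already tokenized before comments are recognized.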
[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95
[ https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699304#comment-16699304 ] Hive QA commented on HIVE-20954: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 22s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 33s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 46s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 36s{color} | {color:red} ql: The patch generated 9 new + 22 unchanged - 1 fixed = 31 total (was 23) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15058/dev-support/hive-personality.sh | | git revision | master / 0fee288 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15058/yetus/diff-checkstyle-ql.txt | | modules | C: ql itests U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15058/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Vector RS operator is not using uniform hash function for TPC-DS query 95 > - > > Key: HIVE-20954 > URL: https://issues.apache.org/jira/browse/HIVE-20954 > Project: Hive > Issue Type: Improvement >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, > HIVE-20954.3.patch > > > Distribution of rows is skewed in DHJ causing slowdown. > Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator > and VectorReduceSinkLongOperator. > {code} > | Select Operator| > | expressions: ws_warehouse_sk (type: bigint), > ws_order_number (type: bigint) | > | outputColumnNames: _col0, _col1 | > | Select Vectorization:| > | className: VectorSelectOperator | > | native: true | > | projectedOutputColumnNums: [14, 16] | > | Statistics: Num rows: 7199963324 Data
[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699254#comment-16699254 ] Hive QA commented on HIVE-20794: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949506/HIVE-20794.06 {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15629 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15057/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15057/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15057/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12949506 - PreCommit-HIVE-Build > Use Zookeeper for metastore service discovery > - > > Key: HIVE-20794 > URL: https://issues.apache.org/jira/browse/HIVE-20794 > Project: Hive > Issue Type: Improvement >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, > HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06 > > > Right now, multiple metastore services can be specified in > hive.metastore.uris configuration, but that list is static and can not be > modified dynamically. Use Zookeeper for dynamic service discovery of > metastore. > h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome) > The Zookeeper related code (for service discovery) accesses Zookeeper > parameters directly from HiveConf. The class is changed so that it could be > used for both HiveServer2 and Metastore server and works with both the > configurations. 
Following methods from HiveServer2 are now moved into > ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # > removeServerInstanceFromZooKeeper > h3. HiveMetaStore conf changes > # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper > quorum. When THRIFT_SERVICE_DISCOVERY_MODE > (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are > used as ZooKeeper quorum. When it's set to be empty, the URIs are used to > locate the metastore directly. > # Here's list of Hiveserver2's parameters and their proposed metastore conf > counterparts. It looks odd that the Metastore related configurations do not > have their macros start with METASTORE, but start with THRIFT. I have just > followed naming convention used for other parameters. > ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE > (hive.metastore.zookeeper.namespace) > ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT > (hive.metastore.zookeeper.client.port) > ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT - > (hive.metastore.zookeeper.connection.timeout) > ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - > THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES > (hive.metastore.zookeeper.connection.max.retries) > ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - > THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME > (hive.metastore.zookeeper.connection.basesleeptime) > # Additional configuration THRIFT_BIND_HOST is used to specify the host > address to bind Metastore service to. Right now Metastore binds to *, i.e all > addresses. Metastore doesn't then know which of those addresses it should add > to the ZooKeeper. THRIFT_BIND_HOST solves that problem. When this > configuration is specified the metastore server binds to that address and > also adds it to the ZooKeeper if dynamic service discovery mode is ZooKeeper. > Following Hive ZK configurations seem to be related to managing locks and > seem irrelevant for MS ZK. 
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT > # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES > Since there is no configuration to be published, > HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart. > h3. HiveMetaStore class changes > # startMetaStore should also register the instance with Zookeeper, when > configured. > # When shutting a metastore server down it should deregister itself from > Zookeeper, when configured. > # These changes use the refactored code described above. > h3. HiveMetaStoreClient class changes > When service discovery mode is zookeeper, we fetch the metastore URIs from > the specified ZooKeeper and treat those as if they were specified in > THRIFT_URIS i.e. use the existing mechanisms to choose a metastore server to > connect to and
[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699233#comment-16699233 ] Hive QA commented on HIVE-20794: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 43s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 32s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 54s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 31s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 13s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 2s{color} | {color:blue} standalone-metastore/metastore-server in master has 185 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 38s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 35s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 36s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 41s{color} | {color:blue} itests/util in master has 48 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 21s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 8s{color} | {color:green} The patch standalone-metastore passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 6s{color} | {color:green} The patch metastore-common passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} The patch common passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 6s{color} | {color:green} The patch metastore-server passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} ql: The patch generated 0 new + 17 unchanged - 4 fixed = 17 total (was 21) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} The 
patch service passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} The patch hive-unit passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} The patch util passed checkstyle {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 44s{color} | {color:red} service generated 1 new + 48 unchanged - 0 fixed = 49 total (was 48) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 20s{color} | {color:green} the patch passed {color} | || || || ||
[jira] [Updated] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Sinkovits updated HIVE-20440: --- Attachment: HIVE-20440.13.patch > Create better cache eviction policy for SmallTableCache > --- > > Key: HIVE-20440 > URL: https://issues.apache.org/jira/browse/HIVE-20440 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, > HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, > HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, > HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, > HIVE-20440.12.patch, HIVE-20440.13.patch > > > Enhance the SmallTableCache, to use guava cache with soft references, so that > we evict when there is memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
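The soft-reference idea in the description above can be illustrated with a stdlib-only sketch. This is not the patch's code (the patch uses a Guava cache with soft values); the class below is a hypothetical stand-in showing why soft references give eviction under memory pressure for free: the JVM clears SoftReferences only when it would otherwise run out of heap.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical sketch of a small-table cache whose values are softly
// reachable: the GC may reclaim them under memory pressure, after which the
// next lookup simply reloads the table.
class SoftValueCache<K, V> {
    private final ConcurrentMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    V get(K key, Supplier<V> loader) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {                 // missing, or collected by the GC
            value = loader.get();
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

Guava's `CacheBuilder.softValues()` adds the pieces this sketch omits, such as per-entry locking so concurrent misses load a value only once.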
[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699154#comment-16699154 ] Antal Sinkovits commented on HIVE-20440: Test failure not related. Uploading again. > Create better cache eviction policy for SmallTableCache > --- > > Key: HIVE-20440 > URL: https://issues.apache.org/jira/browse/HIVE-20440 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, > HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, > HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, > HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, > HIVE-20440.12.patch, HIVE-20440.13.patch > > > Enhance the SmallTableCache, to use guava cache with soft references, so that > we evict when there is memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699140#comment-16699140 ] Hive QA commented on HIVE-20440: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949494/HIVE-20440.12.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15548 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=61) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15056/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15056/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15056/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12949494 - PreCommit-HIVE-Build > Create better cache eviction policy for SmallTableCache > --- > > Key: HIVE-20440 > URL: https://issues.apache.org/jira/browse/HIVE-20440 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, > HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, > HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, > HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, > HIVE-20440.12.patch > > > Enhance the SmallTableCache, to use guava cache with soft references, so that > we evict when there is memory pressure. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95
[ https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699066#comment-16699066 ] Teddy Choi commented on HIVE-20954: --- I can't reproduce it on my laptop. So I'm uploading it again to trigger a build. > Vector RS operator is not using uniform hash function for TPC-DS query 95 > - > > Key: HIVE-20954 > URL: https://issues.apache.org/jira/browse/HIVE-20954 > Project: Hive > Issue Type: Improvement >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch > > > Distribution of rows is skewed in DHJ causing slowdown. > Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator > and VectorReduceSinkLongOperator. > {code} > | Select Operator| > | expressions: ws_warehouse_sk (type: bigint), > ws_order_number (type: bigint) | > | outputColumnNames: _col0, _col1 | > | Select Vectorization:| > | className: VectorSelectOperator | > | native: true | > | projectedOutputColumnNums: [14, 16] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col1 (type: bigint) | > | sort order: + | > | Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkObjectHashOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | partitionColumnNums: [16] | > | valueColumnNums: [14] | > ++ > | Explain | > ++ > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Reduce Output Operator | > | 
key expressions: _col1 (type: bigint) | > | sort order: + | > | Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkLongOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | valueColumnNums: [14] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Execution mode: vectorized, llap | > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20954) Vector RS operator is not using uniform hash function for TPC-DS query 95
[ https://issues.apache.org/jira/browse/HIVE-20954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-20954: -- Attachment: HIVE-20954.3.patch > Vector RS operator is not using uniform hash function for TPC-DS query 95 > - > > Key: HIVE-20954 > URL: https://issues.apache.org/jira/browse/HIVE-20954 > Project: Hive > Issue Type: Improvement >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20954.1.patch, HIVE-20954.2.patch, > HIVE-20954.3.patch > > > Distribution of rows is skewed in DHJ causing slowdown. > Same RS outputs, but the two branches use VectorReduceSinkObjectHashOperator > and VectorReduceSinkLongOperator. > {code} > | Select Operator| > | expressions: ws_warehouse_sk (type: bigint), > ws_order_number (type: bigint) | > | outputColumnNames: _col0, _col1 | > | Select Vectorization:| > | className: VectorSelectOperator | > | native: true | > | projectedOutputColumnNums: [14, 16] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col1 (type: bigint) | > | sort order: + | > | Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkObjectHashOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | partitionColumnNums: [16] | > | valueColumnNums: [14] | > ++ > | Explain | > ++ > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Reduce Output Operator | > | key expressions: _col1 (type: bigint) | > | sort order: + | > | 
Map-reduce partition columns: _col1 (type: bigint) | > | Reduce Sink Vectorization: | > | className: VectorReduceSinkLongOperator | > | keyColumnNums: [16]| > | native: true | > | nativeConditionsMet: > hive.vectorized.execution.reducesink.new.enabled IS true, > hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No > DISTINCT columns IS true, BinarySortableSerDe for keys IS true, > LazyBinarySerDe for values IS true | > | valueColumnNums: [14] | > | Statistics: Num rows: 7199963324 Data size: > 115185006696 Basic stats: COMPLETE Column stats: COMPLETE | > | value expressions: _col0 (type: bigint) | > | Execution mode: vectorized, llap | > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
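The skew described in the explain plans above is easy to demonstrate in miniature. The sketch below is illustrative only, not Hive's VectorReduceSink code: when one reduce-sink branch partitions long keys with a weak hash (effectively the raw key value) and another uses a bit-mixing hash, stride-patterned keys such as surrogate ids pile up on a few reducers under the weak scheme while the mixed scheme spreads them. The multiplier constant is the usual Fibonacci-hashing constant; the method names are invented.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: count how many of `reducers` partitions actually
// receive data for stride-16 long keys (0, 16, 32, ...), comparing a weak
// hash (the key itself) with a multiplicative bit-mixing hash.
class PartitionDemo {
    static int distinctTargets(int reducers, boolean mix) {
        Set<Integer> targets = new HashSet<>();
        for (long key = 0; key < 1024; key += 16) {
            long h = mix ? key * 0x9E3779B97F4A7C15L : key;   // Fibonacci mixing
            int folded = (int) (h ^ (h >>> 32));              // fold to 32 bits
            targets.add(Math.floorMod(folded, reducers));
        }
        return targets.size();
    }
}
```

With 16 reducers, the weak hash sends every stride-16 key to a single reducer, while the mixed hash uses many of them; a DHJ built on the weak side inherits exactly this imbalance.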
[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Bapat updated HIVE-20794: -- Attachment: HIVE-20794.06 Status: Patch Available (was: In Progress) Patch fixing checkstyle, findbug errors from the previous runs. > Use Zookeeper for metastore service discovery > - > > Key: HIVE-20794 > URL: https://issues.apache.org/jira/browse/HIVE-20794 > Project: Hive > Issue Type: Improvement >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, > HIVE-20794.03, HIVE-20794.04, HIVE-20794.05, HIVE-20794.06 > > > Right now, multiple metastore services can be specified in > hive.metastore.uris configuration, but that list is static and can not be > modified dynamically. Use Zookeeper for dynamic service discovery of > metastore. > h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome) > The Zookeeper related code (for service discovery) accesses Zookeeper > parameters directly from HiveConf. The class is changed so that it could be > used for both HiveServer2 and Metastore server and works with both the > configurations. Following methods from HiveServer2 are now moved into > ZooKeeperHiveHelper. # startZookeeperClient # addServerInstanceToZooKeeper # > removeServerInstanceFromZooKeeper > h3. HiveMetaStore conf changes > # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper > quorum. When THRIFT_SERVICE_DISCOVERY_MODE > (hive.metastore.service.discovery.mode) is set to "zookeeper" the URIs are > used as ZooKeeper quorum. When it's set to be empty, the URIs are used to > locate the metastore directly. > # Here's list of Hiveserver2's parameters and their proposed metastore conf > counterparts. It looks odd that the Metastore related configurations do not > have their macros start with METASTORE, but start with THRIFT. 
I have just > followed naming convention used for other parameters. > ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE > (hive.metastore.zookeeper.namespace) > ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT > (hive.metastore.zookeeper.client.port) > ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT - > (hive.metastore.zookeeper.connection.timeout) > ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - > THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES > (hive.metastore.zookeeper.connection.max.retries) > ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - > THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME > (hive.metastore.zookeeper.connection.basesleeptime) > # Additional configuration THRIFT_BIND_HOST is used to specify the host > address to bind Metastore service to. Right now Metastore binds to *, i.e all > addresses. Metastore doesn't then know which of those addresses it should add > to the ZooKeeper. THRIFT_BIND_HOST solves that problem. When this > configuration is specified the metastore server binds to that address and > also adds it to the ZooKeeper if dynamic service discovery mode is ZooKeeper. > Following Hive ZK configurations seem to be related to managing locks and > seem irrelevant for MS ZK. > # HIVE_ZOOKEEPER_SESSION_TIMEOUT > # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES > Since there is no configuration to be published, > HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart. > h3. HiveMetaStore class changes > # startMetaStore should also register the instance with Zookeeper, when > configured. > # When shutting a metastore server down it should deregister itself from > Zookeeper, when configured. > # These changes use the refactored code described above. > h3. HiveMetaStoreClient class changes > When service discovery mode is zookeeper, we fetch the metastore URIs from > the specified ZooKeeper and treat those as if they were specified in > THRIFT_URIS i.e.
use the existing mechanisms to choose a metastore server to > connect to and establish a connection. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
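For illustration, the dual interpretation of hive.metastore.uris proposed in the description can be sketched as follows. The class and method names below are hypothetical, not the patch's actual helpers: depending on hive.metastore.service.discovery.mode, the same configuration value is read either as a ZooKeeper connect string or as a list of direct Thrift endpoints.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed hive.metastore.uris semantics:
// "zookeeper" mode -> the URIs form a ZK quorum connect string;
// empty/other mode -> the URIs are direct metastore Thrift endpoints.
class MetastoreUriResolver {
    static String zkConnectString(String discoveryMode, String uris) {
        if (!"zookeeper".equalsIgnoreCase(discoveryMode)) {
            throw new IllegalStateException("direct Thrift URIs, no ZK quorum");
        }
        // e.g. "zk1:2181,zk2:2181" is already in ZK connect-string form.
        return uris.trim();
    }

    static List<String> directUris(String discoveryMode, String uris) {
        if ("zookeeper".equalsIgnoreCase(discoveryMode)) {
            throw new IllegalStateException("URIs must be fetched from ZooKeeper");
        }
        // Comma-separated list, whitespace tolerated around the commas.
        return Arrays.asList(uris.trim().split("\\s*,\\s*"));
    }
}
```

In zookeeper mode the client would then read the registered metastore addresses from the quorum and feed them into the same connection-selection logic used for a static THRIFT_URIS list.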
[jira] [Updated] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Bapat updated HIVE-20794: -- Status: In Progress (was: Patch Available)

> Use Zookeeper for metastore service discovery
>
> Key: HIVE-20794
> URL: https://issues.apache.org/jira/browse/HIVE-20794
> Project: Hive
> Issue Type: Improvement
> Reporter: Ashutosh Bapat
> Assignee: Ashutosh Bapat
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, HIVE-20794.03, HIVE-20794.04, HIVE-20794.05
>
> Right now, multiple metastore services can be specified in the hive.metastore.uris configuration, but that list is static and cannot be modified dynamically. Use ZooKeeper for dynamic service discovery of the metastore.
>
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The ZooKeeper-related code (for service discovery) accesses ZooKeeper parameters directly from HiveConf. The class is changed so that it can be used by both HiveServer2 and the Metastore server, and works with both configurations. The following methods are moved from HiveServer2 into ZooKeeperHiveHelper:
> # startZookeeperClient
> # addServerInstanceToZooKeeper
> # removeServerInstanceFromZooKeeper
>
> h3. HiveMetaStore conf changes
> # THRIFT_URIS (hive.metastore.uris) can also be used to specify a ZooKeeper quorum. When THRIFT_SERVICE_DISCOVERY_MODE (hive.metastore.service.discovery.mode) is set to "zookeeper", the URIs are used as the ZooKeeper quorum; when it is empty, the URIs are used to locate the metastore directly.
> # Here is a list of HiveServer2's parameters and their proposed metastore conf counterparts. It looks odd that the metastore-related configuration macros start with THRIFT rather than METASTORE, but this follows the naming convention used for the other parameters.
> ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE (hive.metastore.zookeeper.namespace)
> ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT (hive.metastore.zookeeper.client.port)
> ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT (hive.metastore.zookeeper.connection.timeout)
> ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES (hive.metastore.zookeeper.connection.max.retries)
> ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME (hive.metastore.zookeeper.connection.basesleeptime)
> # An additional configuration, THRIFT_BIND_HOST, specifies the host address to bind the Metastore service to. Right now the Metastore binds to *, i.e. all addresses, so it does not know which of those addresses it should add to ZooKeeper; THRIFT_BIND_HOST solves that problem. When this configuration is specified, the metastore server binds to that address and also adds it to ZooKeeper if the dynamic service discovery mode is ZooKeeper.
>
> The following Hive ZK configurations seem to be related to managing locks and seem irrelevant for the metastore ZK:
> # HIVE_ZOOKEEPER_SESSION_TIMEOUT
> # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, HIVE_ZOOKEEPER_PUBLISH_CONFIGS does not have a THRIFT counterpart.
>
> h3. HiveMetaStore class changes
> # startMetaStore should also register the instance with ZooKeeper, when configured.
> # When shutting a metastore server down, it should deregister itself from ZooKeeper, when configured.
> # These changes use the refactored code described above.
>
> h3. HiveMetaStoreClient class changes
> When the service discovery mode is zookeeper, we fetch the metastore URIs from the specified ZooKeeper and treat them as if they were specified in THRIFT_URIS, i.e. use the existing mechanisms to choose a metastore server to connect to and establish a connection.
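Taken together, the proposed settings would let a client discover metastore instances through ZooKeeper instead of a static URI list. As a non-authoritative sketch, a hive-site.xml fragment could look like the following (the property names come from the description above; the host names and the namespace value are invented for illustration):

```xml
<!-- Illustrative values only: hosts and namespace are assumptions. -->
<property>
  <name>hive.metastore.service.discovery.mode</name>
  <value>zookeeper</value>
</property>
<property>
  <!-- In zookeeper mode, these URIs are read as the ZooKeeper quorum,
       not as direct metastore endpoints. -->
  <name>hive.metastore.uris</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
<property>
  <name>hive.metastore.zookeeper.namespace</name>
  <value>hivemetastore</value>
</property>
```

With discovery mode left empty, the same hive.metastore.uris value would instead be treated as a static list of metastore Thrift endpoints, as today.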
[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16699043#comment-16699043 ] Hive QA commented on HIVE-20440: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 45s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 45s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 43s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} ql: The patch generated 0 new + 54 unchanged - 2 fixed = 54 total (was 56) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 50s{color} | {color:green} ql generated 0 new + 2311 unchanged - 1 fixed = 2311 total (was 2312) {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 44s{color} | {color:green} hive-unit in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 28m 13s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile xml | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15056/dev-support/hive-personality.sh | | git revision | master / 0fee288 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15056/yetus/diff-checkstyle-itests_hive-unit.txt | | modules | C: ql itests/hive-unit U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15056/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Create better cache eviction policy for SmallTableCache > --- > > Key: HIVE-20440 > URL: https://issues.apache.org/jira/browse/HIVE-20440 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Antal Sinkovits >Assignee: Antal Sinkovits >Priority: Major > Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, > HIVE-20440.03.patch,
[jira] [Updated] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Sinkovits updated HIVE-20440: --- Attachment: HIVE-20440.12.patch

> Create better cache eviction policy for SmallTableCache
>
> Key: HIVE-20440
> URL: https://issues.apache.org/jira/browse/HIVE-20440
> Project: Hive
> Issue Type: Improvement
> Components: Spark
> Reporter: Antal Sinkovits
> Assignee: Antal Sinkovits
> Priority: Major
> Attachments: HIVE-20440.01.patch, HIVE-20440.02.patch, HIVE-20440.03.patch, HIVE-20440.04.patch, HIVE-20440.05.patch, HIVE-20440.06.patch, HIVE-20440.07.patch, HIVE-20440.08.patch, HIVE-20440.09.patch, HIVE-20440.10.patch, HIVE-20440.11.patch, HIVE-20440.12.patch
>
> Enhance the SmallTableCache to use a Guava cache with soft references, so that we evict when there is memory pressure.
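The eviction idea in the description — values held through soft references so the JVM can reclaim them under memory pressure — can be sketched with the JDK alone. The class and method names below are invented for illustration; this is not Hive's actual SmallTableCache, and the patch itself uses a Guava cache rather than a hand-rolled map:

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Toy cache whose values are only softly reachable: the GC may clear
 * them before an OutOfMemoryError, which gives "evict under memory
 * pressure" behavior without an explicit size bound.
 */
public class SoftValueCache<K, V> {
  private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

  public void put(K key, V value) {
    map.put(key, new SoftReference<>(value));
  }

  /** Returns null when the key is absent or its value was reclaimed. */
  public V get(K key) {
    SoftReference<V> ref = map.get(key);
    if (ref == null) {
      return null;
    }
    V value = ref.get();
    if (value == null) {
      map.remove(key); // drop the cleared entry so the map does not leak
    }
    return value;
  }

  public static void main(String[] args) {
    SoftValueCache<String, String> cache = new SoftValueCache<>();
    cache.put("small_table", "rows...");
    System.out.println(cache.get("small_table") != null);
  }
}
```

A real implementation would also need per-key loading and invalidation on query completion; soft references only decide *when* entries may disappear, not *whether* stale entries are served.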
[jira] [Commented] (HIVE-20440) Create better cache eviction policy for SmallTableCache
[ https://issues.apache.org/jira/browse/HIVE-20440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698914#comment-16698914 ] Antal Sinkovits commented on HIVE-20440: Rebase
[jira] [Commented] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698807#comment-16698807 ] Peter Vary commented on HIVE-20969: --- My current theory is that HIVE-19008 changed the sparkSessionId generation, which affected scratchDir creation. [~stakiar]: Could you help me out here? What was the original intention? I would assume it would be good to connect the Spark session to the Hive session in every log message, so the sparkSessionId should contain the Hive session id too. Otherwise, when we have multiple HoS queries running on the same HS2 instance, we will have a hard time differentiating between the multiple Spark sessions with id="1". [~ngangam]: Any thoughts on this?

> HoS sessionId generation can cause race conditions when uploading files to HDFS
>
> Key: HIVE-20969
> URL: https://issues.apache.org/jira/browse/HIVE-20969
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Affects Versions: 4.0.0
> Reporter: Peter Vary
> Assignee: Peter Vary
> Priority: Major
>
> The observed exception is:
> {code}
> Caused by: java.io.FileNotFoundException: File does not exist: /tmp/hive/_spark_session_dir/0/hive-exec-2.1.1-SNAPSHOT.jar (inode 21140) [Lease. Holder: DFSClient_NONMAPREDUCE_304217459_39, pending creates: 1]
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2781)
> at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:599)
> at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:171)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2660)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:872)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:550)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> {code}
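The lease error above points at two sessions racing on the same numeric upload directory (/tmp/hive/_spark_session_dir/0). The following toy sketch shows why a bare per-process counter collides across HS2 restarts or instances, and how qualifying the path with a unique per-session component (a UUID here stands in for the Hive session id suggested in the comment) keeps the paths distinct. All names are invented; this is not Hive's actual code:

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration of counter-only vs. session-qualified scratch dirs. */
public class SparkSessionDirs {
  private static final AtomicInteger COUNTER = new AtomicInteger(0);

  /** Counter-only id: every fresh JVM starts again at 0, so two
   *  processes can both try to own ".../_spark_session_dir/0". */
  static String counterOnlyDir() {
    return "/tmp/hive/_spark_session_dir/" + COUNTER.getAndIncrement();
  }

  /** Id qualified with a unique session component: no cross-process
   *  collision, and logs can be correlated back to the Hive session. */
  static String qualifiedDir(String hiveSessionId) {
    return "/tmp/hive/_spark_session_dir/" + hiveSessionId + "/"
        + COUNTER.getAndIncrement();
  }

  public static void main(String[] args) {
    String hiveSessionId = UUID.randomUUID().toString();
    System.out.println(qualifiedDir(hiveSessionId));
  }
}
```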
[jira] [Commented] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs
[ https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698778#comment-16698778 ] Hive QA commented on HIVE-20760: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949463/HIVE-20760.8.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 15500 tests executed *Failed tests:* {noformat} TestCompactor - did not produce a TEST-*.xml file (likely timed out) (batchId=244) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] (batchId=182) org.apache.hadoop.hive.ql.TestTxnCommandsForMmTable.testOperationsOnCompletedTxnComponentsForMmTable (batchId=284) org.apache.hadoop.hive.ql.TestTxnCommandsForOrcMmTable.testOperationsOnCompletedTxnComponentsForMmTable (batchId=306) org.apache.hadoop.hive.ql.TestTxnConcatenate.testConcatenateMM (batchId=293) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15055/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15055/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15055/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12949463 - PreCommit-HIVE-Build

> Reducing memory overhead due to multiple HiveConfs
>
> Key: HIVE-20760
> URL: https://issues.apache.org/jira/browse/HIVE-20760
> Project: Hive
> Issue Type: Improvement
> Components: Configuration
> Reporter: Barnabas Maidics
> Assignee: Barnabas Maidics
> Priority: Major
> Attachments: HIVE-20760-1.patch, HIVE-20760-2.patch, HIVE-20760-3.patch, HIVE-20760.4.patch, HIVE-20760.5.patch, HIVE-20760.6.patch, HIVE-20760.7.patch, HIVE-20760.8.patch, HIVE-20760.patch, hiveconf_interned.html, hiveconf_original.html
>
> The issue is that every Hive task has to load its own version of {{HiveConf}}. When running with a large number of cores per executor (HoS), a significant (~10%) amount of memory is wasted due to this duplication.
> I looked into the problem and found a way to reduce the overhead caused by the multiple HiveConf objects.
> I've created an implementation of Properties, somewhat similar to CopyOnFirstWriteProperties. CopyOnFirstWriteProperties can't be used to solve this problem, because it drops the interned Properties right after we add a new property.
> So my implementation looks like this:
> * When we create a new HiveConf from an existing one (copy constructor), we change the properties object stored by HiveConf to the new Properties implementation (HiveConfProperties). We have two possible ways to do this: either change the visibility of the properties field in the ancestor class (Configuration, which comes from Hadoop) to protected, or, more simply, change the type using reflection.
> * HiveConfProperties instantly interns the given properties. After this, every time we add a new property to HiveConf, we add it to an additional Properties object. This way, if we create multiple HiveConfs with the same base properties, they share the same Properties object but each session/task can add its own unique properties.
> * Getting a property from HiveConfProperties looks like this (the non-interned properties are stored in the super class):
> {code}
> String property = super.getProperty(key);
> if (property == null) property = interned.getProperty(key);
> return property;
> {code}
> Running some tests showed that the interning works (with 50 connections to HiveServer2, heap dumps created after sessions are created for queries):
> Overall memory: original: 34,599K, interned: 20,582K
> Retained memory of HiveConfs: original: 16,366K, interned: 10,804K
> I attach the JXray reports about the heap dumps.
> What are your thoughts about this solution?
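The lookup logic quoted in the description can be sketched as a stand-alone Properties overlay: reads fall through from the instance's own (per-session) entries to a shared, interned base, while writes stay local. OverlayProperties is an invented name for illustration; Hive's actual class is HiveConfProperties and its details may differ:

```java
import java.util.Properties;

/**
 * Sketch of the shared-base-plus-overlay idea: many instances share
 * one "interned" base Properties object, so the base is stored once,
 * while each instance keeps only its own additions.
 */
public class OverlayProperties extends Properties {
  private final Properties interned; // shared base, never written here

  public OverlayProperties(Properties interned) {
    this.interned = interned;
  }

  @Override
  public String getProperty(String key) {
    // Local (per-session) value wins; otherwise fall back to the base.
    String property = super.getProperty(key);
    if (property == null) {
      property = interned.getProperty(key);
    }
    return property;
  }

  public static void main(String[] args) {
    Properties base = new Properties();
    base.setProperty("hive.execution.engine", "spark");

    OverlayProperties session = new OverlayProperties(base);
    session.setProperty("session.only.key", "true"); // stays local

    System.out.println(session.getProperty("hive.execution.engine"));
    System.out.println(session.getProperty("session.only.key"));
  }
}
```

Note that java.util.Properties itself supports a defaults chain via its Properties(Properties) constructor; the JIRA's custom class presumably exists because HiveConf must swap the overlay in behind Hadoop's Configuration, which this sketch does not attempt.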
[jira] [Assigned] (HIVE-20969) HoS sessionId generation can cause race conditions when uploading files to HDFS
[ https://issues.apache.org/jira/browse/HIVE-20969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary reassigned HIVE-20969: -
[jira] [Commented] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs
[ https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698693#comment-16698693 ] Hive QA commented on HIVE-20760: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 35s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 30s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s{color} | {color:red} common: The patch generated 3 new + 426 unchanged - 0 fixed = 429 total (was 426) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 38s{color} | {color:red} common generated 3 new + 65 unchanged - 0 fixed = 68 total (was 65) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 11m 5s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:common | | | org.apache.hadoop.hive.common.HiveConfProperties.clone() does not call super.clone() At HiveConfProperties.java: At HiveConfProperties.java:[line 260] | | | Inconsistent synchronization of org.apache.hadoop.hive.common.HiveConfProperties.interned; locked 70% of time Unsynchronized access at HiveConfProperties.java:70% of time Unsynchronized access at HiveConfProperties.java:[line 108] | | | org.apache.hadoop.hive.common.HiveConfProperties.getProperty(String, String) is unsynchronized, org.apache.hadoop.hive.common.HiveConfProperties.setProperty(String, String) is synchronized At HiveConfProperties.java:String) is synchronized At HiveConfProperties.java:[lines 123-130] | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15055/dev-support/hive-personality.sh | | git revision | master / 0fee288 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15055/yetus/diff-checkstyle-common.txt | | findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15055/yetus/new-findbugs-common.html | | modules | C: common U: common | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15055/yetus.txt | | Powered by | Apache Yetus http://yetus.apache.org | This message was automatically generated.
[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698682#comment-16698682 ] Hive QA commented on HIVE-20794: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12949455/HIVE-20794.05 {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15624 tests executed *Failed tests:* {noformat} TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=195) [druidmini_test_ts.q,druidmini_expressions.q,druid_timestamptz2.q,druidmini_test_alter.q,druidkafkamini_csv.q] org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] (batchId=171) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15054/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15054/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15054/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12949455 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery
[ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698670#comment-16698670 ] Hive QA commented on HIVE-20794: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 8s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 28s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 47s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 29s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 15s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 1m 5s{color} | {color:blue} standalone-metastore/metastore-server in master has 185 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 42s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 37s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 42s{color} | {color:blue} itests/util in master has 48 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 14s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 8s{color} | {color:green} The patch standalone-metastore passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 6s{color} | {color:green} The patch metastore-common passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} The patch common passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 6s{color} | {color:green} The patch metastore-server passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} ql: The patch generated 0 new + 17 unchanged - 4 fixed = 17 total (was 21) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} service: The 
patch generated 3 new + 35 unchanged - 0 fixed = 38 total (was 35) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} The patch hive-unit passed checkstyle {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} The patch util passed checkstyle {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 43s{color} | {color:red} service generated 1 new + 48 unchanged - 0 fixed = 49 total (was 48) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 9s{color} | {color:green} the patch
[jira] [Updated] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs
[ https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Barnabas Maidics updated HIVE-20760: Status: Open (was: Patch Available) > Reducing memory overhead due to multiple HiveConfs > -- > > Key: HIVE-20760 > URL: https://issues.apache.org/jira/browse/HIVE-20760 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Barnabas Maidics >Assignee: Barnabas Maidics >Priority: Major > Attachments: HIVE-20760-1.patch, HIVE-20760-2.patch, > HIVE-20760-3.patch, HIVE-20760.4.patch, HIVE-20760.5.patch, > HIVE-20760.6.patch, HIVE-20760.7.patch, HIVE-20760.8.patch, HIVE-20760.patch, > hiveconf_interned.html, hiveconf_original.html > > > The issue is that every Hive task has to load its own version of > {{HiveConf}}. When running with a large number of cores per executor (HoS), > there is a significant (~10%) amount of memory wasted due to this > duplication. > I looked into the problem and found a way to reduce the overhead caused by > the multiple HiveConf objects. > I've created an implementation of Properties, somewhat similar to > CopyOnFirstWriteProperties. CopyOnFirstWriteProperties can't be used to solve > this problem, because it drops the interned Properties right after we add a > new property. > So my implementation looks like this: > * When we create a new HiveConf from an existing one (copy constructor), we > change the properties object stored by HiveConf to the new Properties > implementation (HiveConfProperties). We have two possible ways to do this. > Either we change the visibility of the properties field in the ancestor class > (Configuration, which comes from Hadoop) to protected, or a simpler way is to > just change the type using reflection. > * HiveConfProperties instantly interns the given properties. After this, > every time we add a new property to HiveConf, we add it to an additional > Properties object. 
This way, if we create multiple HiveConf objects with the same base > properties, they will use the same Properties object but each session/task > can add its own unique properties. > * Getting a property from HiveConfProperties would look like this: (I stored > the non-interned properties in the super class) > String property = super.getProperty(key); > if (property == null) property = interned.getProperty(key); > return property; > Running some tests showed that the interning works (with 50 connections to > HiveServer2, heapdumps created after sessions are created for queries): > Overall memory: > original: 34,599K interned: 20,582K > Retained memory of HiveConfs: > original: 16,366K interned: 10,804K > I attached the JXray reports of the heapdumps. > What are your thoughts about this solution? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
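The lookup pattern described in the issue can be sketched as a small standalone class. This is an illustrative sketch, not Hive's actual HiveConfProperties: the class name `LayeredProperties` is hypothetical, and it models only the two-level lookup (per-instance overrides first, then a shared interned base), not the reflection trick used to swap it into Configuration.

```java
import java.util.Properties;

// Illustrative sketch of the layered lookup described above: per-instance
// writes land in this object (the superclass's own table), while reads fall
// back to a shared, interned base Properties common to all copies.
class LayeredProperties extends Properties {
    private final Properties interned; // shared base, never written by copies

    LayeredProperties(Properties interned) {
        this.interned = interned;
    }

    @Override
    public String getProperty(String key) {
        // Check the per-instance overrides first (stored in the superclass),
        // then fall back to the shared interned base.
        String property = super.getProperty(key);
        if (property == null) {
            property = interned.getProperty(key);
        }
        return property;
    }
}
```

With this shape, many copies constructed from the same base share one underlying Properties object, and each copy pays only for the keys it actually overrides.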
[jira] [Updated] (HIVE-20760) Reducing memory overhead due to multiple HiveConfs
[ https://issues.apache.org/jira/browse/HIVE-20760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Barnabas Maidics updated HIVE-20760: Attachment: HIVE-20760.8.patch Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated HIVE-20330: -- Status: In Progress (was: Patch Available) > HCatLoader cannot handle multiple InputJobInfo objects for a job with > multiple inputs > - > > Key: HIVE-20330 > URL: https://issues.apache.org/jira/browse/HIVE-20330 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: HIVE-20330.0.patch, HIVE-20330.1.patch, > HIVE-20330.2.patch > > > While running performance tests on Pig (0.12 and 0.17) we've observed a huge > performance drop in a workload that has multiple inputs from HCatLoader. > The reason is that for a particular MR job with multiple Hive tables as > input, Pig calls {{setLocation}} on each {{LoaderFunc (HCatLoader)}} instance > but only one table's information (InputJobInfo instance) gets tracked in the > JobConf. (This is under config key {{HCatConstants.HCAT_KEY_JOB_INFO}}). > Any such call overwrites preexisting values, and thus only the last table's > information will be considered when Pig calls {{getStatistics}} to calculate > and estimate required reducer count. > In cases when there are 2 input tables, 256GB and 1MB in size respectively, > Pig will query the size information from HCat for both of them, but it will > either see 1MB+1MB=2MB or 256GB+256GB=0.5TB depending on input order in the > execution plan's DAG. > It should of course see 256.00097GB in total and use 257 reducers by default > accordingly. > In unlucky cases this will be seen as 2MB and 1 reducer will have to struggle > with the actual 256.00097GB... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
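The fix direction implied above, accumulating every input's info instead of overwriting a single conf key, can be sketched as follows. Everything here is illustrative rather than HCatalog's actual API: `InputInfoRegistry` is a hypothetical name, the key string is made up, and a plain map stands in for the real `JobConf` / serialized `InputJobInfo` machinery. It shows only the append-then-sum behavior that makes the reducer estimate see all inputs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: each setLocation-style call appends its input's size
// under one list-valued key, rather than overwriting the previous value, so
// a later getStatistics-style call can sum the true size of all inputs.
class InputInfoRegistry {
    static final String KEY = "hcat.job.input.info.list"; // hypothetical key

    // Stand-in for a JobConf: a string-keyed map holding the accumulated list.
    private final Map<String, List<Long>> conf = new HashMap<>();

    // Called once per input table (analogous to setLocation): append, never overwrite.
    void addInputSize(long bytes) {
        conf.computeIfAbsent(KEY, k -> new ArrayList<>()).add(bytes);
    }

    // Called at planning time (analogous to getStatistics): sum over all inputs.
    long totalInputSize() {
        return conf.getOrDefault(KEY, new ArrayList<>())
                   .stream().mapToLong(Long::longValue).sum();
    }
}
```

With the 256GB + 1MB scenario from the description, this yields the combined size regardless of the order in which the inputs register, instead of double-counting one table and dropping the other.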
[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated HIVE-20330: -- Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated HIVE-20330: -- Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated HIVE-20330: -- Attachment: (was: HIVE-20330.2.patch)
[jira] [Updated] (HIVE-20330) HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs
[ https://issues.apache.org/jira/browse/HIVE-20330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated HIVE-20330: -- Attachment: HIVE-20330.2.patch