[jira] [Created] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries
Vineet Garg created HIVE-21279: -- Summary: Avoid moving/rename operation in FileSink op for SELECT queries Key: HIVE-21279 URL: https://issues.apache.org/jira/browse/HIVE-21279 Project: Hive Issue Type: Improvement Components: Query Planning Reporter: Vineet Garg Assignee: Vineet Garg Fix For: 4.0.0 Attachments: HIVE-21279.1.patch Currently, at the end of a job, the FileSink operator moves/renames the temp directory to another directory, from which FetchTask fetches the result. This is done to avoid fetching potentially partial/invalid files written by failed/runaway tasks. This operation is expensive on cloud storage. It could be avoided if FetchTask were passed the set of files to read instead of the whole directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
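The idea behind the improvement can be sketched as follows: instead of renaming the whole temp directory, the writer records which files were successfully committed, and the fetch step reads exactly that set, never seeing output from failed or runaway attempts. This is only an illustrative model with hypothetical names, not Hive's actual FileSink/FetchTask API:

```java
import java.util.*;

// Sketch: track committed output files so a fetch step can read exactly those
// files, skipping partial output from failed or speculative (runaway) tasks.
// All class and method names here are hypothetical, not Hive's actual API.
public class ManifestFetchSketch {
    // Files present in the task temp directory: some committed, some partial.
    static final Map<String, String> directory = new LinkedHashMap<>();
    // Manifest of files whose writing task committed successfully.
    static final Set<String> committed = new LinkedHashSet<>();

    static void writeFile(String name, String data, boolean taskCommitted) {
        directory.put(name, data);
        if (taskCommitted) {
            committed.add(name); // only committed attempts enter the manifest
        }
    }

    // Fetch reads only the manifest entries -- no directory rename needed.
    static List<String> fetchResults() {
        List<String> rows = new ArrayList<>();
        for (String name : committed) {
            rows.add(directory.get(name));
        }
        return rows;
    }

    public static void main(String[] args) {
        writeFile("attempt_0_part-0", "row1", true);
        writeFile("attempt_1_part-1", "partial", false); // failed attempt
        writeFile("attempt_2_part-1", "row2", true);     // retry succeeded
        System.out.println(fetchResults()); // -> [row1, row2]
    }
}
```

The rename-based approach gets the same isolation by moving only committed output into the final directory, but on object stores a rename is a copy, which is what this issue aims to avoid.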
[jira] [Created] (HIVE-21278) Ambiguity in grammar causes
Jesus Camacho Rodriguez created HIVE-21278: -- Summary: Ambiguity in grammar causes Key: HIVE-21278 URL: https://issues.apache.org/jira/browse/HIVE-21278 Project: Hive Issue Type: Bug Components: Parser Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez These are the warnings at compilation time: {code} warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5: Decision can match input such as "KW_CHECK KW_DATETIME" using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5: Decision can match input such as "KW_CHECK KW_DATE {LPAREN, StringLiteral}" using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5: Decision can match input such as "KW_CHECK KW_UNIONTYPE LESSTHAN" using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5: Decision can match input such as "KW_CHECK {KW_EXISTS, KW_TINYINT}" using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5: Decision can match input such as "KW_CHECK KW_STRUCT LESSTHAN" using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input warning(200): IdentifiersParser.g:424:5: Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 10 As a result, alternative(s) 10 were disabled for that input {code} This means that multiple parser rules can match certain query text, possibly leading to unexpected errors at parsing time.
Re: Null pointer exception on running compaction against an MM table
Aditya, Thanks for reporting this. Would you like to create a jira for this (https://issues.apache.org/jira/projects/HIVE)? Additionally, if you would like to work on a fix, I’m happy to help in reviewing. --Vaibhav From: Aditya Shah Date: Friday, February 15, 2019 at 2:05 AM To: "dev@hive.apache.org" Cc: Eugene Koifman , Vaibhav Gumashta , Gopal Vijayaraghavan Subject: Null pointer exception on running compaction against an MM table Hi, I was trying to run compaction on an MM table but got a null pointer exception while getting the HDFS session path. The error suggests that session state was not started for these queries. Am I missing something here? I think session state needs to be started for each of the queries (insert into temp table, etc.) run for compaction on HMS (I'm also doubtful about the statsupdater thread's queries). Some details are as follows: Env./Versions: Using Hive-3.1.1 (rel/release-3.1.1) Steps to reproduce: 1) Using beeline with HS2 and HMS 2) create an MM table 3) Insert a few values in the table 4) alter table mm_table compact 'major' and wait; Stack trace on HMS: compactor.Worker: Caught exception while trying to compact id:8,dbname:default,tableName:acid_mm_orc,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestWriteId:0. 
Marking failed to avoid repeated failures, java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH SERDEPROPERTIES ( 'serialization.format'='1')STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base' TBLPROPERTIES ('transactional'='false') at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:373) at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:241) at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH SERDEPROPERTIES ( 'serialization.format'='1')STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base' TBLPROPERTIES ('transactional'='false') at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:525) at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:365) ... 
2 more Caused by: java.lang.NullPointerException: Non-local session path expected to be non-null at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228) at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:815) at org.apache.hadoop.hive.ql.Context.(Context.java:309) at org.apache.hadoop.hive.ql.Context.(Context.java:295) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:591) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1684) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1807) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1556) at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:522) ... 3 more
Observations:
1) SessionState.start() initializes paths, hiveHist, etc.
2) SessionState is not started in setupSessionState() in runMMCompaction(). (There is also a comment by Sergey in the code regarding the same.)
3) Even after making it start the session state, it further fails when running a TezTask for the insert overwrite of the temp table with the contents of the original table.
4) The cause of 3) is that the Tez session state cannot initialize: an IllegalArgumentException is thrown while setting up the caller context in the Tez task, because the caller id is empty.
5) The reason for 4) is that queryid is an empty string for such queries.
6) A possible solution for 5): build the QueryState with a queryid in runOnDriver() in DriverUtils.java.
Do let me know if you need more information. Thanks and Regards, Aditya Shah 5th Year M.Sc.(Hons.) Mathematics & B.E.(Hons.) Computer Science and Engineering Birla Institute of Technology & Science, Pilani Vidhya Vihar Pilani 333 031(Raj.),
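The failure mode in observation 1)/2) can be modeled in a few lines: a background compactor thread runs a query without first starting a session, so the session-path lookup hits a Preconditions-style null check. This is a simplified stand-in with illustrative names, not Hive's actual SessionState class:

```java
import java.util.Objects;

// Simplified model of the reported NPE: background compactor threads run
// queries without a started session, so the HDFS session path lookup fails
// a non-null precondition. Names are illustrative, not Hive's actual API.
public class SessionSketch {
    // Per-thread session path, normally set when the session is started.
    static final ThreadLocal<String> sessionPath = new ThreadLocal<>();

    static void startSession(String path) { sessionPath.set(path); }

    static String getHdfsSessionPath() {
        // Mirrors Preconditions.checkNotNull(...) in SessionState.
        return Objects.requireNonNull(sessionPath.get(),
            "Non-local session path expected to be non-null");
    }

    public static void main(String[] args) {
        try {
            getHdfsSessionPath();           // compactor thread: session never started
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        }
        startSession("/tmp/hive/session1"); // proposed fix: start the session first
        System.out.println(getHdfsSessionPath());
    }
}
```

This matches the suggestion in the observations: starting the session state (and, per 5)/6), giving the compactor-driven queries a proper queryid) before driving queries from the HMS side.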
[jira] [Created] (HIVE-21277) Make HBaseSerde First-Class SerDe
BELUGA BEHR created HIVE-21277: -- Summary: Make HBaseSerde First-Class SerDe Key: HIVE-21277 URL: https://issues.apache.org/jira/browse/HIVE-21277 Project: Hive Issue Type: New Feature Components: HBase Handler Reporter: BELUGA BEHR Make HBase integration with Hive first class. {code:sql} CREATE TABLE...STORED AS HBASE; {code} https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
[GitHub] pvary commented on a change in pull request #538: HIVE-21217: Optimize range calculation for PTF
pvary commented on a change in pull request #538: HIVE-21217: Optimize range calculation for PTF URL: https://github.com/apache/hive/pull/538#discussion_r257274581 ## File path: ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/ValueBoundaryScanner.java ## @@ -44,10 +49,207 @@ public ValueBoundaryScanner(BoundaryDef start, BoundaryDef end, boolean nullsLas this.nullsLast = nullsLast; } + public abstract Object computeValue(Object row) throws HiveException; + + /** + * Checks if the distance of v2 to v1 is greater than the given amt. + * @return True if the value of v1 - v2 is greater than amt or either value is null. + */ + public abstract boolean isDistanceGreater(Object v1, Object v2, int amt); + + /** + * Checks if the values of v1 or v2 are the same. + * @return True if both values are the same or both are nulls. + */ + public abstract boolean isEqual(Object v1, Object v2); + public abstract int computeStart(int rowIdx, PTFPartition p) throws HiveException; public abstract int computeEnd(int rowIdx, PTFPartition p) throws HiveException; + /** + * Checks and maintains cache content - optimizes cache window to always be around current row + * thereby makes it follow the current progress. 
+ * @param rowIdx current row + * @param p current partition for the PTF operator + * @throws HiveException + */ + public void handleCache(int rowIdx, PTFPartition p) throws HiveException { +BoundaryCache cache = p.getBoundaryCache(); +if (cache == null) { + return; +} + +//Start of partition +if (rowIdx == 0) { + cache.clear(); +} +if (cache.isComplete()) { + return; +} + +int cachePos = cache.approxCachePositionOf(rowIdx); + +if (cache.isEmpty()) { + fillCacheUntilEndOrFull(rowIdx, p); +} else if (cachePos > 50 && cachePos <= 75) { + if (!start.isPreceding() && end.isFollowing()) { +cache.evictHalf(); +fillCacheUntilEndOrFull(rowIdx, p); + } +} else if (cachePos > 75 && cachePos <= 95) { + if (start.isPreceding() && end.isFollowing()) { +cache.evictHalf(); +fillCacheUntilEndOrFull(rowIdx, p); + } +} else if (cachePos >= 95) { + if (start.isPreceding() && !end.isFollowing()) { +cache.evictHalf(); +fillCacheUntilEndOrFull(rowIdx, p); + } + +} + } + + /** + * Inserts values into cache starting from rowIdx in the current partition p. Stops if cache + * reaches its maximum size or we get out of rows in p. + * @param rowIdx + * @param p + * @throws HiveException + */ + private void fillCacheUntilEndOrFull(int rowIdx, PTFPartition p) throws HiveException { +BoundaryCache cache = p.getBoundaryCache(); +if (cache == null || p.size() <= 0) { + return; +} + +//If we continue building cache +Map.Entry ceilingEntry = cache.getMaxEntry(); +if (ceilingEntry != null) { + rowIdx = ceilingEntry.getKey(); +} + +Object rowVal = null; +Object lastRowVal = null; + +while (rowIdx < p.size()) { + rowVal = computeValue(p.getAt(rowIdx)); + if (!isEqual(rowVal, lastRowVal)){ +if (!cache.putIfNotFull(rowIdx, rowVal)){ Review comment: Should not we detect that the cache is full before continuing to read? Do we end up reading the lines for rowVal twice? This is an automated message from the Apache Git Service. 
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] pvary commented on a change in pull request #538: HIVE-21217: Optimize range calculation for PTF
pvary commented on a change in pull request #538: HIVE-21217: Optimize range calculation for PTF URL: https://github.com/apache/hive/pull/538#discussion_r257280122 ## File path: ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/ValueBoundaryScanner.java ## @@ -44,10 +49,207 @@ public ValueBoundaryScanner(BoundaryDef start, BoundaryDef end, boolean nullsLas this.nullsLast = nullsLast; } + public abstract Object computeValue(Object row) throws HiveException; + + /** + * Checks if the distance of v2 to v1 is greater than the given amt. + * @return True if the value of v1 - v2 is greater than amt or either value is null. + */ + public abstract boolean isDistanceGreater(Object v1, Object v2, int amt); + + /** + * Checks if the values of v1 or v2 are the same. + * @return True if both values are the same or both are nulls. + */ + public abstract boolean isEqual(Object v1, Object v2); + public abstract int computeStart(int rowIdx, PTFPartition p) throws HiveException; public abstract int computeEnd(int rowIdx, PTFPartition p) throws HiveException; + /** + * Checks and maintains cache content - optimizes cache window to always be around current row + * thereby makes it follow the current progress. 
+ * @param rowIdx current row + * @param p current partition for the PTF operator + * @throws HiveException + */ + public void handleCache(int rowIdx, PTFPartition p) throws HiveException { +BoundaryCache cache = p.getBoundaryCache(); +if (cache == null) { + return; +} + +//Start of partition +if (rowIdx == 0) { + cache.clear(); +} +if (cache.isComplete()) { + return; +} + +int cachePos = cache.approxCachePositionOf(rowIdx); + +if (cache.isEmpty()) { + fillCacheUntilEndOrFull(rowIdx, p); +} else if (cachePos > 50 && cachePos <= 75) { + if (!start.isPreceding() && end.isFollowing()) { +cache.evictHalf(); +fillCacheUntilEndOrFull(rowIdx, p); + } +} else if (cachePos > 75 && cachePos <= 95) { + if (start.isPreceding() && end.isFollowing()) { +cache.evictHalf(); +fillCacheUntilEndOrFull(rowIdx, p); + } +} else if (cachePos >= 95) { + if (start.isPreceding() && !end.isFollowing()) { +cache.evictHalf(); +fillCacheUntilEndOrFull(rowIdx, p); + } + +} + } + + /** + * Inserts values into cache starting from rowIdx in the current partition p. Stops if cache + * reaches its maximum size or we get out of rows in p. + * @param rowIdx + * @param p + * @throws HiveException + */ + private void fillCacheUntilEndOrFull(int rowIdx, PTFPartition p) throws HiveException { +BoundaryCache cache = p.getBoundaryCache(); +if (cache == null || p.size() <= 0) { + return; +} + +//If we continue building cache +Map.Entry ceilingEntry = cache.getMaxEntry(); +if (ceilingEntry != null) { + rowIdx = ceilingEntry.getKey(); +} + +Object rowVal = null; +Object lastRowVal = null; + +while (rowIdx < p.size()) { + rowVal = computeValue(p.getAt(rowIdx)); + if (!isEqual(rowVal, lastRowVal)){ +if (!cache.putIfNotFull(rowIdx, rowVal)){ + break; +} + } + lastRowVal = rowVal; + ++rowIdx; + +} +//Signaling end of all rows in a partition +if (cache.putIfNotFull(rowIdx, null)) { + cache.setComplete(true); +} + } + + /** + * Uses cache content to jump backwards if possible. If not, it steps one back. 
+ * @param r + * @param p + * @return pair of (row we stepped/jumped onto ; row value at this position) + * @throws HiveException + */ + protected Pair skipOrStepBack(int r, PTFPartition p) + throws HiveException { +Object rowVal = null; +BoundaryCache cache = p.getBoundaryCache(); + +Map.Entry floorEntry = null; +Map.Entry ceilingEntry = null; + +if (cache != null) { + floorEntry = cache.floorEntry(r); + ceilingEntry = cache.ceilingEntry(r); +} + +if (floorEntry != null && ceilingEntry != null) { + r = floorEntry.getKey() - 1; + floorEntry = cache.floorEntry(r); + if (floorEntry != null) { +rowVal = floorEntry.getValue(); + } else if (r >= 0){ +rowVal = computeValue(p.getAt(r)); + } +} else { + r--; + if (r >= 0) { +rowVal = computeValue(p.getAt(r)); + } +} +return new ImmutablePair<>(r, rowVal); + } + + /** + * Uses cache content to jump forward if possible. If not, it steps one forward. + * @param r + * @param p + * @return pair of (row we stepped/jumped onto ; row value at this position) + * @throws HiveException + */ + protected Pair skipOrStepForward(int r, PTFPartition p) + throws HiveException { +Object rowVal = null; +BoundaryCache cache = p.getBoundaryCache(); + +Map.Entry floorEntry = null; +Map.Entry ceilingEntry = null;
[GitHub] pvary commented on a change in pull request #538: HIVE-21217: Optimize range calculation for PTF
pvary commented on a change in pull request #538: HIVE-21217: Optimize range calculation for PTF URL: https://github.com/apache/hive/pull/538#discussion_r257278647 ## File path: ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/ValueBoundaryScanner.java ## @@ -44,10 +49,207 @@ public ValueBoundaryScanner(BoundaryDef start, BoundaryDef end, boolean nullsLas this.nullsLast = nullsLast; } + public abstract Object computeValue(Object row) throws HiveException; + + /** + * Checks if the distance of v2 to v1 is greater than the given amt. + * @return True if the value of v1 - v2 is greater than amt or either value is null. + */ + public abstract boolean isDistanceGreater(Object v1, Object v2, int amt); + + /** + * Checks if the values of v1 or v2 are the same. + * @return True if both values are the same or both are nulls. + */ + public abstract boolean isEqual(Object v1, Object v2); + public abstract int computeStart(int rowIdx, PTFPartition p) throws HiveException; public abstract int computeEnd(int rowIdx, PTFPartition p) throws HiveException; + /** + * Checks and maintains cache content - optimizes cache window to always be around current row + * thereby makes it follow the current progress. + * @param rowIdx current row + * @param p current partition for the PTF operator + * @throws HiveException + */ + public void handleCache(int rowIdx, PTFPartition p) throws HiveException { +BoundaryCache cache = p.getBoundaryCache(); +if (cache == null) { + return; +} + +//Start of partition +if (rowIdx == 0) { + cache.clear(); +} +if (cache.isComplete()) { + return; +} + +int cachePos = cache.approxCachePositionOf(rowIdx); + +if (cache.isEmpty()) { + fillCacheUntilEndOrFull(rowIdx, p); +} else if (cachePos > 50 && cachePos <= 75) { Review comment: This is strange to me. Do we know the size of the window in advance? Shall we not use our cache accordingly? If the window size is 5 then we should cache the 7 values (1 before, 1 after, and the 5 values)? 
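The reviewer's double-read concern about the fill loop can be seen in a minimal model: the value is computed before the cache's full check, so when putIfNotFull rejects it, that computation is wasted and repeated on the next fill pass. This is a simplified stand-in, not the actual BoundaryCache:

```java
import java.util.*;

// Minimal model of the fill loop under review: values are computed first and
// only then offered to a bounded cache, so when the cache is full the last
// computed value is discarded -- and must be recomputed on the next fill.
// Class and method names are simplified stand-ins, not Hive's BoundaryCache.
public class FillLoopSketch {
    static int computeCalls = 0;

    // Stand-in for ValueBoundaryScanner.computeValue(row).
    static int computeValue(int row) { computeCalls++; return row * 10; }

    public static void main(String[] args) {
        final int maxSize = 2;
        TreeMap<Integer, Integer> cache = new TreeMap<>();
        int rowIdx = 0;
        while (rowIdx < 5) {
            int rowVal = computeValue(rowIdx);  // computed before the full check
            if (cache.size() >= maxSize) {
                break;                          // value for rowIdx is thrown away
            }
            cache.put(rowIdx, rowVal);
            rowIdx++;
        }
        // Three computations for a cache that can hold only two entries:
        System.out.println(computeCalls + " computations, cache=" + cache);
    }
}
```

Checking fullness before calling computeValue would avoid the extra computation, which appears to be the point of the review question.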
[jira] [Created] (HIVE-21276) Requesting LISA access
Raquel created HIVE-21276: - Summary: Requesting LISA access Key: HIVE-21276 URL: https://issues.apache.org/jira/browse/HIVE-21276 Project: Hive Issue Type: Test Components: Authentication Affects Versions: 2.3.4 Reporter: Raquel
[jira] [Created] (HIVE-21275) Lower Logging Level in Operator Class
BELUGA BEHR created HIVE-21275: -- Summary: Lower Logging Level in Operator Class Key: HIVE-21275 URL: https://issues.apache.org/jira/browse/HIVE-21275 Project: Hive Issue Type: Improvement Affects Versions: 4.0.0, 3.2.0 Reporter: BELUGA BEHR There is an incredible amount of logging generated by the {{Operator}} class during the Q-Tests. I counted more than 1 *million* lines of pretty useless logging. Please lower it to TRACE level.
[GitHub] szlta opened a new pull request #538: HIVE-21217: Optimize range calculation for PTF
szlta opened a new pull request #538: HIVE-21217: Optimize range calculation for PTF URL: https://github.com/apache/hive/pull/538 @pvary
[GitHub] ashutosh-bapat commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events
ashutosh-bapat commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events URL: https://github.com/apache/hive/pull/537#discussion_r257214084 ## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java ## @@ -555,6 +555,9 @@ public static ConfVars getMetaConf(String name) { EVENT_DB_LISTENER_TTL("metastore.event.db.listener.timetolive", "hive.metastore.event.db.listener.timetolive", 86400, TimeUnit.SECONDS, "time after which events will be removed from the database listener queue"), +CLEAN_MAX_EVENTS("metastore.event.db.clean.maxevents", +"hive.metastore.event.db.clean.maxevents", 1, +"Limit on number events to be cleaned at a time in metastore cleanNotificationEvents call, to avoid OOM"), Review comment: Done.
[GitHub] ashutosh-bapat commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events
ashutosh-bapat commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events URL: https://github.com/apache/hive/pull/537#discussion_r257211901 ## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java ## @@ -555,6 +555,9 @@ public static ConfVars getMetaConf(String name) { EVENT_DB_LISTENER_TTL("metastore.event.db.listener.timetolive", "hive.metastore.event.db.listener.timetolive", 86400, TimeUnit.SECONDS, "time after which events will be removed from the database listener queue"), +CLEAN_MAX_EVENTS("metastore.event.db.clean.maxevents", Review comment: Done.
[GitHub] ashutosh-bapat commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events
ashutosh-bapat commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events URL: https://github.com/apache/hive/pull/537#discussion_r257211237 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -1142,10 +1142,8 @@ public void run() { } catch (Exception ex) { //catching exceptions here makes sure that the thread doesn't die in case of unexpected //exceptions - LOG.warn( - "Exception received while cleaning notifications. More details can be found in debug mode" - + ex.getMessage()); - LOG.debug(ex.getMessage(), ex); Review comment: There is no point in logging a warning with just the exception message and asking the user to turn on debug level to investigate further. An exception has happened, and if we are giving a warning, we should log the exception along with it instead of in a separate debug message. WARN has higher priority than DEBUG. It will make the logs slightly more verbose, but more useful in production environments, where changing the log level isn't easy.
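The pattern being argued for here is a single WARN-level call that carries the throwable, so the message and the full stack trace land in one record instead of being split across WARN and DEBUG. Hive itself uses SLF4J, where this is LOG.warn(msg, ex); the sketch below uses java.util.logging only to stay self-contained:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the review point: one WARN-level call carrying the exception
// gives the message and the stack trace together in a single log record.
// Hive uses SLF4J (LOG.warn(msg, ex)); java.util.logging is used here only
// so the example runs without extra dependencies.
public class WarnWithThrowable {
    static final Logger LOG = Logger.getLogger("notification.cleaner");

    public static void main(String[] args) {
        try {
            throw new IllegalStateException("cleaner failed");
        } catch (Exception ex) {
            // One call: message and throwable travel together at WARN level,
            // instead of a message-only WARN plus a separate DEBUG trace.
            LOG.log(Level.WARNING, "Exception received while cleaning notifications", ex);
        }
    }
}
```

The trade-off discussed in the thread is exactly this: slightly noisier logs versus a stack trace that is available without reconfiguring log levels in production.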
[GitHub] sankarh commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events
sankarh commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events URL: https://github.com/apache/hive/pull/537#discussion_r257201191 ## File path: hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java ## @@ -1142,10 +1142,8 @@ public void run() { } catch (Exception ex) { //catching exceptions here makes sure that the thread doesn't die in case of unexpected //exceptions - LOG.warn( - "Exception received while cleaning notifications. More details can be found in debug mode" - + ex.getMessage()); - LOG.debug(ex.getMessage(), ex); Review comment: Why is this debug log removed? Not sure if it's there for any security reason.
[GitHub] sankarh commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events
sankarh commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events URL: https://github.com/apache/hive/pull/537#discussion_r257201879 ## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java ## @@ -555,6 +555,9 @@ public static ConfVars getMetaConf(String name) { EVENT_DB_LISTENER_TTL("metastore.event.db.listener.timetolive", "hive.metastore.event.db.listener.timetolive", 86400, TimeUnit.SECONDS, "time after which events will be removed from the database listener queue"), +CLEAN_MAX_EVENTS("metastore.event.db.clean.maxevents", Review comment: EVENT_ prefix is used for other related configs. Can use the same here as well.
[GitHub] sankarh commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events
sankarh commented on a change in pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events URL: https://github.com/apache/hive/pull/537#discussion_r257202597 ## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java ## @@ -555,6 +555,9 @@ public static ConfVars getMetaConf(String name) { EVENT_DB_LISTENER_TTL("metastore.event.db.listener.timetolive", "hive.metastore.event.db.listener.timetolive", 86400, TimeUnit.SECONDS, "time after which events will be removed from the database listener queue"), +CLEAN_MAX_EVENTS("metastore.event.db.clean.maxevents", +"hive.metastore.event.db.clean.maxevents", 1, +"Limit on number events to be cleaned at a time in metastore cleanNotificationEvents call, to avoid OOM"), Review comment: Also, we should mention that if 0 or a negative value is passed, all expired events will be cleaned.
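For an operator tuning this batch size, an override in hive-site.xml on the metastore host could look like the fragment below. The property name is taken from this patch revision (the reviewer suggests renaming it with the EVENT_ prefix convention, so the final name may differ), and the value shown is purely illustrative:

```xml
<!-- Hypothetical override; property name as it appears in this patch
     revision, which may change per the review feedback. -->
<property>
  <name>metastore.event.db.clean.maxevents</name>
  <!-- Clean up to 10000 expired notification events per cleaner pass.
       Per the review discussion, 0 or a negative value would mean
       "clean all expired events in one pass". -->
  <value>10000</value>
</property>
```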
[GitHub] sankarh commented on a change in pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue
sankarh commented on a change in pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue URL: https://github.com/apache/hive/pull/536#discussion_r257177860 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/WarehouseInstance.java ## @@ -154,6 +154,32 @@ private void initialize(String cmRoot, String externalTableWarehouseRoot, String MetaStoreTestUtils.startMetaStoreWithRetry(hiveConf, true); +// Add the below mentioned dependency in metastore/pom.xml file. For postgres need to copy postgresql-42.2.1.jar to +// .m2//repository/postgresql/postgresql/9.3-1102.jdbc41/postgresql-9.3-1102.jdbc41.jar. +/* + + mysql + mysql-connector-java + 8.0.15 + + + Review comment: Remove these irrelevant comments.
[GitHub] sankarh commented on a change in pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue
sankarh commented on a change in pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue URL: https://github.com/apache/hive/pull/536#discussion_r257195513 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSpec.java ## @@ -419,4 +419,11 @@ public boolean isMigratingToExternalTable() { public void setMigratingToExternalTable() { isMigratingToExternalTable = true; } + + public static void setReplId(Map srcParameter, Map destParameter) { Review comment: Can change name to copyLastReplId.
[GitHub] ashutosh-bapat opened a new pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events
ashutosh-bapat opened a new pull request #537: HIVE-19430 : ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events URL: https://github.com/apache/hive/pull/537 See more details about the fix in HIVE-19430.
[GitHub] ashutosh-bapat closed pull request #447: HIVE-20708: Load an external table as an external table on target with the same location as on the source
ashutosh-bapat closed pull request #447: HIVE-20708: Load an external table as an external table on target with the same location as on the source URL: https://github.com/apache/hive/pull/447
[GitHub] sankarh commented on a change in pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue
sankarh commented on a change in pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue URL: https://github.com/apache/hive/pull/536#discussion_r257178408 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSpec.java ## @@ -419,4 +419,11 @@ public boolean isMigratingToExternalTable() { public void setMigratingToExternalTable() { isMigratingToExternalTable = true; } + + public static void setReplId(Map srcParameter, Map destParameter) { Review comment: This fix is irrelevant to the jira description. Can we move it to another jira, with the scenario described?
[GitHub] rmsmani commented on issue #532: HIVE-685 GenericUDFQuote
rmsmani commented on issue #532: HIVE-685 GenericUDFQuote URL: https://github.com/apache/hive/pull/532#issuecomment-463991962 Code was merged, as noted in the JIRA ticket, so closing.
[GitHub] rmsmani closed pull request #532: HIVE-685 GenericUDFQuote
rmsmani closed pull request #532: HIVE-685 GenericUDFQuote URL: https://github.com/apache/hive/pull/532
[GitHub] sankarh closed pull request #535: HIVE-21269: Mandate -update and -delete as DistCp options to sync data files for external tables replication.
sankarh closed pull request #535: HIVE-21269: Mandate -update and -delete as DistCp options to sync data files for external tables replication. URL: https://github.com/apache/hive/pull/535
[GitHub] sankarh closed pull request #533: HIVE-21261: Incremental REPL LOAD adds redundant COPY and MOVE tasks for external table events.
sankarh closed pull request #533: HIVE-21261: Incremental REPL LOAD adds redundant COPY and MOVE tasks for external table events. URL: https://github.com/apache/hive/pull/533
[GitHub] sankarh commented on a change in pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue
sankarh commented on a change in pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue URL: https://github.com/apache/hive/pull/536#discussion_r257177860 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/WarehouseInstance.java ## @@ -154,6 +154,32 @@ private void initialize(String cmRoot, String externalTableWarehouseRoot, String MetaStoreTestUtils.startMetaStoreWithRetry(hiveConf, true); +// Add the below mentioned dependency in metastore/pom.xml file. For postgres need to copy postgresql-42.2.1.jar to +// .m2//repository/postgresql/postgresql/9.3-1102.jdbc41/postgresql-9.3-1102.jdbc41.jar. +/* + + mysql + mysql-connector-java + 8.0.15 + + + Review comment: Remove these irrelevant comments.
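The dependency coordinates in the quoted diff above ("mysql mysql-connector-java 8.0.15") lost their XML tags in the mail archive. Assuming a standard Maven dependency declaration, the stripped snippet most likely read:

```xml
<!-- Reconstructed from the stripped coordinates; the exact original
     formatting in the PR comment is an assumption. -->
<dependency>
  <groupId>mysql</groupId>
  <artifactId>mysql-connector-java</artifactId>
  <version>8.0.15</version>
</dependency>
```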
[GitHub] maheshk114 opened a new pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue
maheshk114 opened a new pull request #536: HIVE-21260 : Hive 3 (onprem) -> 4(onprem): Hive replication failed due to postgres sql execution issue URL: https://github.com/apache/hive/pull/536 …
[jira] [Created] (HIVE-21274) Geode adapter: support boolean columns in filter predicates
Zoltan Haindrich created HIVE-21274: --- Summary: Geode adapter: support boolean columns in filter predicates Key: HIVE-21274 URL: https://issues.apache.org/jira/browse/HIVE-21274 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich The Geode adapter currently only supports filtering by comparisons. After CALCITE-2839 the simplifier may remove the comparison when the operands are booleans: {{x=true => x}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
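The rewrite the issue describes can be illustrated with a toy sketch. This is not the Calcite API, just a string-based stand-in showing the kind of simplification CALCITE-2839 performs: a boolean equality against a literal collapses to the bare column reference (or its negation):

```java
// Toy illustration (not Calcite): simplify "col = true" to "col" and
// "col = false" to "NOT col", mirroring the boolean-comparison rewrite
// that leaves the Geode adapter without a comparison to push down.
public class BooleanSimplify {
    public static String simplify(String predicate) {
        if (predicate.endsWith(" = true")) {
            return predicate.substring(0, predicate.length() - " = true".length());
        }
        if (predicate.endsWith(" = false")) {
            return "NOT " + predicate.substring(0, predicate.length() - " = false".length());
        }
        return predicate; // non-boolean comparisons pass through unchanged
    }
}
```

For example, `simplify("x = true")` yields `"x"`, which is exactly the `{{x=true => x}}` case from the issue: after simplification the filter is a plain boolean column, not a comparison.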
[GitHub] rmsmani edited a comment on issue #513: fix null string check for C++ using thrift api
rmsmani edited a comment on issue #513: fix null string check for C++ using thrift api URL: https://github.com/apache/hive/pull/513#issuecomment-463971593 What's the JIRA number for this? If a JIRA ticket is not available, create one at the URL below (under the HIVE project): https://issues.apache.org/jira/projects/HIVE Create the patch as described in the documentation https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-UnderstandingHiveBranches so that pre-commit testing will be done automatically.
[GitHub] rmsmani commented on issue #504: fix the UGI problem when reading ORC files
rmsmani commented on issue #504: fix the UGI problem when reading ORC files URL: https://github.com/apache/hive/pull/504#issuecomment-463974347 What's the JIRA number for this? If a JIRA ticket is not available, create one at the URL below (under the HIVE project): https://issues.apache.org/jira/projects/HIVE Create the patch as described in the documentation https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-UnderstandingHiveBranches so that pre-commit testing will be done automatically.
[GitHub] rmsmani commented on issue #519: HIVE-21141: Fix some spell errors in Hive
rmsmani commented on issue #519: HIVE-21141: Fix some spell errors in Hive URL: https://github.com/apache/hive/pull/519#issuecomment-463972261 Details are given in the JIRA ticket.
[GitHub] rmsmani commented on issue #510: HIVE-21072: Fix NPE when running partitioned CTAS statements
rmsmani commented on issue #510: HIVE-21072: Fix NPE when running partitioned CTAS statements URL: https://github.com/apache/hive/pull/510#issuecomment-463972341 Create the git patch file and attach it to the JIRA ticket, so that pre-commit testing will be done automatically.
[GitHub] rmsmani commented on issue #513: fix null string check for C++ using thrift api
rmsmani commented on issue #513: fix null string check for C++ using thrift api URL: https://github.com/apache/hive/pull/513#issuecomment-463971593 What is the JIRA ticket number for this?
[jira] [Created] (HIVE-21273) Your project apache/hive is using buggy third-party libraries [WARNING]
Kaifeng Huang created HIVE-21273: Summary: Your project apache/hive is using buggy third-party libraries [WARNING] Key: HIVE-21273 URL: https://issues.apache.org/jira/browse/HIVE-21273 Project: Hive Issue Type: Bug Reporter: Kaifeng Huang Hi, there! We are a research team working on third-party library analysis. We have found that some widely-used third-party libraries in your project have major/critical bugs, which will degrade the quality of your project. We highly recommend that you update those libraries to new versions. We have listed the buggy third-party libraries and the corresponding JIRA issue links below for more detailed information. 1. org.apache.httpcomponents httpclient (pom.xml) version: 4.5.2 JIRA issues: org.apache.http.impl.client.AbstractHttpClient#createClientConnectionManager Does not account for context class loader affectsVersions:4.4.1;4.5;4.5.1;4.5.2 https://issues.apache.org/jira/projects/HTTPCLIENT/issues/HTTPCLIENT-1727?filter=allopenissues Memory Leak in OSGi support affectsVersions:4.4.1;4.5.2 https://issues.apache.org/jira/projects/HTTPCLIENT/issues/HTTPCLIENT-1749?filter=allopenissues SystemDefaultRoutePlanner: Possible null pointer dereference affectsVersions:4.5.2 https://issues.apache.org/jira/projects/HTTPCLIENT/issues/HTTPCLIENT-1766?filter=allopenissues Null pointer dereference in EofSensorInputStream and ResponseEntityProxy affectsVersions:4.5.2 https://issues.apache.org/jira/projects/HTTPCLIENT/issues/HTTPCLIENT-1767?filter=allopenissues [OSGi] WeakList needs to support "clear" method affectsVersions:4.5.2;5.0 Alpha1 https://issues.apache.org/jira/projects/HTTPCLIENT/issues/HTTPCLIENT-1772?filter=allopenissues [OSGi] HttpProxyConfigurationActivator does not unregister HttpClientBuilderFactory affectsVersions:4.5.2 https://issues.apache.org/jira/projects/HTTPCLIENT/issues/HTTPCLIENT-1773?filter=allopenissues Why is Retry around Redirect and not the other way round affectsVersions:4.5.2 
https://issues.apache.org/jira/projects/HTTPCLIENT/issues/HTTPCLIENT-1800?filter=allopenissues 2. commons-cli commons-cli(pom.xml,testutils/ptest2/pom.xml,upgrade-acid/pre-upgrade/pom.xml) version: 1.2 Jira issues: Unable to select a pure long option in a group affectsVersions:1.0;1.1;1.2 https://issues.apache.org/jira/projects/CLI/issues/CLI-182?filter=allopenissues Clear the selection from the groups before parsing affectsVersions:1.0;1.1;1.2 https://issues.apache.org/jira/projects/CLI/issues/CLI-183?filter=allopenissues Commons CLI incorrectly stripping leading and trailing quotes affectsVersions:1.1;1.2 https://issues.apache.org/jira/projects/CLI/issues/CLI-185?filter=allopenissues Coding error: OptionGroup.setSelected causes java.lang.NullPointerException affectsVersions:1.2 https://issues.apache.org/jira/projects/CLI/issues/CLI-191?filter=allopenissues StringIndexOutOfBoundsException in HelpFormatter.findWrapPos affectsVersions:1.2 https://issues.apache.org/jira/projects/CLI/issues/CLI-193?filter=allopenissues HelpFormatter strips leading whitespaces in the footer affectsVersions:1.2 https://issues.apache.org/jira/projects/CLI/issues/CLI-207?filter=allopenissues OptionBuilder only has static methods; yet many return an OptionBuilder instance affectsVersions:1.2 https://issues.apache.org/jira/projects/CLI/issues/CLI-224?filter=allopenissues Unable to properly require options affectsVersions:1.2 https://issues.apache.org/jira/projects/CLI/issues/CLI-230?filter=allopenissues OptionValidator Implementation Does Not Agree With JavaDoc affectsVersions:1.2 https://issues.apache.org/jira/projects/CLI/issues/CLI-241?filter=allopenissues 3. 
commons-io commons-io(pom.xml) version: 2.4 Jira issues: IOUtils copyLarge() and skip() methods are performance hogs affectsVersions:2.3;2.4 https://issues.apache.org/jira/projects/IO/issues/IO-355?filter=allopenissues CharSequenceInputStream#reset() behaves incorrectly in case when buffer size is not dividable by data size affectsVersions:2.4 https://issues.apache.org/jira/projects/IO/issues/IO-356?filter=allopenissues [Tailer] InterruptedException while the thead is sleeping is silently ignored affectsVersions:2.4 https://issues.apache.org/jira/projects/IO/issues/IO-357?filter=allopenissues IOUtils.contentEquals* methods returns false if input1 ==