[jira] [Commented] (HIVE-11266) count(*) wrong result based on table statistics
[ https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900854#comment-15900854 ]

Tristan Stevens commented on HIVE-11266:
----------------------------------------

If Hive is still serving results directly from the stats, then with external tables it cannot guarantee their accuracy.

> count(*) wrong result based on table statistics
> -----------------------------------------------
>
>                 Key: HIVE-11266
>                 URL: https://issues.apache.org/jira/browse/HIVE-11266
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Simone Battaglia
>            Assignee: Pengcheng Xiong
>            Priority: Critical
>
> Hive returns a wrong count result on an external table with table statistics
> if I change the table data files.
> This is the scenario in detail:
> 1) create external table my_table (...) location 'my_location';
> 2) analyze table my_table compute statistics;
> 3) change/add/delete one or more files in the 'my_location' directory;
> 4) select count(*) from my_table;
> In this case the count query doesn't generate an MR job and returns a result
> based on table statistics. This result is wrong because it is based on
> statistics stored in the Hive metastore and doesn't take into account
> modifications made to the data files.
> Obviously, setting "hive.compute.query.using.stats" to FALSE avoids this
> problem, but the default value of this property is TRUE.
> I think this post on Stack Overflow, which shows another type of bug in the
> case of multiple inserts, is also related to the one I reported:
> http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
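The repro and workaround above can be sketched as a HiveQL session (illustrative only: the table schema and location are placeholders, and the exact behaviour depends on the Hive version):

```sql
-- Hypothetical external table; schema and location are placeholders.
CREATE EXTERNAL TABLE my_table (id INT) LOCATION '/data/my_location';

-- Persist table stats (row count etc.) in the metastore.
ANALYZE TABLE my_table COMPUTE STATISTICS;

-- ... files under /data/my_location are changed outside of Hive ...

-- May be answered from the (now stale) metastore stats, with no MR job.
SELECT COUNT(*) FROM my_table;

-- Workaround: force a real scan instead of answering from statistics.
SET hive.compute.query.using.stats=false;
SELECT COUNT(*) FROM my_table;
```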
[jira] [Commented] (HIVE-11266) count(*) wrong result based on table statistics
[ https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900851#comment-15900851 ]

Pengcheng Xiong commented on HIVE-11266:
----------------------------------------

I see. We have changed a lot since then; this should already be fixed in recent Hive versions.
[jira] [Commented] (HIVE-11266) count(*) wrong result based on table statistics
[ https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900832#comment-15900832 ]

Simone commented on HIVE-11266:
-------------------------------

It was Hive 1.1.0 in the CDH distribution.
[jira] [Assigned] (HIVE-15468) Enhance the vectorized execution engine to support complex types
[ https://issues.apache.org/jira/browse/HIVE-15468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Teddy Choi reassigned HIVE-15468:
---------------------------------

    Assignee: Teddy Choi

> Enhance the vectorized execution engine to support complex types
> ----------------------------------------------------------------
>
>                 Key: HIVE-15468
>                 URL: https://issues.apache.org/jira/browse/HIVE-15468
>             Project: Hive
>          Issue Type: Improvement
>          Components: Vectorization
>            Reporter: Chao Sun
>            Assignee: Teddy Choi
>
> Currently Hive's vectorized execution engine only supports scalar types, as
> documented here:
> https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution
> To be complete, we should add support for complex types as well.
[jira] [Commented] (HIVE-15001) Remove showConnectedUrl from command line help
[ https://issues.apache.org/jira/browse/HIVE-15001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900826#comment-15900826 ]

Hive QA commented on HIVE-15001:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12834160/HIVE-15001.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10332 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4011/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4011/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4011/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12834160 - PreCommit-HIVE-Build

> Remove showConnectedUrl from command line help
> ----------------------------------------------
>
>                 Key: HIVE-15001
>                 URL: https://issues.apache.org/jira/browse/HIVE-15001
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Beeline
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Trivial
>         Attachments: HIVE-15001.2.patch, HIVE-15001.3.patch, HIVE-15001.patch
>
> As discussed with [~nemon], the showConnectedUrl command-line parameter has
> not been working since an erroneous merge; instead, Beeline always prints the
> currently connected URL.
> Since this is good for everyone, no extra parameter is needed to turn this
> feature on.
[jira] [Comment Edited] (HIVE-10494) hive metastore server can't release its java heap with no work on it
[ https://issues.apache.org/jira/browse/HIVE-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900798#comment-15900798 ]

Zhaofei Meng edited comment on HIVE-10494 at 3/8/17 6:40 AM:
-------------------------------------------------------------

Try adjusting the JVM parameters and set CMSInitiatingOccupancyFraction to 60.

was (Author: 5feixiang):
Try adjusting the JVM parameters and set CMSInitiatingOccupancyFraction smaller.

> hive metastore server can't release its java heap with no work on it
> --------------------------------------------------------------------
>
>                 Key: HIVE-10494
>                 URL: https://issues.apache.org/jira/browse/HIVE-10494
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.13.0
>         Environment: Cloudera CDH 5.2.0
>                      10 nodes
>                      128 GB RAM, 10 TB disk, 32-core CPU per node
>                      using Impala for data analysis
>            Reporter: liqida
>
> I use Impala for data analysis.
> After running for a long time, Impala DDL statements need a long time to
> complete the "Planning finished" and "DML Metastore update finished" steps.
> Both of them take 50 seconds or more.
> I found that the HMS Java heap affected this greatly; when I restarted the
> Hive metastore server, the problem was fixed.
> The HMS JVM options are:
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:-CMSConcurrentMTEnabled
> -XX:CMSInitiatingOccupancyFraction=70
> -XX:+CMSParallelRemarkEnabled
> -XX:+UseCMSCompactAtFullCollection
> -XX:CMSFullGCsBeforeCompaction=0
> -XX:SurvivorRatio=1
> and the total heap size is 3 GB.
> After 3 days or less, I found the old generation was full, and no matter what
> kind of GC I tried, it never worked.
> Then, after the whole workload was done, I ran "jmap -F -histo PID" and found
> this:
> Object Histogram:
> num      #instances   #bytes      Class description
> ---------------------------------------------------
> 1:       3955457      696160432   com.mysql.jdbc.JDBC4ResultSet
> 2:       3942714      630834240   com.mysql.jdbc.StatementImpl
> 3:       4051520      194472960   java.util.HashMap
> 4:       4714330      150858560   java.util.HashMap$Entry
> 5:       3990264      63844224    java.util.HashSet
> 6:       3978657      63658512    java.util.HashMap$KeySet
> 7:       3955458      63463696    com.mysql.jdbc.Field[]
> 8:       3964025      63424400    java.util.concurrent.atomic.AtomicBoolean
> 9:       3961293      63380688    java.lang.Object
> I think this is the cause.
> So, what can I do about this? Should I change some configuration or do
> something to fix it, or does HMS have any cache? Thanks.
> BTW: Hive version 0.13.0; I only use Impala.
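Following the comment above, a sketch of the adjusted JVM options, changing only the occupancy fraction so CMS starts collecting the old generation earlier (the value 60 comes from the comment; whether it helps depends on the workload):

```
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:-CMSConcurrentMTEnabled
-XX:CMSInitiatingOccupancyFraction=60
-XX:+CMSParallelRemarkEnabled
-XX:+UseCMSCompactAtFullCollection
-XX:CMSFullGCsBeforeCompaction=0
-XX:SurvivorRatio=1
```

Note that if the millions of retained JDBC result sets and statements in the histogram are still strongly referenced, no GC setting can reclaim them; the histogram suggests also checking whether those objects are being closed.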
[jira] [Commented] (HIVE-10494) hive metastore server can't release its java heap with no work on it
[ https://issues.apache.org/jira/browse/HIVE-10494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900798#comment-15900798 ]

Zhaofei Meng commented on HIVE-10494:
-------------------------------------

Try adjusting the JVM parameters and set CMSInitiatingOccupancyFraction smaller.
[jira] [Commented] (HIVE-16123) Let user pick the granularity of bucketing and max in row memory
[ https://issues.apache.org/jira/browse/HIVE-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900778#comment-15900778 ]

Hive QA commented on HIVE-16123:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/1285/HIVE-16123.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10332 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=224)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=224)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4010/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4010/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4010/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 1285 - PreCommit-HIVE-Build

> Let user pick the granularity of bucketing and max in row memory
> ----------------------------------------------------------------
>
>                 Key: HIVE-16123
>                 URL: https://issues.apache.org/jira/browse/HIVE-16123
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration
>            Reporter: slim bouguerra
>         Attachments: HIVE-16123.2.patch, HIVE-16123.patch
>
> Currently we index the data with a granularity of NONE, which puts a lot of
> pressure on the indexer.
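For context, segment granularity in the Hive/Druid integration is expressed through table properties; a sketch of what a user-selectable setting might look like (property names assumed from the Druid storage-handler integration, not confirmed by this issue):

```sql
-- Hypothetical example; property names and values are assumptions.
CREATE TABLE druid_pageviews (`__time` TIMESTAMP, page STRING, views BIGINT)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
  "druid.segment.granularity" = "DAY",  -- coarser than NONE, easing indexer pressure
  "druid.query.granularity" = "HOUR"
);
```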
[jira] [Updated] (HIVE-15903) Compute table stats when user computes column stats
[ https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-15903:
-----------------------------------

    Status: Patch Available (was: Open)

> Compute table stats when user computes column stats
> ---------------------------------------------------
>
>                 Key: HIVE-15903
>                 URL: https://issues.apache.org/jira/browse/HIVE-15903
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-15903.01.patch, HIVE-15903.02.patch,
>                      HIVE-15903.03.patch, HIVE-15903.04.patch,
>                      HIVE-15903.05.patch, HIVE-15903.06.patch,
>                      HIVE-15903.07.patch
[jira] [Updated] (HIVE-15903) Compute table stats when user computes column stats
[ https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-15903:
-----------------------------------

    Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-15903) Compute table stats when user computes column stats
[ https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-15903:
-----------------------------------

    Attachment: HIVE-15903.07.patch
[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900728#comment-15900728 ]

Rui Li commented on HIVE-16071:
-------------------------------

Hi [~xuefuz], let me summarise my point: we're talking about two issues here - detecting a disconnection, and reacting to the disconnection. I think the root cause of your example is that we don't react properly (i.e. we don't fail the future) on disconnection. Regarding detecting the disconnection, I suppose we can rely on Netty. The cancelTask is a kind of further insurance in case Netty fails (or takes too long) to detect it.

bq. let cancelTask fail the Future so that Hive stops waiting

Like I mentioned in my proposal, I think SaslHandler is in a better place to do this. SaslHandler is intended for the SASL handshake, and it removes itself from the pipeline once the handshake finishes. Therefore, if SaslHandler detects a disconnection, it means the channel was closed before the handshake finished, and thus we should fail the Future. Do you think it makes sense to open another JIRA for this?

> Spark remote driver misuses the timeout in RPC handshake
> --------------------------------------------------------
>
>                 Key: HIVE-16071
>                 URL: https://issues.apache.org/jira/browse/HIVE-16071
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Chaoyu Tang
>            Assignee: Chaoyu Tang
>         Attachments: HIVE-16071.patch
>
> Based on its property description in HiveConf and the comments in HIVE-12650
> (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979),
> hive.spark.client.connect.timeout is the timeout used when the Spark remote
> driver makes a socket connection (channel) to the RPC server. But currently it
> is also used by the remote driver for the RPC client/server handshake, which
> is not right. Instead, hive.spark.client.server.connect.timeout should be
> used, as it is already used by the RPCServer in the handshake.
> An error like the following is usually caused by this issue, since the default
> hive.spark.client.connect.timeout value (1000ms) used by the remote driver for
> the handshake is a little too short.
> {code}
> 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception:
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
> Client closed before SASL negotiation finished.
> java.util.concurrent.ExecutionException: javax.security.sasl.SaslException:
> Client closed before SASL negotiation finished.
>         at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
>         at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
>         at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL
> negotiation finished.
>         at org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
>         at org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code}
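Until the driver uses the right property, the practical workaround suggested by the description is to widen the window the driver has for the handshake by raising the client connect timeout; the property names come from the issue, while the values below are purely illustrative:

```sql
-- In hive-site.xml or via SET; values are illustrative, not recommendations.
SET hive.spark.client.connect.timeout=30000ms;
SET hive.spark.client.server.connect.timeout=300000ms;
```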
[jira] [Commented] (HIVE-14550) HiveServer2: enable ThriftJDBCBinarySerde use by default
[ https://issues.apache.org/jira/browse/HIVE-14550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900724#comment-15900724 ]

Hive QA commented on HIVE-14550:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12856658/HIVE-14550.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10332 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=224)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119)
org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable (batchId=216)
org.apache.hive.jdbc.TestJdbcDriver2.testResultSetMetaData (batchId=216)
org.apache.hive.jdbc.TestJdbcDriver2.testShowGrant (batchId=216)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testEscapedStrings (batchId=218)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd (batchId=218)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testNonAsciiStrings (batchId=218)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4009/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4009/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4009/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12856658 - PreCommit-HIVE-Build

> HiveServer2: enable ThriftJDBCBinarySerde use by default
> --------------------------------------------------------
>
>                 Key: HIVE-14550
>                 URL: https://issues.apache.org/jira/browse/HIVE-14550
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2, JDBC, ODBC
>    Affects Versions: 2.1.0
>            Reporter: Vaibhav Gumashta
>            Assignee: Ziyang Zhao
>         Attachments: HIVE-14550.1.patch, HIVE-14550.1.patch,
>                      HIVE-14550.2.patch
>
> We've covered all items in HIVE-12427 and created HIVE-14549 for part 2 of
> the effort. Before closing the umbrella JIRA, we should enable this feature
> by default.
[jira] [Commented] (HIVE-16006) Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.
[ https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900706#comment-15900706 ]

Sankar Hariappan commented on HIVE-16006:
-----------------------------------------

Thank you [~sushanth] for the commit!

> Incremental REPL LOAD Inserts doesn't operate on the target database if name
> differs from source database.
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-16006
>                 URL: https://issues.apache.org/jira/browse/HIVE-16006
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>             Fix For: 2.2.0
>
>         Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch,
>                      HIVE-16006.03.patch
>
> During "Incremental Load", the database name given on the command line is not
> taken into account, so the load doesn't happen on the target database. At the
> same time, the database with the original name is modified.
> Steps:
> 1. INSERT INTO default.tbl values (10, 20);
> 2. REPL DUMP default FROM 52;
> 3. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> -- This step modifies the default db instead of replDb.
> ==
> Additional note - this is happening for INSERT events, not other events.
[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900703#comment-15900703 ]

Xuefu Zhang commented on HIVE-16071:
------------------------------------

Hi [~lirui], thanks for your input and experiments. I think we are making some progress towards a conclusion.

{quote}
If no SaslMessage is sent, Hive will still wait for hive.spark.client.server.connect.timeout, even if cancelTask closes the channel after 1s.
{quote}

I'm particularly concerned about cases where Hive takes longer than it needs to detect a problem and return the error to the user. In this case, Hive should know within 1s that the SASL handshake didn't complete. It doesn't make sense to let the user know of the failure only after 1 hr. (1 hr is set to accommodate resource availability, not connection establishment.)

{quote}
the cancelTask only closes the channel, it doesn't set failure to the Future.
{quote}

This is a good observation. Is this another bug that we should fix? That is, let cancelTask fail the Future so that Hive stops waiting until server.connect.timeout elapses. Any further thoughts?

[~ctang.ma], to answer your question, I don't think we need another property. We should use client.connect.timeout as it's also used on the driver side. If the default value is too low, we can bump it up.
[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.
[ https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12274: - Attachment: (was: HIVE-12274.patch) > Increase width of columns used for general configuration in the metastore. > -- > > Key: HIVE-12274 > URL: https://issues.apache.org/jira/browse/HIVE-12274 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Elliot West >Assignee: Naveen Gangam > Labels: metastore > Attachments: HIVE-12274.2.patch, HIVE-12274.3.patch, > HIVE-12274.example.ddl.hql, HIVE-12274.patch > > > h2. Overview > This issue is very similar in principle to HIVE-1364. We are hitting a limit > when processing JSON data that has a large nested schema. The struct > definition is truncated when inserted into the metastore database column > {{COLUMNS_V2.YPE_NAME}} as it is greater than 4000 characters in length. > Given that the purpose of these columns is to hold very loosely defined > configuration values it seems rather limiting to impose such a relatively low > length bound. One can imagine that valid use cases will arise where > reasonable parameter/property values exceed the current limit. > h2. Context > These limitations were in by the [patch > attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799] > to HIVE-1364 which mentions the _"max length on Oracle 9i/10g/11g"_ as the > reason. However, nowadays the limit can be increased because: > * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the > configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. > ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623]) > * Postgres supports a max of 1GB for {{character}} datatype. > ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html]) > * MySQL can support upto 65535 bytes for the entire row. 
So long as the > {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. > ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * SQL Server's {{varchar}} max length is 8000 and can go beyond using > "varchar(max)" with the same limitation as MySQL being 65535 bytes for the > entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * Derby's {{varchar}} can be upto 32672 bytes. > ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html]) > h2. Proposal > Can these columns not use CLOB-like types as for example as used by > {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents > exist for all targeted database platforms: > * MySQL: {{mediumtext}} > * Postgres: {{text}} > * Oracle: {{CLOB}} > * Derby: {{LONG VARCHAR}} > I'd suggest that the candidates for type change are: > * {{COLUMNS_V2.TYPE_NAME}} > * {{TABLE_PARAMS.PARAM_VALUE}} > * {{SERDE_PARAMS.PARAM_VALUE}} > * {{SD_PARAMS.PARAM_VALUE}} > After updating the maximum length the metastore database needs to be > configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} > will update database objects and possibly invalidate them, as follows: > * Tables with virtual columns will be updated with new data type metadata for > virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or > {{RAW(2000)}} type. > * Functional indexes will become unusable if a change to their associated > virtual columns causes the index key to exceed index key length limits. > Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key > length exceeded}}. > * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte > {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns. 
> * Materialized views will be updated with new metadata for {{VARCHAR2(4000)}}, > 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns. > * So the limit could be raised to 32672 bytes, with the caveat that > MySQL and SQL Server limit the row length to 65535 bytes, so that should also > be validated to provide consistency. > Finally, will this limitation persist in the work resulting from HIVE-9452? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
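The proposal above can be illustrated with upgrade DDL along these lines. This is a hedged sketch only: the actual statements shipped in a metastore upgrade script may differ, and the Oracle CLOB migration in particular typically requires a column copy rather than an in-place modify.

```sql
-- MySQL (illustrative): widen the candidate columns to MEDIUMTEXT
ALTER TABLE COLUMNS_V2   MODIFY TYPE_NAME   MEDIUMTEXT;
ALTER TABLE TABLE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
ALTER TABLE SD_PARAMS    MODIFY PARAM_VALUE MEDIUMTEXT;

-- Postgres equivalent (illustrative):
-- ALTER TABLE "TABLE_PARAMS" ALTER COLUMN "PARAM_VALUE" TYPE text;

-- Oracle (illustrative): converting VARCHAR2 to CLOB generally means adding a
-- CLOB column, copying the data across, then dropping/renaming (not shown).
```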
[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.
[ https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12274: - Attachment: (was: HIVE-12274.2.patch) > Increase width of columns used for general configuration in the metastore. > -- > > Key: HIVE-12274 > URL: https://issues.apache.org/jira/browse/HIVE-12274 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Elliot West >Assignee: Naveen Gangam > Labels: metastore > Attachments: HIVE-12274.2.patch, HIVE-12274.3.patch, > HIVE-12274.example.ddl.hql, HIVE-12274.patch > > > h2. Overview > This issue is very similar in principle to HIVE-1364. We are hitting a limit > when processing JSON data that has a large nested schema. The struct > definition is truncated when inserted into the metastore database column > {{COLUMNS_V2.TYPE_NAME}} because it is greater than 4000 characters in length. > Given that the purpose of these columns is to hold very loosely defined > configuration values, it seems rather limiting to impose such a relatively low > length bound. One can imagine valid use cases arising where > reasonable parameter/property values exceed the current limit. > h2. Context > These limitations were put in place by the [patch > attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799] > to HIVE-1364, which mentions the _"max length on Oracle 9i/10g/11g"_ as the > reason. However, nowadays the limit can be increased because: > * Oracle DB's {{varchar2}} now supports 32767 bytes, by setting the > configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. > ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623]) > * Postgres supports a max of 1GB for the {{character}} datatype. > ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html]) > * MySQL can support up to 65535 bytes for the entire row. 
So long as the > {{PARAM_KEY}} value + {{PARAM_VALUE}} length is less than 65535, we should be fine. > ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * SQL Server's {{varchar}} max length is 8000, and can go beyond that using > {{varchar(max)}}, with the same limitation as MySQL of 65535 bytes for the > entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * Derby's {{varchar}} can be up to 32672 bytes. > ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html]) > h2. Proposal > Could these columns not use CLOB-like types, as used for example by > {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents > exist for all targeted database platforms: > * MySQL: {{mediumtext}} > * Postgres: {{text}} > * Oracle: {{CLOB}} > * Derby: {{LONG VARCHAR}} > I'd suggest that the candidates for the type change are: > * {{COLUMNS_V2.TYPE_NAME}} > * {{TABLE_PARAMS.PARAM_VALUE}} > * {{SERDE_PARAMS.PARAM_VALUE}} > * {{SD_PARAMS.PARAM_VALUE}} > After updating the maximum length, the metastore database needs to be > configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} > will update database objects and possibly invalidate them, as follows: > * Tables with virtual columns will be updated with new data type metadata for > virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or > {{RAW(2000)}} type. > * Functional indexes will become unusable if a change to their associated > virtual columns causes the index key to exceed index key length limits. > Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key > length exceeded}}. > * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte > {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns. 
> * Materialized views will be updated with new metadata for {{VARCHAR2(4000)}}, > 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns. > * So the limit could be raised to 32672 bytes, with the caveat that > MySQL and SQL Server limit the row length to 65535 bytes, so that should also > be validated to provide consistency. > Finally, will this limitation persist in the work resulting from HIVE-9452? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.
[ https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12274: - Attachment: (was: HIVE-12274.patch) > Increase width of columns used for general configuration in the metastore. > -- > > Key: HIVE-12274 > URL: https://issues.apache.org/jira/browse/HIVE-12274 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Elliot West >Assignee: Naveen Gangam > Labels: metastore > Attachments: HIVE-12274.2.patch, HIVE-12274.3.patch, > HIVE-12274.example.ddl.hql, HIVE-12274.patch > > > h2. Overview > This issue is very similar in principle to HIVE-1364. We are hitting a limit > when processing JSON data that has a large nested schema. The struct > definition is truncated when inserted into the metastore database column > {{COLUMNS_V2.TYPE_NAME}} because it is greater than 4000 characters in length. > Given that the purpose of these columns is to hold very loosely defined > configuration values, it seems rather limiting to impose such a relatively low > length bound. One can imagine valid use cases arising where > reasonable parameter/property values exceed the current limit. > h2. Context > These limitations were put in place by the [patch > attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799] > to HIVE-1364, which mentions the _"max length on Oracle 9i/10g/11g"_ as the > reason. However, nowadays the limit can be increased because: > * Oracle DB's {{varchar2}} now supports 32767 bytes, by setting the > configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. > ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623]) > * Postgres supports a max of 1GB for the {{character}} datatype. > ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html]) > * MySQL can support up to 65535 bytes for the entire row. 
So long as the > {{PARAM_KEY}} value + {{PARAM_VALUE}} length is less than 65535, we should be fine. > ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * SQL Server's {{varchar}} max length is 8000, and can go beyond that using > {{varchar(max)}}, with the same limitation as MySQL of 65535 bytes for the > entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * Derby's {{varchar}} can be up to 32672 bytes. > ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html]) > h2. Proposal > Could these columns not use CLOB-like types, as used for example by > {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents > exist for all targeted database platforms: > * MySQL: {{mediumtext}} > * Postgres: {{text}} > * Oracle: {{CLOB}} > * Derby: {{LONG VARCHAR}} > I'd suggest that the candidates for the type change are: > * {{COLUMNS_V2.TYPE_NAME}} > * {{TABLE_PARAMS.PARAM_VALUE}} > * {{SERDE_PARAMS.PARAM_VALUE}} > * {{SD_PARAMS.PARAM_VALUE}} > After updating the maximum length, the metastore database needs to be > configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} > will update database objects and possibly invalidate them, as follows: > * Tables with virtual columns will be updated with new data type metadata for > virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or > {{RAW(2000)}} type. > * Functional indexes will become unusable if a change to their associated > virtual columns causes the index key to exceed index key length limits. > Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key > length exceeded}}. > * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte > {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns. 
> * Materialized views will be updated with new metadata for {{VARCHAR2(4000)}}, > 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns. > * So the limit could be raised to 32672 bytes, with the caveat that > MySQL and SQL Server limit the row length to 65535 bytes, so that should also > be validated to provide consistency. > Finally, will this limitation persist in the work resulting from HIVE-9452? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16115) Stop printing progress info from operation logs with beeline progress bar
[ https://issues.apache.org/jira/browse/HIVE-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anishek updated HIVE-16115: --- Attachment: HIVE-16115.3.patch Patch after rebasing on master from upstream, as the merge was failing in the previous Apache build. > Stop printing progress info from operation logs with beeline progress bar > - > > Key: HIVE-16115 > URL: https://issues.apache.org/jira/browse/HIVE-16115 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 2.2.0 >Reporter: anishek >Assignee: anishek >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16115.1.patch, HIVE-16115.2.patch, > HIVE-16115.3.patch > > > When the progress bar is enabled, we should not print the progress information > via the operation logs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
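For context, the behavior above involves the interaction of the in-place progress bar with operation-log output. A hedged sketch of the relevant session settings (property names should be verified against the HiveConf of the release in use):

```sql
-- Enable the in-place progress bar rendered by Beeline (HiveServer2 side).
SET hive.server2.in.place.progress=true;
-- Operation-log verbosity; the duplicated progress lines this issue
-- suppresses are emitted through this logging channel.
SET hive.server2.logging.operation.level=EXECUTION;
```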
[jira] [Comment Edited] (HIVE-12492) MapJoin: 4 million unique integers seems to be a probe plateau
[ https://issues.apache.org/jira/browse/HIVE-12492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887496#comment-15887496 ] Lefty Leverenz edited comment on HIVE-12492 at 3/8/17 4:12 AM: --- Doc note: This adds *hive.auto.convert.join.hashtable.max.entries* to HiveConf.java, so it needs to be documented in the wiki. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] Added a TODOC2.2 label. Edit (7/Mar/17): HIVE-16137 changes the default value to 40,000,000 so that's what should be documented in the wiki. Typo alert: In the parameter description, "does not take affect" should be "does not take effect." This can be corrected in the wiki. was (Author: le...@hortonworks.com): Doc note: This adds *hive.auto.convert.join.hashtable.max.entries* to HiveConf.java, so it needs to be documented in the wiki. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] Added a TODOC2.2 label. Typo alert: In the parameter description, "does not take affect" should be "does not take effect." This can be corrected in the wiki. > MapJoin: 4 million unique integers seems to be a probe plateau > -- > > Key: HIVE-12492 > URL: https://issues.apache.org/jira/browse/HIVE-12492 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Affects Versions: 1.3.0, 1.2.1, 2.0.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-12492.01.patch, HIVE-12492.02.patch, > HIVE-12492.patch > > > After 4 million keys, the map-join implementation seems to suffer from a > performance degradation. > The hashtable build & probe time makes this very inefficient, even if the > data is very compact (i.e 2 ints). 
> Falling back onto the shuffle join or bucket map-join is useful after 2^22 > items. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
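The plateau described above is what the new *hive.auto.convert.join.hashtable.max.entries* parameter guards against. A session-level sketch (values illustrative; the -1 behavior is as described in the HiveConf parameter text):

```sql
-- Cap the estimated number of hashtable entries for which a map join
-- is still auto-selected; beyond this, fall back to a shuffle join.
-- (HIVE-16137 later raised the default from 4,000,000 to 40,000,000.)
SET hive.auto.convert.join.hashtable.max.entries=40000000;
-- A value of -1 disables the limit entirely.
```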
[jira] [Commented] (HIVE-16137) Default value of hive config hive.auto.convert.join.hashtable.max.entries should be set to 40m instead of 4m
[ https://issues.apache.org/jira/browse/HIVE-16137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900666#comment-15900666 ] Lefty Leverenz commented on HIVE-16137: --- Doc note: This changes the default value of *hive.auto.convert.join.hashtable.max.entries*, which was created by HIVE-12492 (also for release 2.2.0) and is not yet documented in the wiki. I'll update the doc note on HIVE-12492. Added a TODOC2.2 label. > Default value of hive config hive.auto.convert.join.hashtable.max.entries > should be set to 40m instead of 4m > > > Key: HIVE-16137 > URL: https://issues.apache.org/jira/browse/HIVE-16137 > Project: Hive > Issue Type: Improvement > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Nita Dembla >Assignee: Jesus Camacho Rodriguez > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-16137.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16137) Default value of hive config hive.auto.convert.join.hashtable.max.entries should be set to 40m instead of 4m
[ https://issues.apache.org/jira/browse/HIVE-16137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-16137: -- Labels: TODOC2.2 (was: ) > Default value of hive config hive.auto.convert.join.hashtable.max.entries > should be set to 40m instead of 4m > > > Key: HIVE-16137 > URL: https://issues.apache.org/jira/browse/HIVE-16137 > Project: Hive > Issue Type: Improvement > Components: Configuration >Affects Versions: 2.2.0 >Reporter: Nita Dembla >Assignee: Jesus Camacho Rodriguez > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-16137.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16064) Allow ALL set quantifier with aggregate functions
[ https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900650#comment-15900650 ] Lefty Leverenz commented on HIVE-16064: --- Doc note: This should be documented in the wiki, with version information. * [LanguageManualSelect -- ALL and DISTINCT Clauses | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-ALLandDISTINCTClauses] * [Hive Operators and UDFs -- Built-in Aggregate Functions (UDAF) | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inAggregateFunctions(UDAF)] Added a TODOC2.2 label. > Allow ALL set quantifier with aggregate functions > - > > Key: HIVE-16064 > URL: https://issues.apache.org/jira/browse/HIVE-16064 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-16064.1.patch, HIVE-16064.2.patch > > > SQL:2011 allows ALL with aggregate functions which is > equivalent to aggregate function without ALL. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
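The SQL:2011 equivalence described above can be seen in a small example (table and column names are hypothetical):

```sql
-- With this change, the following two queries are equivalent:
SELECT count(ALL amount) FROM orders;   -- explicit ALL set quantifier
SELECT count(amount)     FROM orders;   -- ALL is the implicit default
-- By contrast, DISTINCT changes the semantics:
SELECT count(DISTINCT amount) FROM orders;
```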
[jira] [Commented] (HIVE-13567) Auto-gather column stats - phase 2
[ https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900646#comment-15900646 ] Hive QA commented on HIVE-13567: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12856659/HIVE-13567.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 393 failed/errored test(s), 10332 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=220) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=220) org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver (batchId=51) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23] (batchId=13) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23] (batchId=63) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] (batchId=24) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_add_partition] (batchId=16) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] (batchId=24) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_table_null_partition] (batchId=75) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_filter] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby] (batchId=45) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join14] (batchId=14) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join17] (batchId=75) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join19] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join19_inclause] (batchId=16) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join1] (batchId=71) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=67) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join26] (batchId=13) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join2] (batchId=59) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join3] (batchId=75) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join4] (batchId=65) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join5] (batchId=67) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join6] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join7] (batchId=24) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join8] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join9] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_reordering_values] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_13] (batchId=59) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[binary_output_format] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket1] (batchId=39) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket2] (batchId=46) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket3] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark1] (batchId=63) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark2] (batchId=2) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark3] (batchId=41) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark4] (batchId=1) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin13] (batchId=37) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin5] (batchId=77) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative2] (batchId=63) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative] (batchId=21) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_1] (batchId=56) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_3] (batchId=71) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_4] (batchId=23) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_5] (batchId=53) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_8] (batchId=4) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[case_sensitivity] (batchId=62) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cast1] (batchId=69) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_annotate_stats_groupby] (batchId=78) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join17] (batchId=24)
[jira] [Commented] (HIVE-16006) Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.
[ https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900642#comment-15900642 ] Sushanth Sowmyan commented on HIVE-16006: - (Forgot to explicitly mention in the previous comment: this patch has my +1. :) ) > Incremental REPL LOAD Inserts doesn't operate on the target database if name > differs from source database. > -- > > Key: HIVE-16006 > URL: https://issues.apache.org/jira/browse/HIVE-16006 > Project: Hive > Issue Type: Sub-task > Components: repl >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Fix For: 2.2.0 > > Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch, > HIVE-16006.03.patch > > > During "Incremental Load", the database name given on the command line is not > taken into account, so the load does not happen on the target database. At the > same time, the database with the original name is getting modified. > Steps: > 1. INSERT INTO default.tbl values (10, 20); > 2. REPL DUMP default FROM 52; > 3. REPL LOAD replDb FROM '/tmp/dump/1487588522621'; > – This step modifies the default Db instead of replDb. > == > Additional note - this is happening for INSERT events, not other events. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-16066) NPE in ExplainTask
[ https://issues.apache.org/jira/browse/HIVE-16066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong resolved HIVE-16066. Resolution: Fixed > NPE in ExplainTask > -- > > Key: HIVE-16066 > URL: https://issues.apache.org/jira/browse/HIVE-16066 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Pengcheng Xiong >Priority: Minor > > {noformat} > 2017-02-28T20:05:13,412 WARN [ATS Logger 0] hooks.ATSHook: Failed to submit > plan to ATS for user_20170228200511_b05d6eaf-7599-4539-919c-5d3df8658c99 > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:803) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > 
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:658) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:984) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:592) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:970) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > 
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1059) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1203) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:306) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:251) >
[jira] [Assigned] (HIVE-16066) NPE in ExplainTask
[ https://issues.apache.org/jira/browse/HIVE-16066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-16066: -- Assignee: Pengcheng Xiong > NPE in ExplainTask > -- > > Key: HIVE-16066 > URL: https://issues.apache.org/jira/browse/HIVE-16066 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Pengcheng Xiong >Priority: Minor > > {noformat} > 2017-02-28T20:05:13,412 WARN [ATS Logger 0] hooks.ATSHook: Failed to submit > plan to ATS for user_20170228200511_b05d6eaf-7599-4539-919c-5d3df8658c99 > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:803) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) 
> ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:817) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:658) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:984) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:592) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:970) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:691) > 
~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1059) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1203) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:306) > ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:251) >
[jira] [Updated] (HIVE-16006) Incremental REPL LOAD Inserts doesn't operate on the target database if name differs from source database.
[ https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-16006: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master. Thanks, [~sankarh] > Incremental REPL LOAD Inserts doesn't operate on the target database if name > differs from source database. > -- > > Key: HIVE-16006 > URL: https://issues.apache.org/jira/browse/HIVE-16006 > Project: Hive > Issue Type: Sub-task > Components: repl >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Fix For: 2.2.0 > > Attachments: HIVE-16006.01.patch, HIVE-16006.02.patch, > HIVE-16006.03.patch > > > During "Incremental Load", the database name given on the command line is not > taken into account, so the load does not happen on the target database. At the > same time, the database with the original name is getting modified. > Steps: > 1. INSERT INTO default.tbl values (10, 20); > 2. REPL DUMP default FROM 52; > 3. REPL LOAD replDb FROM '/tmp/dump/1487588522621'; > – This step modifies the default Db instead of replDb. > == > Additional note - this is happening for INSERT events, not other events. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
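The reproduction steps quoted above amount to the following sequence (a sketch; the dump path is the one from the report and is environment-specific). With the fix, step 3 applies the replicated INSERT events to {{replDb}} rather than silently modifying the source database:

```sql
INSERT INTO default.tbl VALUES (10, 20);         -- 1. generate an INSERT event
REPL DUMP default FROM 52;                       -- 2. incremental dump of 'default'
REPL LOAD replDb FROM '/tmp/dump/1487588522621'; -- 3. must now target 'replDb'
```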
[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.
[ https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12274: - Status: Patch Available (was: Open) I am attaching a new patch that adds a couple more fixes: one to MetaStoreUtils, where it checks the length of the column type name (this check needs to be removed), and a second to the QTestUtil file that bulk-loads some data into Derby. > Increase width of columns used for general configuration in the metastore. > -- > > Key: HIVE-12274 > URL: https://issues.apache.org/jira/browse/HIVE-12274 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Elliot West >Assignee: Naveen Gangam > Labels: metastore > Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, HIVE-12274.3.patch, HIVE-12274.example.ddl.hql, HIVE-12274.patch, HIVE-12274.patch, HIVE-12274.patch > > > h2. Overview > This issue is very similar in principle to HIVE-1364. We are hitting a limit when processing JSON data that has a large nested schema. The struct definition is truncated when inserted into the metastore database column {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length. > Given that the purpose of these columns is to hold very loosely defined configuration values, it seems rather limiting to impose such a relatively low length bound. One can imagine valid use cases arising where reasonable parameter/property values exceed the current limit. > h2. Context > These limitations were put in place by the [patch attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799] to HIVE-1364, which mentions the _"max length on Oracle 9i/10g/11g"_ as the reason. However, nowadays the limit can be increased because: > * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}.
> ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623]) > * Postgres supports a max of 1GB for the {{character}} datatype. > ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html]) > * MySQL can support up to 65535 bytes for the entire row. So long as the {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. > ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * SQL Server's {{varchar}} max length is 8000, and can go beyond that using "varchar(max)", with the same limitation as MySQL of 65535 bytes for the entire row. > ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * Derby's {{varchar}} can be up to 32672 bytes. > ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html]) > h2. Proposal > Could these columns not use CLOB-like types, as used for example by {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents exist for all targeted database platforms: > * MySQL: {{mediumtext}} > * Postgres: {{text}} > * Oracle: {{CLOB}} > * Derby: {{LONG VARCHAR}} > I'd suggest that the candidates for type change are: > * {{COLUMNS_V2.TYPE_NAME}} > * {{TABLE_PARAMS.PARAM_VALUE}} > * {{SERDE_PARAMS.PARAM_VALUE}} > * {{SD_PARAMS.PARAM_VALUE}} > After updating the maximum length, the metastore database needs to be configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} will update database objects and possibly invalidate them, as follows: > * Tables with virtual columns will be updated with new data type metadata for virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or {{RAW(2000)}} type. > * Functional indexes will become unusable if a change to their associated virtual columns causes the index key to exceed index key length limits. Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key length exceeded}}.
> * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns. > * Materialized views will be updated with new metadata for {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns. > So the limitation could be raised to 32672 bytes, with the caveat that MySQL and SQL Server limit the row length to 65535 bytes, so that should also be validated to provide consistency. > Finally, will this limitation persist in the work resulting from HIVE-9452? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
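The per-database limits quoted in the description suggest a small consistency check before raising the column widths. The sketch below encodes the numbers from the discussion (Derby's 32672-byte {{varchar}} ceiling and MySQL/SQL Server's 65535-byte row limit) as stated assumptions; it is illustrative, not part of Hive's metastore code:

```java
public class ParamWidthCheck {
    // Limits quoted in the JIRA discussion, treated here as assumptions:
    // Derby VARCHAR tops out at 32672 bytes; MySQL and SQL Server cap the
    // whole row (key + value) at 65535 bytes.
    static final int MAX_PARAM_VALUE = 32672;
    static final int MAX_ROW_BYTES = 65535;

    // True if a PARAM_KEY/PARAM_VALUE pair would fit under both limits,
    // assuming single-byte characters for simplicity.
    static boolean fits(String paramKey, String paramValue) {
        return paramValue.length() <= MAX_PARAM_VALUE
                && paramKey.length() + paramValue.length() < MAX_ROW_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(fits("serialization.format", "1"));
    }
}
```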
[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.
[ https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12274: - Attachment: HIVE-12274.3.patch > Increase width of columns used for general configuration in the metastore. > -- > > Key: HIVE-12274 > URL: https://issues.apache.org/jira/browse/HIVE-12274 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Elliot West >Assignee: Naveen Gangam > Labels: metastore > Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, HIVE-12274.3.patch, HIVE-12274.example.ddl.hql, HIVE-12274.patch, HIVE-12274.patch, HIVE-12274.patch
[jira] [Updated] (HIVE-12274) Increase width of columns used for general configuration in the metastore.
[ https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12274: - Status: Open (was: Patch Available) > Increase width of columns used for general configuration in the metastore. > -- > > Key: HIVE-12274 > URL: https://issues.apache.org/jira/browse/HIVE-12274 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Elliot West >Assignee: Naveen Gangam > Labels: metastore > Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, HIVE-12274.example.ddl.hql, HIVE-12274.patch, HIVE-12274.patch, HIVE-12274.patch
[jira] [Commented] (HIVE-12274) Increase width of columns used for general configuration in the metastore.
[ https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900633#comment-15900633 ] Naveen Gangam commented on HIVE-12274: -- I am currently investigating the test failures above from the TestPerfCliDriver. All failures appear to come from differences in the qtest output, and I have narrowed down the cause. The expected output assumes CBO is enabled; the actual output is the result of CBO being disabled because the test is unable to bulk-load data into the HMS metastore. This bulk loader uses a little-known feature in Derby to import the data from a txt file. Since we are changing the type of TABLE_PARAMS.PARAM_VALUE to CLOB, the format of the data needs to be different. Looking at the code, the CLOB column data needs to be separated into its own file, and the original data file needs to carry the filename, offset, and length of the data to read. This is my understanding based on reading the code at http://people.apache.org/~kristwaa/jacoco/org.apache.derby.impl.load/ImportLobFile.java.html I have been able to get past the initial failure, but CBO fails further along without a clear message. [~thejas] Who can I approach to understand these CBO failures? Thanks > Increase width of columns used for general configuration in the metastore. > -- > > Key: HIVE-12274 > URL: https://issues.apache.org/jira/browse/HIVE-12274 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Elliot West >Assignee: Naveen Gangam > Labels: metastore > Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, HIVE-12274.example.ddl.hql, HIVE-12274.patch, HIVE-12274.patch, HIVE-12274.patch
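Based on the reading of ImportLobFile described in the comment, the main import file is assumed to reference external LOB data as `<lobFileName>.<offset>.<length>/`. A sketch of building such a reference — the format string is an assumption inferred from that source reading, not verified against Derby's documentation:

```java
public class DerbyLobRef {
    // Builds the reference Derby's bulk import is assumed to expect in the
    // main data file when LOB data lives in a separate file:
    // "<lobFileName>.<offset>.<length>/". Treat the format as an assumption.
    static String lobRef(String lobFile, long offset, long length) {
        return lobFile + "." + offset + "." + length + "/";
    }

    public static void main(String[] args) {
        // Hypothetical LOB file name; the qtest data file would carry this
        // string in the TABLE_PARAMS.PARAM_VALUE column position.
        System.out.println(lobRef("params_lobs.dat", 0, 4096));
    }
}
```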
[jira] [Updated] (HIVE-15212) merge branch into master
[ https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15212: Attachment: HIVE-15212.patch The branch is not ready for merge due to ACID merge being in progress... to make some preliminary progress on the merge, attaching the current branch patch to see what non-MM (or MM) tests would need to be fixed after fixing all the MM issues discovered in HIVE-14990 cc [~wzheng] fyi > merge branch into master > > > Key: HIVE-15212 > URL: https://issues.apache.org/jira/browse/HIVE-15212 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15212.patch > > > Filing the JIRA now; accidentally attached the merge patch somewhere, so I > will post the test results analysis here. We will re-run the tests here later. > Relevant q file failures: > load_dyn_part1, autoColumnStats_2 and _1, escape2, load_dyn_part2, > dynpart_sort_opt_vectorization, orc_createas1, combine3, update_tmp_table, > delete_where_non_partitioned, delete_where_no_match, update_where_no_match, > update_where_non_partitioned, update_all_types > I suspect many ACID failures are due to incomplete ACID type patch. > Also need to revert the pom change from spark test pom, that seems to break > Spark tests. I had it temporarily to get rid of the long non-maven download > in all cases (there's a separate JIRA for that) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900627#comment-15900627 ] Rui Li commented on HIVE-16071: --- Hi [~xuefuz], in your example, if the SASL handshake doesn't finish in time, the client side will exit after 1s. Even if netty can't detect the disconnection immediately, I don't think it takes 1h to detect it. Besides, the cancelTask only closes the channel; it doesn't set a failure on the Future. Therefore we can't really rely on the cancelTask to stop the waiting. My proposal is: # We need to reliably detect disconnection. I think netty is good enough for this (maybe with some reasonable delay). But I'm also OK with keeping the cancelTask to close the channel ourselves. # We need to reliably cancel the Future when disconnection is detected. This can be done in the SaslHandler, which monitors the channel inactive event. I also did some tests to verify. I modified the client code so that it makes the connection but doesn't finish the SASL handshake. I tried two ways to do this: one is that the client never sends the SaslMessage, the other is that the client sends the SaslMessage and then just exits. The test is done in yarn-cluster mode. # If no SaslMessage is sent, Hive will still wait for {{hive.spark.client.server.connect.timeout}}, even if the cancelTask closes the channel after 1s. # If the SaslMessage is sent, SaslHandler will detect the disconnection and cancel the Future, no matter whether the cancelTask fires or not. Of course, this requires netty to detect the disconnection. 
> Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote > driver makes a socket connection (channel) to RPC server. But currently it is > also used by the remote driver for RPC client/server handshaking, which is > not right. Instead, hive.spark.client.server.connect.timeout should be used > and it has already been used by the RPCServer in the handshaking. > The error like following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. 
> at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
> at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
> at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
> Caused by: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
> at org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
> at org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
> {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
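The proposal in the comment above — fail the pending handshake Future as soon as the channel goes inactive, so waiters stop immediately instead of running out the full timeout — can be sketched without netty. `onChannelInactive` below stands in for the `SaslHandler.channelInactive` hook; this is an illustration of the idea, not the committed fix:

```java
import java.util.concurrent.CompletableFuture;

public class DisconnectCancelDemo {
    // When the channel goes inactive before SASL negotiation completes,
    // fail the pending handshake future so anyone blocked on it returns early.
    static void onChannelInactive(CompletableFuture<?> handshake) {
        handshake.completeExceptionally(new IllegalStateException(
                "Client closed before SASL negotiation finished."));
    }

    public static void main(String[] args) {
        CompletableFuture<String> handshake = new CompletableFuture<>();
        onChannelInactive(handshake);
        // The future is now failed; a blocked get() would throw immediately.
        System.out.println(handshake.isCompletedExceptionally());
    }
}
```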
[jira] [Assigned] (HIVE-15212) merge branch into master
[ https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-15212: --- Assignee: Sergey Shelukhin > merge branch into master > > > Key: HIVE-15212 > URL: https://issues.apache.org/jira/browse/HIVE-15212 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin
[jira] [Resolved] (HIVE-16038) MM tables: fix (or disable) inferring buckets
[ https://issues.apache.org/jira/browse/HIVE-16038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-16038. - Resolution: Fixed Fix Version/s: hive-14535 Would be very easy to fix for a particular MM ID, but there's no guarantee that other MM IDs would conform to the inferred buckets, so I added comments and warnings and let it continue to fail (by discarding the inferred data, as it does already when the job doesn't produce the requisite number of files for a partition, see _dyn_part test). I suspect similar issues may affect ACID tables and any other nested directory cases (and some overwrites?). If somebody cares about this feature it should be easy to fix based on the comment added in the patch. > MM tables: fix (or disable) inferring buckets > - > > Key: HIVE-16038 > URL: https://issues.apache.org/jira/browse/HIVE-16038 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: hive-14535 > > > The following tests on minimr produce diffs if all tables are changed to MM: > {noformat} > infer_bucket_sort_dyn_part > infer_bucket_sort_num_buckets > infer_bucket_sort_merge > infer_bucket_sort_reducers_power_two > {noformat} > Some of these disable strict checks for bucketing load, which wouldn't work > by design; the rest should work. Either that, or we should disable this for > MM tables - seems like an obscure feature. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16107) JDBC: HttpClient should retry one more time on NoHttpResponseException
[ https://issues.apache.org/jira/browse/HIVE-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900619#comment-15900619 ] Vaibhav Gumashta commented on HIVE-16107: - [~daijy] [~sushanth] Can you please take a look? Thanks > JDBC: HttpClient should retry one more time on NoHttpResponseException > -- > > Key: HIVE-16107 > URL: https://issues.apache.org/jira/browse/HIVE-16107 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 2.0.1, 2.1.1 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-16107.1.patch > > > Hive's JDBC client in HTTP transport mode doesn't retry on NoHttpResponseException. We've seen the exception surface to the JDBC end user when Knox is used as the proxy: Knox upgraded its jetty version, which uses a smaller value for the jetty connector idle timeout and as a result closes the HTTP connection on the server side. The next JDBC query on the client then throws a NoHttpResponseException. Subsequent queries reconnect, but the JDBC driver should ideally handle this by retrying. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
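The suggested behavior — retry exactly once when the server has silently closed an idle connection — can be sketched in plain Java. In Apache HttpClient this would normally be wired up through an `HttpRequestRetryHandler`; the nested exception class below is a stand-in for `org.apache.http.NoHttpResponseException`, so the sketch stays self-contained:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryOnce {
    // Stand-in for Apache HttpClient's org.apache.http.NoHttpResponseException,
    // thrown when the server closed the connection without responding.
    static class NoHttpResponseException extends IOException {
        NoHttpResponseException(String m) { super(m); }
    }

    // Execute the request; on NoHttpResponseException retry exactly one more
    // time (the stale connection is gone, so the retry gets a fresh one).
    static <T> T call(Callable<T> request) throws Exception {
        try {
            return request.call();
        } catch (NoHttpResponseException e) {
            return request.call();
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] attempts = {0};
        String result = call(() -> {
            if (attempts[0]++ == 0) {
                throw new NoHttpResponseException("server closed idle connection");
            }
            return "ok";
        });
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

A second consecutive NoHttpResponseException still propagates, which matches the "one more time" behavior the issue title asks for.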
[jira] [Updated] (HIVE-16107) JDBC: HttpClient should retry one more time on NoHttpResponseException
[ https://issues.apache.org/jira/browse/HIVE-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-16107: Attachment: HIVE-16107.1.patch > JDBC: HttpClient should retry one more time on NoHttpResponseException > -- > > Key: HIVE-16107 > URL: https://issues.apache.org/jira/browse/HIVE-16107 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 2.0.1, 2.1.1 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-16107.1.patch
[jira] [Updated] (HIVE-16107) JDBC: HttpClient should retry one more time on NoHttpResponseException
[ https://issues.apache.org/jira/browse/HIVE-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-16107: Status: Patch Available (was: Open) > JDBC: HttpClient should retry one more time on NoHttpResponseException > -- > > Key: HIVE-16107 > URL: https://issues.apache.org/jira/browse/HIVE-16107 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 2.1.1, 2.0.1 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-16107.1.patch
[jira] [Commented] (HIVE-16071) Spark remote driver misuses the timeout in RPC handshake
[ https://issues.apache.org/jira/browse/HIVE-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900607#comment-15900607 ] Chaoyu Tang commented on HIVE-16071: I agree with [~xuefuz] that we need a timeout for SASL handshaking at the RPC server side for the case he raised. This timeout should be shorter than client.server.connect.timeout used by RegisterClient, but ideally I think it should be a little longer than the client.connect.timeout used by RemoteDriver handshaking, so that we can avoid a server-initiated handshake timeout, given that starting a RemoteDriver is quite expensive. If so, I would suggest we introduce a new configuration like hive.spark.rpc.handshake.server.timeout, and rename hive.spark.client.connect.timeout to hive.spark.rpc.handshake.client.timeout (though it is also used as the socket connect timeout at the RemoteDriver side, like now). Also, hive.spark.client.server.connect.timeout could be renamed to something like hive.spark.register.remote.driver.timeout if necessary. What do you guys think about it? > Spark remote driver misuses the timeout in RPC handshake > > > Key: HIVE-16071 > URL: https://issues.apache.org/jira/browse/HIVE-16071 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-16071.patch > > > Based on its property description in HiveConf and the comments in HIVE-12650 > (https://issues.apache.org/jira/browse/HIVE-12650?focusedCommentId=15128979=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15128979), > hive.spark.client.connect.timeout is the timeout when the spark remote driver makes a socket connection (channel) to the RPC server. But currently it is also used by the remote driver for RPC client/server handshaking, which is not right. Instead, hive.spark.client.server.connect.timeout should be used, and it has already been used by the RPCServer in the handshaking. 
> An error like the following is usually caused by this issue, since the default > hive.spark.client.connect.timeout value (1000ms) used by the remote driver for > handshaking is a little too short. > {code} > 17/02/20 08:46:08 ERROR yarn.ApplicationMaster: User class threw exception: > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: > Client closed before SASL negotiation finished. > at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.RemoteDriver.(RemoteDriver.java:156) > at > org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542) > Caused by: javax.security.sasl.SaslException: Client closed before SASL > negotiation finished. > at > org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453) > at > org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
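The renaming proposed in the comment above could look like the following hive-site.xml fragment. Note that hive.spark.rpc.handshake.server.timeout, hive.spark.rpc.handshake.client.timeout, and hive.spark.register.remote.driver.timeout are only names proposed in this discussion, not existing Hive properties, and the values are purely illustrative — they just preserve the suggested ordering (client handshake < server handshake < driver registration):

```xml
<!-- Proposed (not yet existing) timeout properties from this discussion; values are illustrative. -->
<property>
  <name>hive.spark.rpc.handshake.client.timeout</name>
  <value>5000ms</value>
  <!-- would replace hive.spark.client.connect.timeout; also still the socket connect timeout -->
</property>
<property>
  <name>hive.spark.rpc.handshake.server.timeout</name>
  <value>10000ms</value>
  <!-- new server-side SASL handshake timeout; a little longer than the client handshake timeout -->
</property>
<property>
  <name>hive.spark.register.remote.driver.timeout</name>
  <value>90000ms</value>
  <!-- would replace hive.spark.client.server.connect.timeout; the longest of the three -->
</property>
```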
[jira] [Commented] (HIVE-16142) ATSHook NPE via LLAP
[ https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900603#comment-15900603 ] Rajesh Balamohan commented on HIVE-16142: - Is HIVE-16066 similar to this one? > ATSHook NPE via LLAP > > > Key: HIVE-16142 > URL: https://issues.apache.org/jira/browse/HIVE-16142 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16142.01.patch > > > Exceptions in the log of the form: > 2017-03-06T15:42:30,046 WARN [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(318)) - Failed to submit to ATS for > hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) > ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16144) CompactionInfo doesn't have equals/hashCode but used in Set
[ https://issues.apache.org/jira/browse/HIVE-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-16144: - > CompactionInfo doesn't have equals/hashCode but used in Set > --- > > Key: HIVE-16144 > URL: https://issues.apache.org/jira/browse/HIVE-16144 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > CompactionTxnHandler.findPotentialCompactions() uses a Set > but CompactionInfo doesn't have equals/hashCode. > The new equals/hashCode should be consistent with CompactionInfo.compareTo(). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
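A minimal sketch of the fix HIVE-16144 describes — value-based equals/hashCode so that a HashSet deduplicates correctly. The field set (dbname, tableName, partName) is assumed here from the mention of CompactionInfo.compareTo(); the real Hive class carries more state:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Simplified stand-in for CompactionInfo; the field set is an assumption, not Hive's actual class.
public class CompactionInfoSketch {
    final String dbname;
    final String tableName;
    final String partName; // may be null for unpartitioned tables

    public CompactionInfoSketch(String dbname, String tableName, String partName) {
        this.dbname = dbname;
        this.tableName = tableName;
        this.partName = partName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CompactionInfoSketch)) return false;
        CompactionInfoSketch other = (CompactionInfoSketch) o;
        // Same identity as the (assumed) compareTo ordering: db, table, partition.
        return Objects.equals(dbname, other.dbname)
            && Objects.equals(tableName, other.tableName)
            && Objects.equals(partName, other.partName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(dbname, tableName, partName);
    }

    public static void main(String[] args) {
        Set<CompactionInfoSketch> set = new HashSet<>();
        set.add(new CompactionInfoSketch("db", "t", "p=1"));
        set.add(new CompactionInfoSketch("db", "t", "p=1")); // duplicate is now dropped
        if (set.size() != 1) throw new AssertionError("expected deduplication");
    }
}
```

Without these overrides, HashSet falls back to Object identity, so a method like findPotentialCompactions() could return the same compaction target more than once.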
[jira] [Commented] (HIVE-15903) Compute table stats when user computes column stats
[ https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900596#comment-15900596 ] Hive QA commented on HIVE-15903: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12856657/HIVE-15903.06.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 10332 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[column_table_stats_orc] (batchId=6) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=138) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_stats] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llapdecider] (batchId=136) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[alter_table_invalidate_column_stats] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnStatsUpdateForStatsOptimizer_1] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[column_table_stats] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[deleteAnalyze] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[drop_partition_with_stats] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[extrapolate_part_stats_partial_ndv] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries_with_filters] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_stats] (batchId=150) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[special_character_in_tabnames_1] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_only_null] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_remove_26] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join1] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join2] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join3] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join4] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join5] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_semijoin_reduction2] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_semijoin_reduction] (batchId=140) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=225) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=120) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4007/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4007/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4007/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 27 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12856657 - PreCommit-HIVE-Build > Compute table stats when user computes column stats > --- > > Key: HIVE-15903 > URL: https://issues.apache.org/jira/browse/HIVE-15903 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15903.01.patch, HIVE-15903.02.patch, > HIVE-15903.03.patch, HIVE-15903.04.patch, HIVE-15903.05.patch, > HIVE-15903.06.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15160) Can't order by an unselected column
[ https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15160: --- Status: Open (was: Patch Available) > Can't order by an unselected column > --- > > Key: HIVE-15160 > URL: https://issues.apache.org/jira/browse/HIVE-15160 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15160.01.patch, HIVE-15160.02.patch, > HIVE-15160.04.patch, HIVE-15160.05.patch, HIVE-15160.06.patch, > HIVE-15160.07.patch, HIVE-15160.08.patch > > > If a grouping key hasn't been selected, Hive complains. For comparison, > Postgres does not. > Example. Notice i_item_id is not selected: > {code} > select i_item_desc >,i_category >,i_class >,i_current_price >,sum(cs_ext_sales_price) as itemrevenue >,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over >(partition by i_class) as revenueratio > from catalog_sales > ,item > ,date_dim > where cs_item_sk = i_item_sk >and i_category in ('Jewelry', 'Sports', 'Books') >and cs_sold_date_sk = d_date_sk > and d_date between cast('2001-01-12' as date) > and (cast('2001-01-12' as date) + 30 days) > group by i_item_id > ,i_item_desc > ,i_category > ,i_class > ,i_current_price > order by i_category > ,i_class > ,i_item_id > ,i_item_desc > ,revenueratio > limit 100; > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15160) Can't order by an unselected column
[ https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15160: --- Status: Patch Available (was: Open) > Can't order by an unselected column > --- > > Key: HIVE-15160 > URL: https://issues.apache.org/jira/browse/HIVE-15160 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15160.01.patch, HIVE-15160.02.patch, > HIVE-15160.04.patch, HIVE-15160.05.patch, HIVE-15160.06.patch, > HIVE-15160.07.patch, HIVE-15160.08.patch > > > If a grouping key hasn't been selected, Hive complains. For comparison, > Postgres does not. > Example. Notice i_item_id is not selected: > {code} > select i_item_desc >,i_category >,i_class >,i_current_price >,sum(cs_ext_sales_price) as itemrevenue >,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over >(partition by i_class) as revenueratio > from catalog_sales > ,item > ,date_dim > where cs_item_sk = i_item_sk >and i_category in ('Jewelry', 'Sports', 'Books') >and cs_sold_date_sk = d_date_sk > and d_date between cast('2001-01-12' as date) > and (cast('2001-01-12' as date) + 30 days) > group by i_item_id > ,i_item_desc > ,i_category > ,i_class > ,i_current_price > order by i_category > ,i_class > ,i_item_id > ,i_item_desc > ,revenueratio > limit 100; > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15160) Can't order by an unselected column
[ https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-15160: --- Attachment: HIVE-15160.08.patch > Can't order by an unselected column > --- > > Key: HIVE-15160 > URL: https://issues.apache.org/jira/browse/HIVE-15160 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-15160.01.patch, HIVE-15160.02.patch, > HIVE-15160.04.patch, HIVE-15160.05.patch, HIVE-15160.06.patch, > HIVE-15160.07.patch, HIVE-15160.08.patch > > > If a grouping key hasn't been selected, Hive complains. For comparison, > Postgres does not. > Example. Notice i_item_id is not selected: > {code} > select i_item_desc >,i_category >,i_class >,i_current_price >,sum(cs_ext_sales_price) as itemrevenue >,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over >(partition by i_class) as revenueratio > from catalog_sales > ,item > ,date_dim > where cs_item_sk = i_item_sk >and i_category in ('Jewelry', 'Sports', 'Books') >and cs_sold_date_sk = d_date_sk > and d_date between cast('2001-01-12' as date) > and (cast('2001-01-12' as date) + 30 days) > group by i_item_id > ,i_item_desc > ,i_category > ,i_class > ,i_current_price > order by i_category > ,i_class > ,i_item_id > ,i_item_desc > ,revenueratio > limit 100; > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900574#comment-15900574 ] Sergey Shelukhin commented on HIVE-16104: - RB https://reviews.apache.org/r/57405/ for review where whitespace changes make the patch too verbose > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900571#comment-15900571 ] Sergey Shelukhin edited comment on HIVE-16104 at 3/8/17 2:03 AM: - Some things that look like refactoring are not actually refactoring. The lock in trySchedule is unnecessary so I removed it and renamed the method; preemption was surrounded by a loop because previously, if the first task in queue was finishable it would bail without preempting anything even if there are more tasks. I can merge updateQueueMetric back into being copy-pasted in 3 places... also one if was refactored because it has lots of repetitive code. Another method was added because something that was previously called in one place is now called in 2 places and I didn't want to copy-paste it. was (Author: sershe): Some things that look like refactoring are not actually refactoring. The lock in trySchedule is unnecessary so I removed it and renamed the method; preemption was surrounded by a loop because previously, if the first task in queue was finishable it would bail without preempting anything even if there are more tasks. I can merge updateQueueMetric back into being copy-pasted in 3 places... also one if was refactored because it has lots of repetitive code. > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900571#comment-15900571 ] Sergey Shelukhin commented on HIVE-16104: - Some things that look like refactoring are not actually refactoring. The lock in trySchedule is unnecessary so I removed it and renamed the method; preemption was surrounded by a loop because previously, if the first task in queue was finishable it would bail without preempting anything even if there are more tasks. I can merge updateQueueMetric back into being copy-pasted in 3 places... also one if was refactored because it has lots of repetitive code. > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900560#comment-15900560 ] Siddharth Seth edited comment on HIVE-16104 at 3/8/17 1:56 AM: --- Looking. Can you please remove the unnecessary parts of the patch - formatting changes, refactored sections, refactored if/else statements. That makes it difficult to review, and is not required. More than half the patch seems like a refactor. was (Author: sseth): Looking. Can you please remove the unnecessary parts of the patch - formatting changes, refactored if/else statements. That makes it difficult to review, and is not required. > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16104) LLAP: preemption may be too aggressive if the pre-empted task doesn't die immediately
[ https://issues.apache.org/jira/browse/HIVE-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900560#comment-15900560 ] Siddharth Seth commented on HIVE-16104: --- Looking. Can you please remove the unnecessary parts of the patch - formatting changes, refactored if/else statements. That makes it difficult to review, and is not required. > LLAP: preemption may be too aggressive if the pre-empted task doesn't die > immediately > - > > Key: HIVE-16104 > URL: https://issues.apache.org/jira/browse/HIVE-16104 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16104.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16123) Let user pick the granularity of bucketing and max in row memory
[ https://issues.apache.org/jira/browse/HIVE-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900543#comment-15900543 ] Hive QA commented on HIVE-16123: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/1285/HIVE-16123.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10330 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=224) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4006/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4006/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4006/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 1285 - PreCommit-HIVE-Build > Let user pick the granularity of bucketing and max in row memory > > > Key: HIVE-16123 > URL: https://issues.apache.org/jira/browse/HIVE-16123 > Project: Hive > Issue Type: Bug > Components: Druid integration >Reporter: slim bouguerra > Attachments: HIVE-16123.2.patch, HIVE-16123.patch > > > Currently we index the data with granularity of none, which puts a lot of > pressure on the indexer. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16098) Describe table doesn't show stats for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-16098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-16098: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. > Describe table doesn't show stats for partitioned tables > > > Key: HIVE-16098 > URL: https://issues.apache.org/jira/browse/HIVE-16098 > Project: Hive > Issue Type: Improvement > Components: Diagnosability >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 2.2.0 > > Attachments: HIVE-16098.1.patch, HIVE-16098.2.patch, > HIVE-16098.3.patch, HIVE-16098.4.patch, HIVE-16098.5.patch, HIVE-16098.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16064) Allow ALL set quantifier with aggregate functions
[ https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-16064: -- Labels: TODOC2.2 (was: ) > Allow ALL set quantifier with aggregate functions > - > > Key: HIVE-16064 > URL: https://issues.apache.org/jira/browse/HIVE-16064 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-16064.1.patch, HIVE-16064.2.patch > > > SQL:2011 allows ALL with aggregate functions which is > equivalent to aggregate function without ALL. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16102) Grouping sets do not conform to SQL standard
[ https://issues.apache.org/jira/browse/HIVE-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900520#comment-15900520 ] Lefty Leverenz commented on HIVE-16102: --- Does the wiki need to be updated? If so, please add a TODOC2.2 label. * [GroupBy -- Grouping Sets, Cubes, Rollups, and the GROUPING__ID Function | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy#LanguageManualGroupBy-GroupingSets,Cubes,Rollups,andtheGROUPING__IDFunction] * [Enhanced Aggregation, Cube, Grouping and Rollup | https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup] > Grouping sets do not conform to SQL standard > > > Key: HIVE-16102 > URL: https://issues.apache.org/jira/browse/HIVE-16102 > Project: Hive > Issue Type: Bug > Components: Operators, Parser >Affects Versions: 1.3.0, 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-16102.01.patch, HIVE-16102.02.patch, > HIVE-16102.patch > > > [~ashutoshc] realized that the implementation of GROUPING__ID in Hive was not > returning values as specified by SQL standard and other execution engines. > After digging into this, I found out that the implementation was bogus, as > internally it was changing between big-endian/little-endian representation of > GROUPING__ID indistinctly, and in some cases conversions in both directions > were cancelling each other. > In the documentation in > https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup > we can already find the problem, even if we did not spot it at first. > {quote} > The following query: SELECT key, value, GROUPING__ID, count(\*) from T1 GROUP > BY key, value WITH ROLLUP > will have the following results. > | NULL | NULL | 0 | 6 | > | 1 | NULL | 1 | 2 | > | 1 | NULL | 3 | 1 | > | 1 | 1 | 3 | 1 | > ... 
> {quote} > Observe that value for GROUPING__ID in first row should be `3`, while for > third and fourth rows, it should be `0`. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
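The standard-conforming GROUPING__ID values quoted above can be reproduced with a small bit-vector sketch: one bit per GROUP BY column (leftmost column is the most significant bit), set to 1 when that column is aggregated away (i.e. NULL in the output row). This illustrates the semantics the fix targets, not Hive's internal implementation:

```java
// Computes a SQL-standard style GROUPING__ID: bit i is 1 iff grouping
// column i is aggregated (i.e. not part of the current grouping set).
public class GroupingIdSketch {
    public static int groupingId(boolean[] aggregated) {
        int id = 0;
        for (boolean agg : aggregated) {
            id = (id << 1) | (agg ? 1 : 0);
        }
        return id;
    }

    public static void main(String[] args) {
        // GROUP BY key, value WITH ROLLUP:
        // grand total (both NULL)    -> 0b11 = 3
        // subtotal on key (value NULL) -> 0b01 = 1
        // detail row (no NULLs)      -> 0b00 = 0
        if (groupingId(new boolean[]{true, true}) != 3) throw new AssertionError();
        if (groupingId(new boolean[]{false, true}) != 1) throw new AssertionError();
        if (groupingId(new boolean[]{false, false}) != 0) throw new AssertionError();
    }
}
```

Under these semantics, the all-NULL grand-total row in the quoted table gets GROUPING__ID 3 and the fully grouped detail rows get 0, which is exactly the correction described in the comment.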
[jira] [Commented] (HIVE-15929) Fix HiveDecimalWritable to be compatible with Hive 2.1
[ https://issues.apache.org/jira/browse/HIVE-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900500#comment-15900500 ] Lefty Leverenz commented on HIVE-15929: --- Status nudge: This was committed to master on Feb. 16 with commits 74c50452c5c644a3898bce2738ee040e625caa01, a9c429e637cf366b90a87cc5c1f3c2b4e60ae0c8, and e732aa27efec014302af41fb77c0b1c5197c4b90. [~owen.omalley], please update the status and fix version. > Fix HiveDecimalWritable to be compatible with Hive 2.1 > -- > > Key: HIVE-15929 > URL: https://issues.apache.org/jira/browse/HIVE-15929 > Project: Hive > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-15929.patch > > > HIVE-15335 broke compatibility with Hive 2.1 by making > HiveDecimalWritable.getInternalStorate() throw an exception when called on an > unset value. It is easy to instead return an empty array, which will allow > the old code to allocate a new array. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-12274) Increase width of columns used for general configuration in the metastore.
[ https://issues.apache.org/jira/browse/HIVE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900499#comment-15900499 ] TAK LON WU commented on HIVE-12274: --- +1 and any update on this? > Increase width of columns used for general configuration in the metastore. > -- > > Key: HIVE-12274 > URL: https://issues.apache.org/jira/browse/HIVE-12274 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.0.0 >Reporter: Elliot West >Assignee: Naveen Gangam > Labels: metastore > Attachments: HIVE-12274.2.patch, HIVE-12274.2.patch, > HIVE-12274.example.ddl.hql, HIVE-12274.patch, HIVE-12274.patch, > HIVE-12274.patch > > > h2. Overview > This issue is very similar in principle to HIVE-1364. We are hitting a limit > when processing JSON data that has a large nested schema. The struct > definition is truncated when inserted into the metastore database column > {{COLUMNS_V2.TYPE_NAME}} as it is greater than 4000 characters in length. > Given that the purpose of these columns is to hold very loosely defined > configuration values it seems rather limiting to impose such a relatively low > length bound. One can imagine that valid use cases will arise where > reasonable parameter/property values exceed the current limit. > h2. Context > These limitations were put in place by the [patch > attributed|https://github.com/apache/hive/commit/c21a526b0a752df2a51d20a2729cc8493c228799] > to HIVE-1364, which mentions the _"max length on Oracle 9i/10g/11g"_ as the > reason. However, nowadays the limit can be increased because: > * Oracle DB's {{varchar2}} supports 32767 bytes now, by setting the > configuration parameter {{MAX_STRING_SIZE}} to {{EXTENDED}}. > ([source|http://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF55623]) > * Postgres supports a max of 1GB for the {{character}} datatype. > ([source|http://www.postgresql.org/docs/8.3/static/datatype-character.html]) > * MySQL can support up to 65535 bytes for the entire row. 
So long as the > {{PARAM_KEY}} value + {{PARAM_VALUE}} is less than 65535, we should be good. > ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * SQL Server's {{varchar}} max length is 8000 and can go beyond using > "varchar(max)" with the same limitation as MySQL being 65535 bytes for the > entire row. ([source|http://dev.mysql.com/doc/refman/5.0/en/char.html]) > * Derby's {{varchar}} can be up to 32672 bytes. > ([source|https://db.apache.org/derby/docs/10.7/ref/rrefsqlj41207.html]) > h2. Proposal > Can these columns not use CLOB-like types, as used for example by > {{TBLS.VIEW_EXPANDED_TEXT}}? It would seem that suitable type equivalents > exist for all targeted database platforms: > * MySQL: {{mediumtext}} > * Postgres: {{text}} > * Oracle: {{CLOB}} > * Derby: {{LONG VARCHAR}} > I'd suggest that the candidates for type change are: > * {{COLUMNS_V2.TYPE_NAME}} > * {{TABLE_PARAMS.PARAM_VALUE}} > * {{SERDE_PARAMS.PARAM_VALUE}} > * {{SD_PARAMS.PARAM_VALUE}} > After updating the maximum length the metastore database needs to be > configured and restarted with the new settings. Altering {{MAX_STRING_SIZE}} > will update database objects and possibly invalidate them, as follows: > * Tables with virtual columns will be updated with new data type metadata for > virtual columns of {{VARCHAR2(4000)}}, 4000-byte {{NVARCHAR2}}, or > {{RAW(2000)}} type. > * Functional indexes will become unusable if a change to their associated > virtual columns causes the index key to exceed index key length limits. > Attempts to rebuild such indexes will fail with {{ORA-01450: maximum key > length exceeded}}. > * Views will be invalidated if they contain {{VARCHAR2(4000)}}, 4000-byte > {{NVARCHAR2}}, or {{RAW(2000)}} typed expression columns. 
> * Materialized views will be updated with new metadata {{VARCHAR2(4000)}}, > 4000-byte {{NVARCHAR2}}, and {{RAW(2000)}} typed expression columns > * So the limitation could be raised to 32672 bytes, with the caveat that > MySQL and SQL Server limit the row length to 65535 bytes, so that should also > be validated to provide consistency. > Finally, will this limitation persist in the work resulting from HIVE-9452? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16124) Drop the segments data as soon it is pushed to HDFS
[ https://issues.apache.org/jira/browse/HIVE-16124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900468#comment-15900468 ] Hive QA commented on HIVE-16124: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12856653/16124.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10330 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4005/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4005/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4005/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12856653 - PreCommit-HIVE-Build > Drop the segments data as soon it is pushed to HDFS > --- > > Key: HIVE-16124 > URL: https://issues.apache.org/jira/browse/HIVE-16124 > Project: Hive > Issue Type: Bug > Components: Druid integration >Reporter: slim bouguerra >Assignee: slim bouguerra > Attachments: 16124.patch > > > Drop the pushed segments from the indexer as soon as the HDFS push is done. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16143) Improve msck repair batching
[ https://issues.apache.org/jira/browse/HIVE-16143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar reassigned HIVE-16143: -- > Improve msck repair batching > > > Key: HIVE-16143 > URL: https://issues.apache.org/jira/browse/HIVE-16143 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > > Currently, the {{msck repair table}} command batches the number of partitions > created in the metastore using the config {{HIVE_MSCK_REPAIR_BATCH_SIZE}}. > The following snippet shows the batching logic. There can be a couple of > improvements to this batching logic: > {noformat} > int batch_size = conf.getIntVar(ConfVars.HIVE_MSCK_REPAIR_BATCH_SIZE); > if (batch_size > 0 && partsNotInMs.size() > batch_size) { > int counter = 0; > for (CheckResult.PartitionResult part : partsNotInMs) { > counter++; > > apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null); > repairOutput.add("Repair: Added partition to metastore " + > msckDesc.getTableName() > + ':' + part.getPartitionName()); > if (counter % batch_size == 0 || counter == > partsNotInMs.size()) { > db.createPartitions(apd); > apd = new AddPartitionDesc(table.getDbName(), > table.getTableName(), false); > } > } > } else { > for (CheckResult.PartitionResult part : partsNotInMs) { > > apd.addPartition(Warehouse.makeSpecFromName(part.getPartitionName()), null); > repairOutput.add("Repair: Added partition to metastore " + > msckDesc.getTableName() > + ':' + part.getPartitionName()); > } > db.createPartitions(apd); > } > } catch (Exception e) { > LOG.info("Could not bulk-add partitions to metastore; trying one by > one", e); > repairOutput.clear(); > msckAddPartitionsOneByOne(db, table, partsNotInMs, repairOutput); > } > {noformat} > 1. If the batch size is too aggressive the code falls back to adding > partitions one by one, which is almost always very slow. 
It is easily possible > that users increase the batch size to a higher value to make the command run > faster but end up with worse performance because the code falls back to adding > partitions one by one. Users are then expected to determine a tuned batch > size that works well for their environment. I think the code could handle > this situation better by exponentially decaying the batch size instead of > falling back to one by one. > 2. The other issue with this implementation is that if, say, the first batch > succeeds and the second one fails, the code tries to add all the partitions > one by one, irrespective of whether some of them were already successfully added or > not. If we need to fall back to one by one, we should at least skip the ones > which we know for sure were already added successfully. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
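The exponential-decay idea in point 1 above can be sketched as follows. This is a minimal illustration under stated assumptions, not Hive's actual implementation: `createPartitions` here is a stand-in for the metastore call (`db.createPartitions(apd)` in the snippet), and the server-side batch limit is invented for the demo. On failure, the batch size is halved and the same span is retried; the cursor only advances past batches that succeeded, so already-added partitions are never re-added, which also addresses point 2.

```java
import java.util.ArrayList;
import java.util.List;

public class DecayingBatcher {
    // Pretend metastore call: fails if the batch exceeds some server limit.
    // Stand-in for db.createPartitions(apd); the limit of 30 is invented.
    static int serverLimit = 30;
    static List<String> added = new ArrayList<>();

    static void createPartitions(List<String> batch) {
        if (batch.size() > serverLimit) {
            throw new RuntimeException("batch too large: " + batch.size());
        }
        added.addAll(batch);
    }

    // Add all partitions, halving the batch size on each failure instead of
    // immediately falling back to one-by-one inserts.
    static void addWithDecay(List<String> parts, int batchSize) {
        int i = 0;
        while (i < parts.size()) {
            int end = Math.min(i + batchSize, parts.size());
            List<String> batch = parts.subList(i, end);
            try {
                createPartitions(batch);
                i = end;                                 // batch succeeded: move past it
            } catch (RuntimeException e) {
                if (batchSize == 1) {
                    throw e;                             // even single adds fail: give up
                }
                batchSize = Math.max(1, batchSize / 2);  // decay and retry the same span
            }
        }
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        for (int p = 0; p < 100; p++) parts.add("part=" + p);
        addWithDecay(parts, 100);   // starts too large; decays 100 -> 50 -> 25
        System.out.println(added.size());
    }
}
```

Because successful batches are never retried, the decay converges on a workable batch size without repeating work, and one-by-one mode is reached only as the final decay step.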
[jira] [Commented] (HIVE-16110) Vectorization: Support 2 Value CASE WHEN instead of fall back to VectorUDFAdaptor
[ https://issues.apache.org/jira/browse/HIVE-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900422#comment-15900422 ] Sergey Shelukhin commented on HIVE-16110: - +1 however the test has failed (case) > Vectorization: Support 2 Value CASE WHEN instead of fall back to > VectorUDFAdaptor > - > > Key: HIVE-16110 > URL: https://issues.apache.org/jira/browse/HIVE-16110 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16110.01.patch, HIVE-16110.02.patch, > HIVE-16110.03.patch > > > Vectorize more queries by converting a GenericUDFWhen that has 2 values that > are either a column or a constant into a GenericUDFIf, which has vectorized > classes. This eliminates one case so to speak where we use the > VectorUDFAdaptor. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
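For context on why the conversion above is safe: a CASE WHEN with exactly two values is semantically the same as IF(cond, a, b), including under SQL's three-valued logic where a NULL condition falls through to the ELSE branch. A toy Java model of the two forms (hypothetical method names, not Hive's GenericUDFWhen/GenericUDFIf internals), checking that they agree for every condition value:

```java
public class TwoValueCaseWhen {
    // CASE WHEN cond THEN a ELSE b END, modeled on SQL three-valued logic:
    // a NULL (unknown) condition falls through to the ELSE branch.
    static Integer caseWhen(Boolean cond, Integer thenVal, Integer elseVal) {
        if (Boolean.TRUE.equals(cond)) {
            return thenVal;
        }
        return elseVal;
    }

    // IF(cond, a, b) under the same convention.
    static Integer ifExpr(Boolean cond, Integer thenVal, Integer elseVal) {
        return Boolean.TRUE.equals(cond) ? thenVal : elseVal;
    }

    public static void main(String[] args) {
        Boolean[] conds = {Boolean.TRUE, Boolean.FALSE, null};
        for (Boolean c : conds) {
            // The rewrite is only valid if both forms agree on every input,
            // including the NULL condition case.
            if (!caseWhen(c, 1, 2).equals(ifExpr(c, 1, 2))) {
                throw new AssertionError("forms disagree for cond=" + c);
            }
        }
        System.out.println("equivalent for all condition values");
    }
}
```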
[jira] [Commented] (HIVE-16078) improve abort checking in Tez/LLAP
[ https://issues.apache.org/jira/browse/HIVE-16078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900401#comment-15900401 ] Sergey Shelukhin commented on HIVE-16078: - It completes for me with PPD disabled (PPD causes massive slowdown due to ORC-148). However, even with PPD enabled I cannot repro the original condition > improve abort checking in Tez/LLAP > -- > > Key: HIVE-16078 > URL: https://issues.apache.org/jira/browse/HIVE-16078 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16078.01.patch, HIVE-16078.02.patch, > HIVE-16078.03.patch, HIVE-16078.patch > > > Sometimes, a fragment can run for a long time after a query fails. It looks > from logs like the abort/interrupt were called correctly on the thread, yet > the thread hangs around minutes after, doing the below. Other tasks for the > same job appear to have exited correctly, after the same abort logic (at > least, the same log lines, fwiw) > {noformat} > at > org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByValue(VectorCopyRow.java:317) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:263) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277) > at > 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628) > at > 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:277) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:189) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:389) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:628) > at >
[jira] [Commented] (HIVE-16122) NPE Hive Druid split introduced by HIVE-15928
[ https://issues.apache.org/jira/browse/HIVE-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900390#comment-15900390 ] Hive QA commented on HIVE-16122: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12856652/HIVE-16112.5.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10331 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4004/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4004/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4004/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12856652 - PreCommit-HIVE-Build > NPE Hive Druid split introduced by HIVE-15928 > - > > Key: HIVE-16122 > URL: https://issues.apache.org/jira/browse/HIVE-16122 > Project: Hive > Issue Type: Bug > Components: Druid integration >Reporter: slim bouguerra > Attachments: HIVE-16112.2.patch, HIVE-16112.3.patch, > HIVE-16112.4.patch, HIVE-16112.5.patch, HIVE-16122.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16098) Describe table doesn't show stats for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-16098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900391#comment-15900391 ] Pengcheng Xiong commented on HIVE-16098: patch LGTM +1. I think we can open new jiras for the follow-up work. > Describe table doesn't show stats for partitioned tables > > > Key: HIVE-16098 > URL: https://issues.apache.org/jira/browse/HIVE-16098 > Project: Hive > Issue Type: Improvement > Components: Diagnosability >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-16098.1.patch, HIVE-16098.2.patch, > HIVE-16098.3.patch, HIVE-16098.4.patch, HIVE-16098.5.patch, HIVE-16098.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-11266) count(*) wrong result based on table statistics
[ https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900364#comment-15900364 ] Pengcheng Xiong edited comment on HIVE-11266 at 3/7/17 11:13 PM: - Hello there, which version of hive are you using? I saw you put a label 1.1.0 as affected versions. Does that mean that you are using Hive 1.1? Thanks. was (Author: pxiong): Hello there, which version of hive are you using? I saw you put a lable 1.1.0 as affected versions. Does that mean that you are using Hive 1.1? Thanks. > count(*) wrong result based on table statistics > --- > > Key: HIVE-11266 > URL: https://issues.apache.org/jira/browse/HIVE-11266 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Simone Battaglia >Assignee: Pengcheng Xiong >Priority: Critical > > Hive returns a wrong count result on an external table with table statistics if > I change the table data files. > This is the scenario in detail: > 1) create external table my_table (...) location 'my_location'; > 2) analyze table my_table compute statistics; > 3) change/add/delete one or more files in the 'my_location' directory; > 4) select count(\*) from my_table; > In this case the count query doesn't generate an MR job and returns the result > based on table statistics. This result is wrong because it is based on > statistics stored in the Hive metastore and doesn't take into account > modifications made to the data files. > Obviously, setting "hive.compute.query.using.stats" to FALSE avoids this problem, > but the default value of this property is TRUE. > I think that this post on Stack Overflow, which shows another type of bug > in the case of multiple inserts, is related to the one that I reported: > http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16131) Hive building with Hadoop 3 - additional stuff broken recently
[ https://issues.apache.org/jira/browse/HIVE-16131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900365#comment-15900365 ] Wei Zheng commented on HIVE-16131: -- OK I see. +1 pending test > Hive building with Hadoop 3 - additional stuff broken recently > -- > > Key: HIVE-16131 > URL: https://issues.apache.org/jira/browse/HIVE-16131 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16131.01.patch, HIVE-16131.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11266) count(*) wrong result based on table statistics
[ https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900364#comment-15900364 ] Pengcheng Xiong commented on HIVE-11266: Hello there, which version of hive are you using? I saw you put a label 1.1.0 as affected versions. Does that mean that you are using Hive 1.1? Thanks. > count(*) wrong result based on table statistics > --- > > Key: HIVE-11266 > URL: https://issues.apache.org/jira/browse/HIVE-11266 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Simone Battaglia >Assignee: Pengcheng Xiong >Priority: Critical > > Hive returns a wrong count result on an external table with table statistics if > I change the table data files. > This is the scenario in detail: > 1) create external table my_table (...) location 'my_location'; > 2) analyze table my_table compute statistics; > 3) change/add/delete one or more files in the 'my_location' directory; > 4) select count(\*) from my_table; > In this case the count query doesn't generate an MR job and returns the result > based on table statistics. This result is wrong because it is based on > statistics stored in the Hive metastore and doesn't take into account > modifications made to the data files. > Obviously, setting "hive.compute.query.using.stats" to FALSE avoids this problem, > but the default value of this property is TRUE. > I think that this post on Stack Overflow, which shows another type of bug > in the case of multiple inserts, is related to the one that I reported: > http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-11266) count(*) wrong result based on table statistics
[ https://issues.apache.org/jira/browse/HIVE-11266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-11266: -- Assignee: Pengcheng Xiong > count(*) wrong result based on table statistics > --- > > Key: HIVE-11266 > URL: https://issues.apache.org/jira/browse/HIVE-11266 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Simone Battaglia >Assignee: Pengcheng Xiong >Priority: Critical > > Hive returns a wrong count result on an external table with table statistics if > I change the table data files. > This is the scenario in detail: > 1) create external table my_table (...) location 'my_location'; > 2) analyze table my_table compute statistics; > 3) change/add/delete one or more files in the 'my_location' directory; > 4) select count(\*) from my_table; > In this case the count query doesn't generate an MR job and returns the result > based on table statistics. This result is wrong because it is based on > statistics stored in the Hive metastore and doesn't take into account > modifications made to the data files. > Obviously, setting "hive.compute.query.using.stats" to FALSE avoids this problem, > but the default value of this property is TRUE. > I think that this post on Stack Overflow, which shows another type of bug > in the case of multiple inserts, is related to the one that I reported: > http://stackoverflow.com/questions/24080276/wrong-result-for-count-in-hive-table -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16131) Hive building with Hadoop 3 - additional stuff broken recently
[ https://issues.apache.org/jira/browse/HIVE-16131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16131: Attachment: HIVE-16131.01.patch [~wzheng] that doesn't compile with hadoop2. Fixed for both for now. > Hive building with Hadoop 3 - additional stuff broken recently > -- > > Key: HIVE-16131 > URL: https://issues.apache.org/jira/browse/HIVE-16131 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16131.01.patch, HIVE-16131.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15282) Different modification times are used when an index is built and when its staleness is checked
[ https://issues.apache.org/jira/browse/HIVE-15282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-15282: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks, Marta! > Different modification times are used when an index is built and when its > staleness is checked > -- > > Key: HIVE-15282 > URL: https://issues.apache.org/jira/browse/HIVE-15282 > Project: Hive > Issue Type: Bug > Components: Indexing >Affects Versions: 2.2.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora > Fix For: 2.2.0 > > Attachments: HIVE-15282.2.patch, HIVE-15282.patch > > > The index_auto_mult_tables and index_auto_mult_tables_compact q tests are > failing from time to time with the following error: > {noformat} > org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables > Failing for the past 1 build (Since Failed#16 ) > Took 16 sec. > Error Message > Unexpected exception junit.framework.AssertionFailedError: Client Execution > results failed with error code = 1 > See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, > or check ./ql/target/surefire-reports or > ./itests/qtest/target/surefire-reports/ for specific test cases logs. 
> at junit.framework.Assert.fail(Assert.java:57) > at org.apache.hadoop.hive.ql.QTestUtil.failedDiff(QTestUtil.java:2001) > at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:194) > at > org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables(TestCliDriver.java:142) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) > at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) > See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, > or check ./ql/target/surefire-reports or > ./itests/qtest/target/surefire-reports/ for specific test cases logs. > Stacktrace > junit.framework.AssertionFailedError: Unexpected exception > junit.framework.AssertionFailedError: Client Execution results failed with > error code = 1 > See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, > or check ./ql/target/surefire-reports or > ./itests/qtest/target/surefire-reports/ for specific test cases logs. > at junit.framework.Assert.fail(Assert.java:57) > at org.apache.hadoop.hive.ql.QTestUtil.failedDiff(QTestUtil.java:2001) > at > org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:194) > at > org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables(TestCliDriver.java:142) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
[jira] [Commented] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc
[ https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900323#comment-15900323 ] Matt McCline commented on HIVE-15857: - [~sershe] Thank you for your review!! > Vectorization: Add string conversion case for UDFToInteger, etc > --- > > Key: HIVE-15857 > URL: https://issues.apache.org/jira/browse/HIVE-15857 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, > HIVE-15857.03.patch, HIVE-15857.04.patch, HIVE-15857.05.patch > > > Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, > etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc
[ https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-15857: Status: Patch Available (was: In Progress) > Vectorization: Add string conversion case for UDFToInteger, etc > --- > > Key: HIVE-15857 > URL: https://issues.apache.org/jira/browse/HIVE-15857 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, > HIVE-15857.03.patch, HIVE-15857.04.patch, HIVE-15857.05.patch > > > Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, > etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc
[ https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-15857: Attachment: HIVE-15857.05.patch > Vectorization: Add string conversion case for UDFToInteger, etc > --- > > Key: HIVE-15857 > URL: https://issues.apache.org/jira/browse/HIVE-15857 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, > HIVE-15857.03.patch, HIVE-15857.04.patch, HIVE-15857.05.patch > > > Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, > etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc
[ https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-15857: Status: In Progress (was: Patch Available) > Vectorization: Add string conversion case for UDFToInteger, etc > --- > > Key: HIVE-15857 > URL: https://issues.apache.org/jira/browse/HIVE-15857 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, > HIVE-15857.03.patch, HIVE-15857.04.patch > > > Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, > etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16122) NPE Hive Druid split introduced by HIVE-15928
[ https://issues.apache.org/jira/browse/HIVE-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900312#comment-15900312 ] Hive QA commented on HIVE-16122: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12856652/HIVE-16112.5.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10330 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=224) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119) org.apache.hive.service.server.TestHS2HttpServer.testContextRootUrlRewrite (batchId=186) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4003/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4003/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4003/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12856652 - PreCommit-HIVE-Build > NPE Hive Druid split introduced by HIVE-15928 > - > > Key: HIVE-16122 > URL: https://issues.apache.org/jira/browse/HIVE-16122 > Project: Hive > Issue Type: Bug > Components: Druid integration >Reporter: slim bouguerra > Attachments: HIVE-16112.2.patch, HIVE-16112.3.patch, > HIVE-16112.4.patch, HIVE-16112.5.patch, HIVE-16122.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16142) ATSHook NPE via LLAP
[ https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900313#comment-15900313 ] Ashutosh Chauhan commented on HIVE-16142: - +1 > ATSHook NPE via LLAP > > > Key: HIVE-16142 > URL: https://issues.apache.org/jira/browse/HIVE-16142 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16142.01.patch > > > Exceptions in the log of the form: > 2017-03-06T15:42:30,046 WARN [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(318)) - Failed to submit to ATS for > hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) > ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-16141) SQL auth whitelist configs should have a config for appending to existing list
[ https://issues.apache.org/jira/browse/HIVE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran resolved HIVE-16141. -- Resolution: Invalid Closing it as invalid. > SQL auth whitelist configs should have a config for appending to existing list > -- > > Key: HIVE-16141 > URL: https://issues.apache.org/jira/browse/HIVE-16141 > Project: Hive > Issue Type: Bug > Components: SQLStandardAuthorization >Affects Versions: 1.3.0, 2.2.0, 2.1.1 >Reporter: Prasanth Jayachandran > > SQL auth whitelist set configs can be added to > hive.security.authorization.sqlstd.confwhitelist, but this will replace the > default whitelist patterns. If users want to keep the default whitelist configs and > append to them, this can get complicated. We can have a > separate config that lets users append to the default whitelist. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16141) SQL auth whitelist configs should have a config for appending to existing list
[ https://issues.apache.org/jira/browse/HIVE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900288#comment-15900288 ] Prasanth Jayachandran commented on HIVE-16141: -- This is probably invalid. I can see the append being applied in SettableConfigUpdater. > SQL auth whitelist configs should have a config for appending to existing list > -- > > Key: HIVE-16141 > URL: https://issues.apache.org/jira/browse/HIVE-16141 > Project: Hive > Issue Type: Bug > Components: SQLStandardAuthorization >Affects Versions: 1.3.0, 2.2.0, 2.1.1 >Reporter: Prasanth Jayachandran > > SQL auth whitelist set configs can be added to > hive.security.authorization.sqlstd.confwhitelist, but this will replace the > default whitelist patterns. If users want to keep the default whitelist configs and > append to them, this can get complicated. We can have a > separate config that lets users append to the default whitelist. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16142) ATSHook NPE via LLAP
[ https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900281#comment-15900281 ] Pengcheng Xiong commented on HIVE-16142: [~ashutoshc], could u take a look? Thanks. > ATSHook NPE via LLAP > > > Key: HIVE-16142 > URL: https://issues.apache.org/jira/browse/HIVE-16142 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16142.01.patch > > > Exceptions in the log of the form: > 2017-03-06T15:42:30,046 WARN [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(318)) - Failed to submit to ATS for > hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) > ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16142) ATSHook NPE via LLAP
[ https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16142: --- Attachment: HIVE-16142.01.patch > ATSHook NPE via LLAP > > > Key: HIVE-16142 > URL: https://issues.apache.org/jira/browse/HIVE-16142 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16142.01.patch > > > Exceptions in the log of the form: > 2017-03-06T15:42:30,046 WARN [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(318)) - Failed to submit to ATS for > hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) > ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16142) ATSHook NPE via LLAP
[ https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16142: --- Status: Patch Available (was: Open) > ATSHook NPE via LLAP > > > Key: HIVE-16142 > URL: https://issues.apache.org/jira/browse/HIVE-16142 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16142.01.patch > > > Exceptions in the log of the form: > 2017-03-06T15:42:30,046 WARN [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(318)) - Failed to submit to ATS for > hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) > ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16142) ATSHook NPE via LLAP
[ https://issues.apache.org/jira/browse/HIVE-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-16142: -- > ATSHook NPE via LLAP > > > Key: HIVE-16142 > URL: https://issues.apache.org/jira/browse/HIVE-16142 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > > Exceptions in the log of the form: > 2017-03-06T15:42:30,046 WARN [ATS Logger 0]: hooks.ATSHook > (ATSHook.java:run(318)) - Failed to submit to ATS for > hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) > ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16141) SQL auth whitelist configs should have a config for appending to existing list
[ https://issues.apache.org/jira/browse/HIVE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900279#comment-15900279 ] Prasanth Jayachandran commented on HIVE-16141: -- Just noticed: there is actually a config for appending, hive.security.authorization.sqlstd.confwhitelist.append, but it is not used in HiveConf by default. > SQL auth whitelist configs should have a config for appending to existing list > -- > > Key: HIVE-16141 > URL: https://issues.apache.org/jira/browse/HIVE-16141 > Project: Hive > Issue Type: Bug > Components: SQLStandardAuthorization >Affects Versions: 1.3.0, 2.2.0, 2.1.1 >Reporter: Prasanth Jayachandran > > SQL auth whitelist configs can be set via > hive.security.authorization.sqlstd.confwhitelist, but this replaces the > default whitelist patterns. If users want to keep the default whitelist configs and > append to them, this can get complicated. We can have a > separate config that lets users append to the default whitelist. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
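A hypothetical hive-site.xml fragment sketching how the append property mentioned in the comment would be used (the regex values here are illustrative, not real parameters; per the comment above, this property is not wired into HiveConf by default):

```xml
<!-- Appends extra patterns to the default SQL standard auth whitelist
     instead of replacing it, as hive.security.authorization.sqlstd.confwhitelist does.
     The pattern values below are hypothetical examples. -->
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>myapp\.config\..*|another\.safe\.param</value>
</property>
```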
[jira] [Commented] (HIVE-16141) SQL auth whitelist configs should have a config for appending to existing list
[ https://issues.apache.org/jira/browse/HIVE-16141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900274#comment-15900274 ] Prasanth Jayachandran commented on HIVE-16141: -- cc [~thejas] [~sushanth] > SQL auth whitelist configs should have a config for appending to existing list > -- > > Key: HIVE-16141 > URL: https://issues.apache.org/jira/browse/HIVE-16141 > Project: Hive > Issue Type: Bug > Components: SQLStandardAuthorization >Affects Versions: 1.3.0, 2.2.0, 2.1.1 >Reporter: Prasanth Jayachandran > > SQL auth whitelist configs can be set via > hive.security.authorization.sqlstd.confwhitelist, but this replaces the > default whitelist patterns. If users want to keep the default whitelist configs and > append to them, this can get complicated. We can have a > separate config that lets users append to the default whitelist. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16064) Allow ALL set quantifier with aggregate functions
[ https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-16064: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks, Vineet! > Allow ALL set quantifier with aggregate functions > - > > Key: HIVE-16064 > URL: https://issues.apache.org/jira/browse/HIVE-16064 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Fix For: 2.2.0 > > Attachments: HIVE-16064.1.patch, HIVE-16064.2.patch > > > SQL:2011 allows ALL with aggregate functions which is > equivalent to aggregate function without ALL. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16140) Stabilize few randomly failing tests
[ https://issues.apache.org/jira/browse/HIVE-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan reassigned HIVE-16140: --- > Stabilize few randomly failing tests > > > Key: HIVE-16140 > URL: https://issues.apache.org/jira/browse/HIVE-16140 > Project: Hive > Issue Type: Test > Components: Testing Infrastructure, Tests >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > > Golden file update for vector_between_in test and sort_before_diff for couple > of Perf tests. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16090) Addendum to HIVE-16014
[ https://issues.apache.org/jira/browse/HIVE-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-16090: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Thanks [~vihangk1]. I committed this to master. > Addendum to HIVE-16014 > -- > > Key: HIVE-16090 > URL: https://issues.apache.org/jira/browse/HIVE-16090 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16090.01.patch, HIVE-16090.02.patch, > HIVE-16090.03.patch > > > HIVE-16014 changed the HiveMetastoreChecker to use > {{METASTORE_FS_HANDLER_THREADS_COUNT}} for pool size. Some of the tests in > TestHiveMetastoreChecker still use {{HIVE_MOVE_FILES_THREAD_COUNT}} which > leads to incorrect test behavior. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16064) Allow ALL set quantifier with aggregate functions
[ https://issues.apache.org/jira/browse/HIVE-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900211#comment-15900211 ] Ashutosh Chauhan commented on HIVE-16064: - +1 > Allow ALL set quantifier with aggregate functions > - > > Key: HIVE-16064 > URL: https://issues.apache.org/jira/browse/HIVE-16064 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16064.1.patch, HIVE-16064.2.patch > > > SQL:2011 allows ALL with aggregate functions which is > equivalent to aggregate function without ALL. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16122) NPE Hive Druid split introduced by HIVE-15928
[ https://issues.apache.org/jira/browse/HIVE-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900206#comment-15900206 ] Hive QA commented on HIVE-16122: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12856652/HIVE-16112.5.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10330 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=224) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=224) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119) org.apache.hive.hcatalog.pig.TestRCFileHCatStorer.testWriteDecimalXY (batchId=173) org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteSmallint (batchId=173) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4002/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4002/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4002/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12856652 - PreCommit-HIVE-Build > NPE Hive Druid split introduced by HIVE-15928 > - > > Key: HIVE-16122 > URL: https://issues.apache.org/jira/browse/HIVE-16122 > Project: Hive > Issue Type: Bug > Components: Druid integration >Reporter: slim bouguerra > Attachments: HIVE-16112.2.patch, HIVE-16112.3.patch, > HIVE-16112.4.patch, HIVE-16112.5.patch, HIVE-16122.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16054) AMReporter should use application token instead of ugi.getCurrentUser
[ https://issues.apache.org/jira/browse/HIVE-16054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16054: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master > AMReporter should use application token instead of ugi.getCurrentUser > - > > Key: HIVE-16054 > URL: https://issues.apache.org/jira/browse/HIVE-16054 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Rajesh Balamohan >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-16054.1.patch, HIVE-16054.1.patch > > > During the initial creation of the UGI we use the appId, but later we use the > user who submitted the request. Although this doesn't matter as long as the > job tokens are set correctly, it is good to keep it consistent. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16136) LLAP: Before SIGKILL, collect diagnostic information before daemon goes down
[ https://issues.apache.org/jira/browse/HIVE-16136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16136: - Summary: LLAP: Before SIGKILL, collect diagnostic information before daemon goes down (was: LLAP: Before SIGKILL and collect diagnostic information before daemon goes down) > LLAP: Before SIGKILL, collect diagnostic information before daemon goes down > > > Key: HIVE-16136 > URL: https://issues.apache.org/jira/browse/HIVE-16136 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Sometimes daemons can get killed by YARN's pmem monitor, which issues a kill > followed by kill -9 after 250ms. This is really too short a duration to collect > anything useful. > There is no clean way to trap SIGKILL in Java. > One option is to increase the time between kill and kill -9 in YARN, and > during that time we can have a shutdown hook handler to collect all > diagnostic information like heapdump, jstack, jmx output etc. in a > non-container directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc
[ https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900148#comment-15900148 ] Sergey Shelukhin commented on HIVE-15857: - +1 w/one small comment. There is some repetitive code that is handled less repetitively elsewhere in vectorization iirc (e.g. selectedInUse where it does int index = sIU ? sel[i] : i, instead of having 2 separate loops). I am assuming this is due to vectorization requirements that they are all handled separately :) > Vectorization: Add string conversion case for UDFToInteger, etc > --- > > Key: HIVE-15857 > URL: https://issues.apache.org/jira/browse/HIVE-15857 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch, > HIVE-15857.03.patch, HIVE-15857.04.patch > > > Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, > etc. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
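A minimal sketch of the single-loop index-translation pattern Sergey describes (`int index = sIU ? sel[i] : i`), as opposed to writing two separate loops for the selected and non-selected cases. The class and method names here are illustrative, not actual Hive vectorization code; when selectedInUse is true, only the positions listed in sel are live rows.

```java
// Hypothetical demo of the selectedInUse index-translation pattern.
public class SelectedInUseDemo {

    // One loop covers both cases: when selectedInUse is true, the loop
    // variable indexes into sel[] to find the live row; otherwise the
    // loop variable is the row index itself.
    static long sumLongs(long[] vector, int[] sel, int size, boolean selectedInUse) {
        long sum = 0;
        for (int i = 0; i < size; i++) {
            int index = selectedInUse ? sel[i] : i;
            sum += vector[index];
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] v = {10, 20, 30, 40};
        // All rows live:
        System.out.println(sumLongs(v, null, 4, false));
        // Only rows 1 and 3 selected:
        System.out.println(sumLongs(v, new int[]{1, 3}, 2, true));
    }
}
```

As the comment notes, Hive's generated vectorized expressions often keep the two loops separate instead, presumably so the non-selected path stays a tight, branch-free loop the JIT can optimize.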
[jira] [Commented] (HIVE-16090) Addendum to HIVE-16014
[ https://issues.apache.org/jira/browse/HIVE-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900124#comment-15900124 ] Sergio Peña commented on HIVE-16090: Looks good. +1 > Addendum to HIVE-16014 > -- > > Key: HIVE-16090 > URL: https://issues.apache.org/jira/browse/HIVE-16090 > Project: Hive > Issue Type: Task > Components: Hive >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16090.01.patch, HIVE-16090.02.patch, > HIVE-16090.03.patch > > > HIVE-16014 changed the HiveMetastoreChecker to use > {{METASTORE_FS_HANDLER_THREADS_COUNT}} for pool size. Some of the tests in > TestHiveMetastoreChecker still use {{HIVE_MOVE_FILES_THREAD_COUNT}} which > leads to incorrect test behavior. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16076) LLAP packaging - include aux libs
[ https://issues.apache.org/jira/browse/HIVE-16076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16076: Attachment: HIVE-16076.02.patch Updated the patch. > LLAP packaging - include aux libs > -- > > Key: HIVE-16076 > URL: https://issues.apache.org/jira/browse/HIVE-16076 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Sergey Shelukhin > Attachments: HIVE-16076.01.patch, HIVE-16076.02.patch, > HIVE-16076.patch > > > The old auxlibs (or whatever) should be packaged by default, if present. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16114) NullPointerException in TezSessionPoolManager when getting the session
[ https://issues.apache.org/jira/browse/HIVE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900128#comment-15900128 ] Sergey Shelukhin commented on HIVE-16114: - testGetNonDefaultSession failure may be related > NullPointerException in TezSessionPoolManager when getting the session > -- > > Key: HIVE-16114 > URL: https://issues.apache.org/jira/browse/HIVE-16114 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Minor > Attachments: HIVE-16114.1.patch, HIVE-16114.patch > > > Hive version: apache-hive-2.1.1 > We use Hue (3.11.0) to connect to HiveServer2. When Hue starts up, it > works with no problems; after a few hours, running the same SQL fails with an > exception about being unable to execute a TezTask. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16076) LLAP packaging - include aux libs
[ https://issues.apache.org/jira/browse/HIVE-16076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900106#comment-15900106 ] Sergey Shelukhin commented on HIVE-16076: - [~prasanth_j] ADDEDJARS is actually set via Utilities.getResourceFiles(conf, SessionState.ResourceType.JAR). Fixing the rest > LLAP packaging - include aux libs > -- > > Key: HIVE-16076 > URL: https://issues.apache.org/jira/browse/HIVE-16076 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Sergey Shelukhin > Attachments: HIVE-16076.01.patch, HIVE-16076.patch > > > The old auxlibs (or whatever) should be packaged by default, if present. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15515) Remove the docs directory
[ https://issues.apache.org/jira/browse/HIVE-15515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900103#comment-15900103 ] Hive QA commented on HIVE-15515: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12856646/HIVE-15515.01.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4001/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4001/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4001/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-03-07 20:31:46.522 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-4001/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-03-07 20:31:46.524 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 9368fec HIVE-15920 Implement a blocking version of a command to compact (Eugene Koifman, reviewed by Wei Zheng) + git clean -f -d Removing druid-handler/src/test/org/apache/hadoop/hive/druid/io/ + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 9368fec HIVE-15920 Implement a blocking version of a command to compact (Eugene Koifman, reviewed by Wei Zheng) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-03-07 20:31:47.603 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch fatal: git diff header lacks filename information when removing 0 leading pathname components (line 523) The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12856646 - PreCommit-HIVE-Build > Remove the docs directory > - > > Key: HIVE-15515 > URL: https://issues.apache.org/jira/browse/HIVE-15515 > Project: Hive > Issue Type: Bug > Components: Documentation >Reporter: Lefty Leverenz >Assignee: Akira Ajisaka > Attachments: HIVE-15515.01.patch > > > Hive xdocs have not been used since 2012. The docs directory only holds six > xml documents, and their contents are in the wiki. > It's past time to remove the docs directory from the Hive code. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16122) NPE Hive Druid split introduced by HIVE-15928
[ https://issues.apache.org/jira/browse/HIVE-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900096#comment-15900096 ] Hive QA commented on HIVE-16122: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12856652/HIVE-16112.5.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10330 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_table] (batchId=147) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4000/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4000/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4000/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12856652 - PreCommit-HIVE-Build > NPE Hive Druid split introduced by HIVE-15928 > - > > Key: HIVE-16122 > URL: https://issues.apache.org/jira/browse/HIVE-16122 > Project: Hive > Issue Type: Bug > Components: Druid integration >Reporter: slim bouguerra > Attachments: HIVE-16112.2.patch, HIVE-16112.3.patch, > HIVE-16112.4.patch, HIVE-16112.5.patch, HIVE-16122.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15001) Remove showConnectedUrl from command line help
[ https://issues.apache.org/jira/browse/HIVE-15001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900054#comment-15900054 ] Naveen Gangam commented on HIVE-15001: -- Makes sense to remove the dead code. So +1 for me. > Remove showConnectedUrl from command line help > -- > > Key: HIVE-15001 > URL: https://issues.apache.org/jira/browse/HIVE-15001 > Project: Hive > Issue Type: Sub-task > Components: Beeline >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Trivial > Attachments: HIVE-15001.2.patch, HIVE-15001.3.patch, HIVE-15001.patch > > > As discussed with [~nemon], the showConnectedUrl commandline parameter has not > worked since an erroneous merge. Instead, beeline always prints the currently > connected url. Since it is good for everyone, no extra parameter is needed to > turn this feature on. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16136) LLAP: Before SIGKILL and collect diagnostic information before daemon goes down
[ https://issues.apache.org/jira/browse/HIVE-16136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900051#comment-15900051 ] Prasanth Jayachandran commented on HIVE-16136: -- bq. The shell scripts are probably where we can trap signals and dump /proc//smaps & /proc//stat ? Bash has a "trap" feature for this. Yeah. We could get these as well. But I think this can only be triggered under the OOM on-error hook or another JVM fatal error (although I was not able to make this work with the OnError hook + stack overflow exception). This won't work for SIGTERM or SIGKILL. bq. This is pretty easy to increase, but is cluster wide config. If it is a cluster-wide config, then a shorter interval will not be enough for a full heap dump. We could add separate shutdown hooks to collect jstack, jmx, /proc/* etc. and let HeapDumpOnOOM handle the heap dump. We can probably have a web endpoint for a manual heapdump if it's useful. > LLAP: Before SIGKILL and collect diagnostic information before daemon goes > down > --- > > Key: HIVE-16136 > URL: https://issues.apache.org/jira/browse/HIVE-16136 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Sometimes daemons can get killed by YARN's pmem monitor, which issues a kill > followed by kill -9 after 250ms. This is really too short a duration to collect > anything useful. > There is no clean way to trap SIGKILL in Java. > One option is to increase the time between kill and kill -9 in YARN, and > during that time we can have a shutdown hook handler to collect all > diagnostic information like heapdump, jstack, jmx output etc. in a > non-container directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
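A minimal sketch of the shutdown-hook idea discussed above, assuming a plain JVM shutdown hook suffices: on SIGTERM (YARN's first kill) the JVM runs its shutdown hooks, so a hook can capture a jstack-like thread dump via ThreadMXBean before kill -9 arrives. This does not help for SIGKILL, which cannot be trapped. The class and method names are illustrative, not actual Hive/LLAP code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical diagnostics hook: collects a thread dump on JVM shutdown.
public class DiagnosticsHook {

    // Builds a jstack-like dump of all live threads using the standard
    // java.lang.management API.
    static String buildThreadDump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            sb.append(info.toString());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Runs on SIGTERM / normal exit; never runs on SIGKILL.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
            // In a real daemon this would be written to a non-container
            // directory so it survives container cleanup.
            System.err.println(buildThreadDump())));
    }
}
```

Heap dumps are better left to -XX:+HeapDumpOnOutOfMemoryError as the comment suggests, since writing a full heap dump inside the short kill-to-kill-9 window is unlikely to finish.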
[jira] [Commented] (HIVE-15997) Resource leaks when query is cancelled
[ https://issues.apache.org/jira/browse/HIVE-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900049#comment-15900049 ] Chaoyu Tang commented on HIVE-15997: LGTM, +1 > Resource leaks when query is cancelled > --- > > Key: HIVE-15997 > URL: https://issues.apache.org/jira/browse/HIVE-15997 > Project: Hive > Issue Type: Bug >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-15997.1.patch > > > There may some resource leaks when query is cancelled. > We see following stacks in the log: > Possible files and folder leak: > {noformat} > 2017-02-02 06:23:25,410 WARN hive.ql.Context: [HiveServer2-Background-Pool: > Thread-61]: Error Removing Scratch: java.io.IOException: Failed on local > exception: java.nio.channels.ClosedByInterruptException; Host Details : local > host is: "ychencdh511t-1.vpc.cloudera.com/172.26.11.50"; destination host is: > "ychencdh511t-1.vpc.cloudera.com":8020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > at org.apache.hadoop.ipc.Client.call(Client.java:1476) > at org.apache.hadoop.ipc.Client.call(Client.java:1409) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy25.delete(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy26.delete(Unknown Source) > at 
org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671) > at org.apache.hadoop.hive.ql.Context.removeScratchDir(Context.java:405) > at org.apache.hadoop.hive.ql.Context.clear(Context.java:541) > at org.apache.hadoop.hive.ql.Driver.releaseContext(Driver.java:2109) > at org.apache.hadoop.hive.ql.Driver.closeInProcess(Driver.java:2150) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1472) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1212) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88) > at > org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796) > at > org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.nio.channels.ClosedByInterruptException > at > java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) > at 
sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:681) > at > org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) > at > org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:615) > at > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:714) > at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:376) > at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525) > at