[jira] [Commented] (HIVE-14688) Hive drop call fails in presence of TDE
[ https://issues.apache.org/jira/browse/HIVE-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075976#comment-16075976 ] Wei Zheng commented on HIVE-14688: -- [~thejas] Looks like HIVE-11418, which solves the same problem, was recently committed to master, so this ticket is now a duplicate of HIVE-11418. It won't work for older 2.x Hadoop versions, as was also discussed in HIVE-11418. > Hive drop call fails in presence of TDE > --- > > Key: HIVE-14688 > URL: https://issues.apache.org/jira/browse/HIVE-14688 > Project: Hive > Issue Type: Bug > Components: Security >Affects Versions: 1.2.1, 2.0.0 >Reporter: Deepesh Khandelwal >Assignee: Wei Zheng > Attachments: HIVE-14688.1.patch, HIVE-14688.2.patch, HIVE-14688.3.patch, HIVE-14688.4.patch > > > This should be committed when Hive moves to Hadoop 2.8. > In Hadoop 2.8.0, TDE trash collection was fixed through HDFS-8831. This enables drop table calls for Hive managed tables where the Hive metastore warehouse directory is in an encrypted zone. However, even with the feature in HDFS, Hive drop table currently fails: > {noformat} > $ hdfs crypto -listZones > /apps/hive/warehouse key2 > $ hdfs dfs -ls /apps/hive/warehouse > Found 1 items > drwxrwxrwt - hdfs hdfs 0 2016-09-01 02:54 /apps/hive/warehouse/.Trash > hive> create table abc(a string, b int); > OK > Time taken: 5.538 seconds > hive> dfs -ls /apps/hive/warehouse; > Found 2 items > drwxrwxrwt - hdfs hdfs 0 2016-09-01 02:54 /apps/hive/warehouse/.Trash > drwxrwxrwx - deepesh hdfs 0 2016-09-01 17:15 /apps/hive/warehouse/abc > hive> drop table if exists abc; > FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to drop default.abc because it is in an encryption zone and trash is enabled. Use PURGE option to skip trash.) 
> {noformat} > The problem lies here: > {code:title=metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java} > private void checkTrashPurgeCombination(Path pathToData, String objectName, boolean ifPurge) > ... > if (trashEnabled) { > try { > HadoopShims.HdfsEncryptionShim shim = > ShimLoader.getHadoopShims().createHdfsEncryptionShim(FileSystem.get(hiveConf), hiveConf); > if (shim.isPathEncrypted(pathToData)) { > throw new MetaException("Unable to drop " + objectName + " because it is in an encryption zone" + > " and trash is enabled. Use PURGE option to skip trash."); > } > } catch (IOException ex) { > MetaException e = new MetaException(ex.getMessage()); > e.initCause(ex); > throw e; > } > } > {code} > As we can see, we are assuming that a delete would not succeed in an encrypted zone. We need to modify this logic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
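The direction the description asks for (only reject the drop when trash genuinely cannot handle the encrypted path) can be sketched as a plain predicate. This is a hypothetical illustration, not the actual Hive patch: the class name, method name, and the `hdfsTrashWorksInEncryptionZone` flag are invented stand-ins for a capability check against the Hadoop version (HDFS-8831 shipped in Hadoop 2.8.0).

```java
// Hypothetical sketch only — names are illustrative, not Hive's.
class TrashPurgeCheck {
    // Returns true when the drop must be rejected up front.
    static boolean mustRejectDrop(boolean trashEnabled,
                                  boolean pathEncrypted,
                                  boolean hdfsTrashWorksInEncryptionZone) {
        // Before HDFS-8831 (Hadoop < 2.8.0), moving an encrypted path to
        // trash fails, so the metastore had to reject the drop. Once the
        // underlying HDFS can collect trash inside encryption zones, the
        // blanket rejection in checkTrashPurgeCombination is unnecessary.
        return trashEnabled && pathEncrypted && !hdfsTrashWorksInEncryptionZone;
    }
}
```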
[jira] [Commented] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set
[ https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075963#comment-16075963 ] Hive QA commented on HIVE-16993: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875836/HIVE-17008.8.patch {color:green}SUCCESS:{color} +1 due to 10 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10832 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importAll (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneDb (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneFunc (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneRole (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneTableNonPartitioned (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneTablePartitioned (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importSecurity (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importTablesWithConstraints (batchId=208) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5903/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/5903/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5903/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12875836 - PreCommit-HIVE-Build > ThriftHiveMetastore.create_database can fail if the locationUri is not set > -- > > Key: HIVE-16993 > URL: https://issues.apache.org/jira/browse/HIVE-16993 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Dan Burkert >Assignee: Dan Burkert > Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, > HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, > HIVE-16993.5.patch, HIVE-17008.6.patch, HIVE-17008.7.patch, HIVE-17008.8.patch > > > Calling > [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078] > with a database with an unset {{locationUri}} field through the C++ > implementation fails with: > {code} > MetaException(message=java.lang.IllegalArgumentException: Can not create a > Path from an empty string) > {code} > The > [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270] > Thrift field is 'default requiredness (implicit)', and Thrift [does not > specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] > whether unset default requiredness fields are encoded. 
Empirically, the Java > generated code [does not write the > {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942] > when the field is unset, while the C++ generated code > [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890]. > The MetaStore treats the field as optional, and [fills in a default > value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871] > if the field is unset. > The end result is that when the C++ implementation sends a {{Database}} > without the field set, it actually writes an empty string, and the MetaStore > treats it as a set field (non-null), and then calls a {{Path}} API which > rejects the empty string. The fix is simple: make the
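The mismatch described above (Java clients leave an unset `locationUri` as null, the C++ generated code serializes the default empty string, and the MetaStore only fills in a default on null) boils down to a normalization that treats both cases the same. A minimal illustrative sketch with invented names, not the actual MetaStore code:

```java
// Illustrative sketch only: treat an empty locationUri from a client the
// same as an unset one, so the warehouse default is filled in instead of
// later failing with "Can not create a Path from an empty string".
class LocationUriDefaulting {
    static String resolveLocationUri(String fromClient, String warehouseDefault) {
        // Java clients send null for an unset field; the C++ generated code
        // writes the default "" — normalize both to the default location.
        if (fromClient == null || fromClient.isEmpty()) {
            return warehouseDefault;
        }
        return fromClient;
    }
}
```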
[jira] [Commented] (HIVE-10495) Hive index creation code throws NPE if index table is null
[ https://issues.apache.org/jira/browse/HIVE-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075910#comment-16075910 ] Hive QA commented on HIVE-10495: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12755107/HIVE-10495.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10832 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5902/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5902/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5902/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12755107 - PreCommit-HIVE-Build > Hive index creation code throws NPE if index table is null > -- > > Key: HIVE-10495 > URL: https://issues.apache.org/jira/browse/HIVE-10495 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0, 1.2.0, 1.2.1 >Reporter: Bing Li >Assignee: Bing Li > Attachments: HIVE-10495.1.patch, HIVE-10495.2.patch > > > The stack trace would be: > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_index(HiveMetaStore.java:2870) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) > at java.lang.reflect.Method.invoke(Method.java:611) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102) > at $Proxy9.add_index(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createIndex(HiveMetaStoreClient.java:962) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17018) Small table is converted to map join even the total size of small tables exceeds the threshold(hive.auto.convert.join.noconditionaltask.size)
[ https://issues.apache.org/jira/browse/HIVE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075871#comment-16075871 ] Chao Sun commented on HIVE-17018: - [~kellyzly] I'm not sure I understand what you described. Can you come up with a small example query that demonstrates the problem? Thanks. > Small table is converted to map join even the total size of small tables > exceeds the threshold(hive.auto.convert.join.noconditionaltask.size) > - > > Key: HIVE-17018 > URL: https://issues.apache.org/jira/browse/HIVE-17018 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > > We use "hive.auto.convert.join.noconditionaltask.size" as the threshold: if the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than this value, the join is converted to a map join. > For example, consider A join B join C join D join E, where the big table is A (100M) and the small tables are B (10M), C (10M), D (10M), and E (10M). If we set hive.auto.convert.join.noconditionaltask.size=20M, the current code converts E, D, and B to map joins, but not C. In my understanding, because hive.auto.convert.join.noconditionaltask.size can only accommodate E and D, neither C nor B should be converted to a map join. > Let me explain in more detail why B can still be converted to a map join. > In the current code, [SparkMapJoinOptimizer#getConnectedMapJoinSize|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L364] calculates all the map joins in the parent path and child path. The search stops when encountering a [UnionOperator or ReduceOperator|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L381]. 
> C is not converted to a map join because {{(connectedMapJoinSize + totalSize) > maxSize}} [see code|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L330], so the RS before the join of C remains. > When calculating whether B will be converted to a map join, {{getConnectedMapJoinSize}} returns 0 on encountering that [RS|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L409], which causes {{(connectedMapJoinSize + totalSize) < maxSize}} to hold. > [~xuefuz] or [~jxiang]: could you help check whether this is a bug, as you are more familiar with SparkJoinOptimizer? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
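The accounting the reporter expects can be modeled as a simple running budget over the small tables, in the order the optimizer visits them (E, D, C, B in the example above). This is a hypothetical model with invented names, not the actual SparkMapJoinOptimizer code; the reported bug is precisely that the real running total is lost when the walk hits an RS boundary.

```java
// Hypothetical model (invented names) of the expected accounting: keep a
// running total of small-table sizes already converted to map join, and
// stop converting once the threshold would be exceeded.
class MapJoinBudget {
    static boolean[] convertible(long[] smallTableSizes, long maxSize) {
        boolean[] converted = new boolean[smallTableSizes.length];
        long connected = 0; // total size of tables already converted
        for (int i = 0; i < smallTableSizes.length; i++) {
            if (connected + smallTableSizes[i] <= maxSize) {
                converted[i] = true;
                connected += smallTableSizes[i];
            }
        }
        return converted;
    }
}
```

With sizes {10, 10, 10, 10} (E, D, C, B in visit order, in MB) and a 20M threshold, only the first two are converted, matching the expectation that only E and D fit under the budget.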
[jira] [Updated] (HIVE-17049) hive doesn't support chinese comments for columns
[ https://issues.apache.org/jira/browse/HIVE-17049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liugaopeng updated HIVE-17049: -- Description: 1. alter table stg.test_chinese change chinesetitle chinesetitle tinyint comment '中文'; 2. desc stg.test_chinese; Result: the Chinese comment "中文" became "??". Also, if I modify the comment via a Hive view, it still displays the garbled text "??". I did some testing but could not fix it, e.g.: 1. changing hive.COLUMNS_V2 to the UTF-8 charset; 2. appending characterEncoding=UTF-8 to the Hive-to-MySQL metadata URL. I found some suggestions that require applying patches, but they all seem to target 0.x versions, while I use version 1.2.1. Please give some guidance. was: 1. alter table stg.test_chinese change chinesetitle chinesetitle tinyint comment '中文'; 2. desc stg.test_chinese; Result: chinese comment "中文" becase "??" also, if i modify the comment via hive view, it will still display the messy code "??". I did some testing, but cannot fix it, such as: 1. change the hive.COLUMNS_V2 to UTF-8 chartset. 2. append the characterEncoding=UTF-8 to hive_to_mysqlmetadata url i found some ideas that need to apply some patch to fix it, but seems they all effects in 0.x version, i use the 1.2.1 version. Please give some guidence. > hive doesn't support chinese comments for columns > - > > Key: HIVE-17049 > URL: https://issues.apache.org/jira/browse/HIVE-17049 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1 > Environment: hive 1.2.1 in HDP >Reporter: liugaopeng > > 1. alter table stg.test_chinese change chinesetitle chinesetitle tinyint comment '中文'; > 2. desc stg.test_chinese; > Result: the Chinese comment "中文" became "??". > Also, if I modify the comment via a Hive view, it still displays the garbled text "??". > I did some testing but could not fix it, e.g.: > 1. changing hive.COLUMNS_V2 to the UTF-8 charset. > 2. 
appending characterEncoding=UTF-8 to the Hive-to-MySQL metadata URL. > I found some suggestions that require applying patches, but they all seem to target 0.x versions, while I use version 1.2.1. > Please give some guidance. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
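The "??" symptom is consistent with a lossy latin1 round trip in the metastore database (latin1 being MySQL's historical default charset for the metastore tables). This standalone demonstration (the class name is invented for illustration) shows the mechanism: `String.getBytes(Charset)` substitutes the charset's replacement byte, `'?'` for ISO-8859-1, for every character the charset cannot represent.

```java
import java.nio.charset.StandardCharsets;

// A latin1 (ISO-8859-1) column cannot represent CJK text: encoding
// replaces each unmappable character with '?', so "中文" is stored and
// read back as "??" — exactly what the reporter sees.
class CommentEncoding {
    static String roundTripLatin1(String comment) {
        byte[] stored = comment.getBytes(StandardCharsets.ISO_8859_1);
        return new String(stored, StandardCharsets.ISO_8859_1);
    }
}
```

For example, `CommentEncoding.roundTripLatin1("中文")` returns `"??"`, which is why changing only the JDBC URL is not enough while the column itself stays latin1.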
[jira] [Commented] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table
[ https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075863#comment-16075863 ] Hive QA commented on HIVE-16832: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875831/HIVE-16832.18.patch {color:green}SUCCESS:{color} +1 due to 12 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10847 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] (batchId=12) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=74) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5901/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5901/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5901/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing 
org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 12 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12875831 - PreCommit-HIVE-Build > duplicate ROW__ID possible in multi insert into transactional table > --- > > Key: HIVE-16832 > URL: https://issues.apache.org/jira/browse/HIVE-16832 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16832.01.patch, HIVE-16832.03.patch, > HIVE-16832.04.patch, HIVE-16832.05.patch, HIVE-16832.06.patch, > HIVE-16832.08.patch, HIVE-16832.09.patch, HIVE-16832.10.patch, > HIVE-16832.11.patch, HIVE-16832.14.patch, HIVE-16832.15.patch, > HIVE-16832.16.patch, HIVE-16832.17.patch, HIVE-16832.18.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism
[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075852#comment-16075852 ] Chao Sun edited comment on HIVE-17010 at 7/6/17 3:27 AM: - Ah I see. Sometimes the stats estimation could generate negative values, in which case Hive will use {{Long.MAX_VALUE}} for both # of rows and data size. One case I observed previously: {code} not ((P1 or P2) or P3) {code} When no column stats are available, Hive will simply divide the # of input rows by 2 for each predicate evaluation. Suppose the total input rows is 10, then {{P1}}, {{P2}} and {{P3}} will yield 5 respectively. Operator {{or}} adds value from both sides so the expression {{((P1 or P2) or P3)}} generates 30 rows. The operator {{not}}, on the other hand, will subtract the value of its associated expression from the total input rows. Therefore in the end you will get {{10 - 30 = -20}}. For the solution you proposed, I'm inclined to use {{StatsUtils.safeAdd}}, but either way should be fine. was (Author: csun): Ah I see. Sometimes the stats estimation could generate negative values, in which case Hive will use {{Long.MAX_VALUE}} for both # of rows and data size could be. One case I observed previously: {code} not ((P1 or P2) or P3) {code} When no column stats are available, Hive will simply divide the # of input rows by 2 for each predicate evaluation. Suppose the total input rows is 10, then {{P1}}, {{P2}} and {{P3}} will yield 5 respectively. Operator {{or}} adds value from both sides so the expression {{((P1 or P2) or P3)}} generates 30 rows. The operator {{not}}, on the other hand, will subtract the value of its associated expression from the total input rows. Therefore in the end you will get {{10 - 30 = -20}}. For the solution you proposed, I'm inclined to use {{StatsUtils.safeAdd}}, but either way should be fine. 
> Fix the overflow problem of Long type in SetSparkReducerParallelism > --- > > Key: HIVE-17010 > URL: https://issues.apache.org/jira/browse/HIVE-17010 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-17010.1.patch > > > We use [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] to collect the numberOfBytes of the siblings of the specified RS. > We use the Long type, and it overflows when the data is too big. When this happens, the parallelism is decided by [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]. > If spark.dynamic.allocation.enabled is true, sparkMemoryAndCores.getSecond is a dynamic value decided by the Spark runtime; for example, it may be 5 or 15 at random, and possibly even 1. > The main problem here is the overflow of the addition of Long values. You can reproduce the overflow with the following code: > {code} > public static void main(String[] args) { > long a1= 9223372036854775807L; > long a2=1022672; > long res = a1+a2; > System.out.println(res); //-9223372036853753137 > BigInteger b1= BigInteger.valueOf(a1); > BigInteger b2 = BigInteger.valueOf(a2); > BigInteger bigRes = b1.add(b2); > System.out.println(bigRes); //9223372036855798479 > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
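The {{StatsUtils.safeAdd}} approach suggested above amounts to saturating addition. A minimal sketch of that idea (the class here is illustrative, not Hive's actual implementation):

```java
// Minimal saturating add in the spirit of StatsUtils.safeAdd: on overflow,
// clamp to Long.MAX_VALUE instead of wrapping around to a negative value.
// Assumes non-negative byte counts, as in the stats use case.
class SafeAdd {
    static long safeAdd(long a, long b) {
        try {
            return Math.addExact(a, b); // throws ArithmeticException on overflow
        } catch (ArithmeticException overflow) {
            return Long.MAX_VALUE;
        }
    }
}
```

With the values from the report, `SafeAdd.safeAdd(9223372036854775807L, 1022672L)` stays at `Long.MAX_VALUE` instead of going negative, so the parallelism computation no longer falls through to the dynamic `sparkMemoryAndCores` value.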
[jira] [Commented] (HIVE-14688) Hive drop call fails in presence of TDE
[ https://issues.apache.org/jira/browse/HIVE-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075862#comment-16075862 ] Thejas M Nair commented on HIVE-14688: -- [~wzheng] HIVE-16402 has the changes to update the Hadoop dependency to 2.8.0. What would happen if Hive with this change is used against older 2.x Hadoop versions? Hive is still supposed to work against older 2.x versions as well. If it results in another error from Hadoop for the user, I think the change is fine. cc [~spena] > Hive drop call fails in presence of TDE > --- > > Key: HIVE-14688 > URL: https://issues.apache.org/jira/browse/HIVE-14688 > Project: Hive > Issue Type: Bug > Components: Security >Affects Versions: 1.2.1, 2.0.0 >Reporter: Deepesh Khandelwal >Assignee: Wei Zheng > Attachments: HIVE-14688.1.patch, HIVE-14688.2.patch, HIVE-14688.3.patch, HIVE-14688.4.patch > > > This should be committed when Hive moves to Hadoop 2.8. > In Hadoop 2.8.0, TDE trash collection was fixed through HDFS-8831. This enables drop table calls for Hive managed tables where the Hive metastore warehouse directory is in an encrypted zone. However, even with the feature in HDFS, Hive drop table currently fails: > {noformat} > $ hdfs crypto -listZones > /apps/hive/warehouse key2 > $ hdfs dfs -ls /apps/hive/warehouse > Found 1 items > drwxrwxrwt - hdfs hdfs 0 2016-09-01 02:54 /apps/hive/warehouse/.Trash > hive> create table abc(a string, b int); > OK > Time taken: 5.538 seconds > hive> dfs -ls /apps/hive/warehouse; > Found 2 items > drwxrwxrwt - hdfs hdfs 0 2016-09-01 02:54 /apps/hive/warehouse/.Trash > drwxrwxrwx - deepesh hdfs 0 2016-09-01 17:15 /apps/hive/warehouse/abc > hive> drop table if exists abc; > FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to drop default.abc because it is in an encryption zone and trash is enabled. Use PURGE option to skip trash.) 
> {noformat} > The problem lies here: > {code:title=metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java} > private void checkTrashPurgeCombination(Path pathToData, String objectName, boolean ifPurge) > ... > if (trashEnabled) { > try { > HadoopShims.HdfsEncryptionShim shim = > ShimLoader.getHadoopShims().createHdfsEncryptionShim(FileSystem.get(hiveConf), hiveConf); > if (shim.isPathEncrypted(pathToData)) { > throw new MetaException("Unable to drop " + objectName + " because it is in an encryption zone" + > " and trash is enabled. Use PURGE option to skip trash."); > } > } catch (IOException ex) { > MetaException e = new MetaException(ex.getMessage()); > e.initCause(ex); > throw e; > } > } > {code} > As we can see, we are assuming that a delete would not succeed in an encrypted zone. We need to modify this logic. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism
[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075852#comment-16075852 ] Chao Sun commented on HIVE-17010: - Ah I see. Sometimes the stats estimation could generate negative values, in which case Hive will use {{Long.MAX_VALUE}} for both # of rows and data size. One case I observed previously: {code} not ((P1 or P2) or P3) {code} When no column stats are available, Hive will simply divide the # of input rows by 2 for each predicate evaluation. Suppose the total input rows is 10, then {{P1}}, {{P2}} and {{P3}} will yield 5 respectively. Operator {{or}} adds value from both sides so the expression {{((P1 or P2) or P3)}} generates 30 rows. The operator {{not}}, on the other hand, will subtract the value of its associated expression from the total input rows. Therefore in the end you will get {{10 - 30 = -20}}. For the solution you proposed, I'm inclined to use {{StatsUtils.safeAdd}}, but either way should be fine. > Fix the overflow problem of Long type in SetSparkReducerParallelism > --- > > Key: HIVE-17010 > URL: https://issues.apache.org/jira/browse/HIVE-17010 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-17010.1.patch > > > We use [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] to collect the numberOfBytes of the siblings of the specified RS. > We use the Long type, and it overflows when the data is too big. When this happens, the parallelism is decided by [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]. > If spark.dynamic.allocation.enabled is true, sparkMemoryAndCores.getSecond is a dynamic value decided by the Spark runtime; for example, it may be 5 or 15 at random, and possibly even 1. > The main problem here is the overflow of the addition of Long values. You can reproduce the overflow with the following code: > {code} > public static void main(String[] args) { > long a1= 9223372036854775807L; > long a2=1022672; > long res = a1+a2; > System.out.println(res); //-9223372036853753137 > BigInteger b1= BigInteger.valueOf(a1); > BigInteger b2 = BigInteger.valueOf(a2); > BigInteger bigRes = b1.add(b2); > System.out.println(bigRes); //9223372036855798479 > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism
[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075839#comment-16075839 ] liyunzhang_intel commented on HIVE-17010: - [~csun]: the explain output of query17 without HIVE-17010.patch is at [link|https://issues.apache.org/jira/secure/attachment/12875204/query17_explain.log]. Reducer 3's data size is 9223372036854775807: {code} Reducer 3 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 _col28 (type: bigint), _col27 (type: bigint) 1 cs_bill_customer_sk (type: bigint), cs_item_sk (type: bigint) outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82 Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL Reduce Output Operator key expressions: _col22 (type: bigint) sort order: + Map-reduce partition columns: _col22 (type: bigint) Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL value expressions: _col1 (type: bigint), _col2 (type: bigint), _col6 (type: bigint), _col8 (type: bigint), _col9 (type: int), _col27 (type: bigint), _col28 (type: bigint), _col34 (type: bigint), _col35 (type: int), _col45 (type: bigint), _col51 (type: bigint), _col63 (type: bigint), _col66 (type: int), _col82 {code} Map 9's data size is 1022672: {code} Map 9 Map Operator Tree: TableScan alias: d1 filterExpr: (d_date_sk is not null and (d_quarter_name = '2000Q1')) (type: boolean) Statistics: Num rows: 73049 Data size: 2045372 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (d_date_sk is not null and (d_quarter_name = '2000Q1')) (type: boolean) Statistics: Num rows: 36524 Data size: 1022672 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: d_date_sk (type: bigint) sort order: + Map-reduce partition columns: d_date_sk (type: bigint) Statistics: Num rows: 36524 Data size: 1022672 Basic stats: COMPLETE Column stats: NONE {code} There is a join of Map 9 and Reducer 3: {code} Reducer 4 <- Map 9 (PARTITION-LEVEL SORT, 1), Reducer 3 (PARTITION-LEVEL SORT, 1) {code} 9223372036854775807 + 1022672 causes the problem. > Fix the overflow problem of Long type in SetSparkReducerParallelism > --- > > Key: HIVE-17010 > URL: https://issues.apache.org/jira/browse/HIVE-17010 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-17010.1.patch > > > We use [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] to collect the numberOfBytes of the siblings of the specified RS. > We use the Long type, and it overflows when the data is too big. When this happens, the parallelism is decided by [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]. > If spark.dynamic.allocation.enabled is true, sparkMemoryAndCores.getSecond is a dynamic value decided by the Spark runtime; for example, it may be 5 or 15 at random, and possibly even 1. > The main problem here is the overflow of the addition of Long values. You can reproduce the overflow with the following code: > {code} > public static void main(String[] args) { > long a1= 9223372036854775807L; > long a2=1022672; > long res = a1+a2; > System.out.println(res); //-9223372036853753137 > BigInteger b1= BigInteger.valueOf(a1); > BigInteger b2 = BigInteger.valueOf(a2); > BigInteger bigRes = b1.add(b2); > System.out.println(bigRes); //9223372036855798479 > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16100) Dynamic Sorted Partition optimizer loses sibling operators
[ https://issues.apache.org/jira/browse/HIVE-16100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075834#comment-16075834 ] Ashutosh Chauhan commented on HIVE-16100: - [~gopalv] You may include a testcase from HIVE-17020 in this patch. > Dynamic Sorted Partition optimizer loses sibling operators > -- > > Key: HIVE-16100 > URL: https://issues.apache.org/jira/browse/HIVE-16100 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 1.2.1, 2.1.1, 2.2.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-16100.1.patch, HIVE-16100.2.patch, > HIVE-16100.2.patch, HIVE-16100.3.patch > > > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java#L173 > {code} > // unlink connection between FS and its parent > fsParent = fsOp.getParentOperators().get(0); > fsParent.getChildOperators().clear(); > {code} > The optimizer discards any cases where the fsParent has another SEL child -- This message was sent by Atlassian JIRA (v6.4.14#64029)
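The quoted `fsParent.getChildOperators().clear()` drops every child of the parent, including siblings of the FS operator. A minimal sketch of the fix idea using plain lists (hypothetical names, not Hive's actual Operator classes):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnlinkChild {
    // fsParent's child list may hold siblings of the FS operator (e.g. another SEL).
    // clear() discards every child, losing the sibling branch; removing only the
    // FS operator keeps the siblings linked to their parent.
    public static List<String> unlinkOnly(List<String> children, String fsOp) {
        List<String> kept = new ArrayList<>(children);
        kept.remove(fsOp); // unlink just the FS operator, not its siblings
        return kept;
    }

    public static void main(String[] args) {
        List<String> children = new ArrayList<>(Arrays.asList("FS", "SEL_sibling"));
        // Buggy version for comparison: children.clear() would leave []
        System.out.println(unlinkOnly(children, "FS")); // [SEL_sibling]
    }
}
```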
[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism
[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075826#comment-16075826 ] Chao Sun commented on HIVE-17010: - With a long you would need ~9000 PB of {{numberOfBytes}} for the overflow to happen. It's interesting that this can occur with 3TB of input data. I'm just wondering if there's any bug in the code that caused this. > Fix the overflow problem of Long type in SetSparkReducerParallelism > --- > > Key: HIVE-17010 > URL: https://issues.apache.org/jira/browse/HIVE-17010 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-17010.1.patch > > > We use > [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] > to collect the numberOfBytes of the siblings of the specified RS. We use the Long type > and it overflows when the data is too big. When this happens, the parallelism is decided by > [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184] > if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond > is a dynamic value decided by the Spark runtime. For example, the value > of sparkMemoryAndCores.getSecond is 5 or 15 at random, and there is a possibility > that the value may be 1. The main problem here is the overflow of the addition of > Long values. You can reproduce the overflow problem with the following code > {code} > public static void main(String[] args) { > long a1 = 9223372036854775807L; > long a2 = 1022672; > long res = a1 + a2; > System.out.println(res); // -9223372036853753137 > BigInteger b1 = BigInteger.valueOf(a1); > BigInteger b2 = BigInteger.valueOf(a2); > BigInteger bigRes = b1.add(b2); > System.out.println(bigRes); // 9223372036855798479 > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism
[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075824#comment-16075824 ] liyunzhang_intel commented on HIVE-17010: - [~lirui], [~csun], [~ferd]: in HIVE-17010.patch, I use double to replace the long type to solve the problem. A similar bug was found in HIVE-8689, where [StatsUtils.safeAdd|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1626] was used to solve it. So which solution is better: 1. use double to replace Long, or 2. use StatsUtils.safeAdd? Please give me your suggestions. > Fix the overflow problem of Long type in SetSparkReducerParallelism > --- > > Key: HIVE-17010 > URL: https://issues.apache.org/jira/browse/HIVE-17010 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-17010.1.patch > > > We use > [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] > to collect the numberOfBytes of the siblings of the specified RS. We use the Long type > and it overflows when the data is too big. When this happens, the parallelism is decided by > [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184] > if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond > is a dynamic value decided by the Spark runtime. For example, the value > of sparkMemoryAndCores.getSecond is 5 or 15 at random, and there is a possibility > that the value may be 1. The main problem here is the overflow of the addition of > Long values. You can reproduce the overflow problem with the following code > {code} > public static void main(String[] args) { > long a1 = 9223372036854775807L; > long a2 = 1022672; > long res = a1 + a2; > System.out.println(res); // -9223372036853753137 > BigInteger b1 = BigInteger.valueOf(a1); > BigInteger b2 = BigInteger.valueOf(a2); > BigInteger bigRes = b1.add(b2); > System.out.println(bigRes); // 9223372036855798479 > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
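For comparison with the two options discussed above, here is a standalone sketch of the `StatsUtils.safeAdd`-style approach, saturating at `Long.MAX_VALUE` instead of wrapping. This is an illustration of the idea, not Hive's actual implementation:

```java
public class SafeAdd {
    // Saturating addition: on overflow, clamp to Long.MAX_VALUE
    // rather than wrapping to a negative value. Sufficient for
    // statistics sizes, which are always non-negative.
    public static long safeAdd(long a, long b) {
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException e) {
            return Long.MAX_VALUE;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeAdd(9223372036854775807L, 1022672L)); // 9223372036854775807
        System.out.println(safeAdd(1L, 2L));                         // 3
    }
}
```

Unlike switching to double, this keeps exact arithmetic for all in-range values and only loses information once the sum has already pegged the statistic at "unknown/huge".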
[jira] [Updated] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
[ https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li updated HIVE-16922: --- Attachment: (was: HIVE-16922.1.patch) > Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim" > --- > > Key: HIVE-16922 > URL: https://issues.apache.org/jira/browse/HIVE-16922 > Project: Hive > Issue Type: Bug > Components: Thrift API >Reporter: Dudu Markovitz >Assignee: Bing Li > > https://github.com/apache/hive/blob/master/serde/if/serde.thrift > Typo in serde.thrift: > COLLECTION_DELIM = "colelction.delim" > (*colelction* instead of *collection*) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work
[ https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075817#comment-16075817 ] Hive QA commented on HIVE-17047: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875826/HIVE-17047.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5900/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5900/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5900/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-07-06 02:32:03.136 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-5900/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-07-06 02:32:03.140 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 2a718a1 HIVE-10616: TypeInfoUtils doesn't handle DECIMAL with just precision specified (Thomas Friedrich, reviewed by Gunther Hagleitner) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 2a718a1 HIVE-10616: TypeInfoUtils doesn't handle DECIMAL with just precision specified (Thomas Friedrich, reviewed by Gunther Hagleitner) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-07-06 02:32:08.056 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:236 error: ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java: patch does not apply The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12875826 - PreCommit-HIVE-Build > Allow table property to be populated to jobConf to make > FixedLengthInputFormat work > --- > > Key: HIVE-17047 > URL: https://issues.apache.org/jira/browse/HIVE-17047 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Attachments: HIVE-17047.1.patch > > > To make FixedLengthInputFormat work in Hive, we need table specific value for > the configuration "fixedlengthinputformat.record.length". Right now the best > place would be table property. 
Unfortunately, the table property is not always > populated to InputFormat configurations because of this check in HiveInputFormat: > {code} > PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString()); > if ((part != null) && (part.getTableDesc() != null)) { > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
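The desired behavior can be sketched as follows, with plain maps standing in for Hive's JobConf and TableDesc (hypothetical names, not Hive's actual HiveInputFormat code): a per-table property such as `fixedlengthinputformat.record.length` is copied from the table's properties into the job configuration before the InputFormat reads it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class PropagateTableProperty {
    static final String RECORD_LENGTH_KEY = "fixedlengthinputformat.record.length";

    // Copy the record-length table property into the (simulated) jobConf,
    // so a fixed-length InputFormat can pick up a per-table value.
    public static Map<String, String> propagate(Properties tableProps,
                                                Map<String, String> jobConf) {
        String len = tableProps.getProperty(RECORD_LENGTH_KEY);
        if (len != null) {
            jobConf.put(RECORD_LENGTH_KEY, len);
        }
        return jobConf;
    }

    public static void main(String[] args) {
        Properties tableProps = new Properties();
        tableProps.setProperty(RECORD_LENGTH_KEY, "80"); // hypothetical record length
        Map<String, String> jobConf = propagate(tableProps, new HashMap<>());
        System.out.println(jobConf.get(RECORD_LENGTH_KEY)); // 80
    }
}
```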
[jira] [Commented] (HIVE-16974) Change the sort key for the schema tool validator to be
[ https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075815#comment-16075815 ] Hive QA commented on HIVE-16974: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875812/HIVE-16974.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10832 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5899/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5899/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5899/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12875812 - PreCommit-HIVE-Build > Change the sort key for the schema tool validator to be > > > Key: HIVE-16974 > URL: https://issues.apache.org/jira/browse/HIVE-16974 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-16974.patch, HIVE-16974.patch > > > In HIVE-16729, we introduced ordering of results/failures returned by > schematool's validators. This allows fault injection testing to expect > results that can be verified. However, they were sorted on NAME values which > in the HMS schema can be NULL. So if the introduced fault has a NULL/BLANK > name column value, the result could be different depending on the backend > database(if they sort NULLs first or last). > So I think it is better to sort on a non-null column value. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
[ https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li updated HIVE-16922: --- Status: Open (was: Patch Available) > Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim" > --- > > Key: HIVE-16922 > URL: https://issues.apache.org/jira/browse/HIVE-16922 > Project: Hive > Issue Type: Bug > Components: Thrift API >Reporter: Dudu Markovitz >Assignee: Bing Li > Attachments: HIVE-16922.1.patch > > > https://github.com/apache/hive/blob/master/serde/if/serde.thrift > Typo in serde.thrift: > COLLECTION_DELIM = "colelction.delim" > (*colelction* instead of *collection*) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
[ https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li updated HIVE-16922: --- Status: Patch Available (was: In Progress) > Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim" > --- > > Key: HIVE-16922 > URL: https://issues.apache.org/jira/browse/HIVE-16922 > Project: Hive > Issue Type: Bug > Components: Thrift API >Reporter: Dudu Markovitz >Assignee: Bing Li > Attachments: HIVE-16922.1.patch > > > https://github.com/apache/hive/blob/master/serde/if/serde.thrift > Typo in serde.thrift: > COLLECTION_DELIM = "colelction.delim" > (*colelction* instead of *collection*) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
[ https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li updated HIVE-16922: --- Attachment: HIVE-16922.1.patch The patch is based on master branch. > Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim" > --- > > Key: HIVE-16922 > URL: https://issues.apache.org/jira/browse/HIVE-16922 > Project: Hive > Issue Type: Bug > Components: Thrift API >Reporter: Dudu Markovitz >Assignee: Bing Li > Attachments: HIVE-16922.1.patch > > > https://github.com/apache/hive/blob/master/serde/if/serde.thrift > Typo in serde.thrift: > COLLECTION_DELIM = "colelction.delim" > (*colelction* instead of *collection*) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
[ https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-16922 started by Bing Li. -- > Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim" > --- > > Key: HIVE-16922 > URL: https://issues.apache.org/jira/browse/HIVE-16922 > Project: Hive > Issue Type: Bug > Components: Thrift API >Reporter: Dudu Markovitz >Assignee: Bing Li > > https://github.com/apache/hive/blob/master/serde/if/serde.thrift > Typo in serde.thrift: > COLLECTION_DELIM = "colelction.delim" > (*colelction* instead of *collection*) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism
[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075808#comment-16075808 ] Rui Li commented on HIVE-17010: --- [~csun], we use a long to compute the sum of multiple longs. I guess that's in general a dangerous operation. > Fix the overflow problem of Long type in SetSparkReducerParallelism > --- > > Key: HIVE-17010 > URL: https://issues.apache.org/jira/browse/HIVE-17010 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-17010.1.patch > > > We use > [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] > to collect the numberOfBytes of the siblings of the specified RS. We use the Long type > and it overflows when the data is too big. When this happens, the parallelism is decided by > [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184] > if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond > is a dynamic value decided by the Spark runtime. For example, the value > of sparkMemoryAndCores.getSecond is 5 or 15 at random, and there is a possibility > that the value may be 1. The main problem here is the overflow of the addition of > Long values. You can reproduce the overflow problem with the following code > {code} > public static void main(String[] args) { > long a1 = 9223372036854775807L; > long a2 = 1022672; > long res = a1 + a2; > System.out.println(res); // -9223372036853753137 > BigInteger b1 = BigInteger.valueOf(a1); > BigInteger b2 = BigInteger.valueOf(a2); > BigInteger bigRes = b1.add(b2); > System.out.println(bigRes); // 9223372036855798479 > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch
[ https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075801#comment-16075801 ] Rui Li commented on HIVE-17020: --- [~ashutoshc], the following query can reproduce the issue: {code} explain from (select key from src cluster by key) a insert overwrite table d1 select a.key insert overwrite table d2 select a.key cluster by a.key; {code} The insert to table d1 will be lost. > Aggressive RS dedup can incorrectly remove OP tree branch > - > > Key: HIVE-17020 > URL: https://issues.apache.org/jira/browse/HIVE-17020 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li > > Suppose we have an OP tree like this: > {noformat} > ... > | > RS[1] > | > SEL[2] > /\ > SEL[3] SEL[4] > | | > RS[5] FS[6] > | > ... > {noformat} > When doing aggressive RS dedup, we'll remove all the operators between RS5 > and RS1, and thus the branch containing FS6 is lost. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17008) HiveMetastore.drop_database can return NPE if database does not exist
[ https://issues.apache.org/jira/browse/HIVE-17008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075790#comment-16075790 ] Dan Burkert commented on HIVE-17008: I'm observing these events through the [notification log API|https://github.com/danburkert/hive/blob/master/metastore/if/hive_metastore.thrift#L1546-L1549]. Is it expected that {{ThriftHiveMetastore.get_next_notification}} returns events for failed DDL operations? There isn't any way to discern whether or not the event failed just from the {{NotificationEvent}} struct. > Where exactly is the NPE - I assume somewhere down the notifyEvent stack ? The NPE is thrown inside the notification event listener, because {{db}} can be null on [this line|https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java?utf8=%E2%9C%93#L1139]. > HiveMetastore.drop_database can return NPE if database does not exist > - > > Key: HIVE-17008 > URL: https://issues.apache.org/jira/browse/HIVE-17008 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Dan Burkert >Assignee: Dan Burkert > Attachments: HIVE-17008.0.patch > > > When dropping a non-existent database, the HMS will still fire registered > {{DROP_DATABASE}} event listeners. This results in an NPE when the listeners > attempt to deref the {{null}} database parameter. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
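The guard being discussed can be sketched in isolation (hypothetical listener interface and method names, not the actual HiveMetaStore code): skip firing DROP_DATABASE listeners when the database lookup returned null, so listeners never dereference a null database.

```java
import java.util.ArrayList;
import java.util.List;

public class DropDatabaseGuard {
    interface Listener { void onDropDatabase(String dbName); }

    // Fire listeners only when the database actually exists. Previously a
    // null db (database not found) was passed through and caused an NPE
    // inside the listeners. Returns the number of listeners notified.
    public static int notifyDrop(String db, List<Listener> listeners) {
        if (db == null) {
            return 0; // nothing was dropped, so nothing to notify
        }
        for (Listener l : listeners) {
            l.onDropDatabase(db);
        }
        return listeners.size();
    }

    public static void main(String[] args) {
        List<Listener> ls = new ArrayList<>();
        ls.add(name -> System.out.println("dropped " + name));
        System.out.println(notifyDrop(null, ls));      // 0: no listeners fired
        System.out.println(notifyDrop("default", ls)); // fires listener, then 1
    }
}
```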
[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism
[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075778#comment-16075778 ] liyunzhang_intel commented on HIVE-17010: - [~csun]: found the problem on 3TB data. Actually the biggest TPC-DS table, "store_sales", does not exceed the max value of the Long type (2^63-1). But [TPC-DS/query17|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query17.sql] is a query with many joins. We use [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] to collect the numberOfBytes of the siblings of the specified RS. Here a sibling of the specified RS may be the result of a join of big tables, and that result exceeds the max value of the Long type. > Fix the overflow problem of Long type in SetSparkReducerParallelism > --- > > Key: HIVE-17010 > URL: https://issues.apache.org/jira/browse/HIVE-17010 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-17010.1.patch > > > We use > [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] > to collect the numberOfBytes of the siblings of the specified RS. We use the Long type > and it overflows when the data is too big. When this happens, the parallelism is decided by > [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184] > if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond > is a dynamic value decided by the Spark runtime. For example, the value > of sparkMemoryAndCores.getSecond is 5 or 15 at random, and there is a possibility > that the value may be 1. The main problem here is the overflow of the addition of > Long values. You can reproduce the overflow problem with the following code > {code} > public static void main(String[] args) { > long a1 = 9223372036854775807L; > long a2 = 1022672; > long res = a1 + a2; > System.out.println(res); // -9223372036853753137 > BigInteger b1 = BigInteger.valueOf(a1); > BigInteger b2 = BigInteger.valueOf(a2); > BigInteger bigRes = b1.add(b2); > System.out.println(bigRes); // 9223372036855798479 > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism
[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated HIVE-17010: Description: We use [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] to collect the numberOfBytes of the siblings of the specified RS. We use the Long type and it overflows when the data is too big. When this happens, the parallelism is decided by [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184] if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond is a dynamic value decided by the Spark runtime. For example, the value of sparkMemoryAndCores.getSecond is 5 or 15 at random, and there is a possibility that the value may be 1. The main problem here is the overflow of the addition of Long values. You can reproduce the overflow problem with the following code {code} public static void main(String[] args) { long a1 = 9223372036854775807L; long a2 = 1022672; long res = a1 + a2; System.out.println(res); // -9223372036853753137 BigInteger b1 = BigInteger.valueOf(a1); BigInteger b2 = BigInteger.valueOf(a2); BigInteger bigRes = b1.add(b2); System.out.println(bigRes); // 9223372036855798479 } {code} was: [link title|http://example.com] We use [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] to collect the numberOfBytes of the siblings of the specified RS. We use the Long type and it overflows when the data is too big. When this happens, the parallelism is decided by [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184] if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond is a dynamic value decided by the Spark runtime. For example, the value of sparkMemoryAndCores.getSecond is 5 or 15 at random, and there is a possibility that the value may be 1. The main problem here is the overflow of the addition of Long values. You can reproduce the overflow problem with the following code {code} public static void main(String[] args) { long a1 = 9223372036854775807L; long a2 = 1022672; long res = a1 + a2; System.out.println(res); // -9223372036853753137 BigInteger b1 = BigInteger.valueOf(a1); BigInteger b2 = BigInteger.valueOf(a2); BigInteger bigRes = b1.add(b2); System.out.println(bigRes); // 9223372036855798479 } {code} > Fix the overflow problem of Long type in SetSparkReducerParallelism > --- > > Key: HIVE-17010 > URL: https://issues.apache.org/jira/browse/HIVE-17010 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-17010.1.patch > > > We use > [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] > to collect the numberOfBytes of the siblings of the specified RS. We use the Long type > and it overflows when the data is too big. When this happens, the parallelism is decided by > [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184] > if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond > is a dynamic value decided by the Spark runtime. For example, the value > of sparkMemoryAndCores.getSecond is 5 or 15 at random, and there is a possibility > that the value may be 1. The main problem here is the overflow of the addition of > Long values. You can reproduce the overflow problem with the following code > {code} > public static void main(String[] args) { > long a1 = 9223372036854775807L; > long a2 = 1022672; > long res = a1 + a2; > System.out.println(res); // -9223372036853753137 > BigInteger b1 = BigInteger.valueOf(a1); > BigInteger b2 = BigInteger.valueOf(a2); > BigInteger bigRes = b1.add(b2); > System.out.println(bigRes); // 9223372036855798479 > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext
[ https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075769#comment-16075769 ] Mohit Sabharwal commented on HIVE-17048: Good to add the operation type to TestHs2Hooks and TestHs2HooksWithMiniKdc unit tests as well (see HIVE-8338) LGTM, otherwise. > Pass HiveOperation info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext > --- > > Key: HIVE-17048 > URL: https://issues.apache.org/jira/browse/HIVE-17048 > Project: Hive > Issue Type: Improvement > Components: Hooks >Affects Versions: 2.1.1 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-17048.1.patch > > > Currently hive passes the following info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext (see > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553). > But the operation type (HiveOperation) is also needed in some cases, e.g., > when integrating with Sentry. > {noformat} > hookCtx.setConf(conf); > hookCtx.setUserName(userName); > hookCtx.setIpAddress(SessionState.get().getUserIpAddress()); > hookCtx.setCommand(command); > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17008) HiveMetastore.drop_database can return NPE if database does not exist
[ https://issues.apache.org/jira/browse/HIVE-17008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075763#comment-16075763 ] Mohit Sabharwal commented on HIVE-17008: Thanks, [~dan_impala_9180]. There are two flavors of listeners in each DDL operation: one which runs in the same transaction as the DDL event (notifies only upon success) and one which runs outside the transaction (can notify failed DDL operations as well). Where exactly is the NPE? I assume somewhere down the notifyEvent stack. + [~spena], who has worked on this recently. > HiveMetastore.drop_database can return NPE if database does not exist > - > > Key: HIVE-17008 > URL: https://issues.apache.org/jira/browse/HIVE-17008 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Dan Burkert >Assignee: Dan Burkert > Attachments: HIVE-17008.0.patch > > > When dropping a non-existent database, the HMS will still fire registered > {{DROP_DATABASE}} event listeners. This results in an NPE when the listeners > attempt to deref the {{null}} database parameter. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17045) Add HyperLogLog as an UDAF
[ https://issues.apache.org/jira/browse/HIVE-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075760#comment-16075760 ] Hive QA commented on HIVE-17045: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875790/HIVE-17045.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10832 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_functions] (batchId=69) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=232) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5898/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5898/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5898/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12875790 - PreCommit-HIVE-Build > Add HyperLogLog as an UDAF > -- > > Key: HIVE-17045 > URL: https://issues.apache.org/jira/browse/HIVE-17045 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-17045.01.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified
[ https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-10616: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Failures do not look related. I've committed to master. > TypeInfoUtils doesn't handle DECIMAL with just precision specified > -- > > Key: HIVE-10616 > URL: https://issues.apache.org/jira/browse/HIVE-10616 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 1.0.0 >Reporter: Thomas Friedrich >Assignee: Thomas Friedrich >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-10616.1.patch, HIVE-10616.2.patch > > > The parseType method in TypeInfoUtils doesn't handle decimal types with just > precision specified although that's a valid type definition. > As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return > decimal(10,0) for any decimal() string. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified
[ https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere reassigned HIVE-10616: - Assignee: Thomas Friedrich (was: Jason Dere) > TypeInfoUtils doesn't handle DECIMAL with just precision specified > -- > > Key: HIVE-10616 > URL: https://issues.apache.org/jira/browse/HIVE-10616 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 1.0.0 >Reporter: Thomas Friedrich >Assignee: Thomas Friedrich >Priority: Minor > Attachments: HIVE-10616.1.patch, HIVE-10616.2.patch > > > The parseType method in TypeInfoUtils doesn't handle decimal types with just > precision specified although that's a valid type definition. > As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return > decimal(10,0) for any decimal() string. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
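The parsing behavior described above can be sketched in a few lines. This is not the actual TypeInfoUtils code; the regex and the default precision/scale of (10, 0) mirror the defaults the issue mentions, and the helper name is illustrative:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of the fix: accept decimal(p) as well as decimal(p,s),
// defaulting scale to 0 when only the precision is given, rather than
// falling back to decimal(10,0) for every precision-only string.
public class DecimalTypeParse {
    // Matches "decimal", "decimal(p)", or "decimal(p,s)".
    private static final Pattern P =
        Pattern.compile("decimal(?:\\((\\d+)(?:\\s*,\\s*(\\d+))?\\))?");

    static int[] parse(String s) {
        Matcher m = P.matcher(s.trim());
        if (!m.matches()) throw new IllegalArgumentException(s);
        int precision = m.group(1) == null ? 10 : Integer.parseInt(m.group(1));
        int scale = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
        return new int[] { precision, scale };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(parse("decimal(12)")));   // [12, 0]
        System.out.println(java.util.Arrays.toString(parse("decimal(12,4)"))); // [12, 4]
        System.out.println(java.util.Arrays.toString(parse("decimal")));       // [10, 0]
    }
}
```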
[jira] [Comment Edited] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext
[ https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075702#comment-16075702 ] Aihua Xu edited comment on HIVE-17048 at 7/6/17 12:41 AM: -- patch-1: simple fix to pass HiveOperation through the context. was (Author: aihuaxu): patch-1: simple fix to pass HiveOperation to the context. > Pass HiveOperation info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext > --- > > Key: HIVE-17048 > URL: https://issues.apache.org/jira/browse/HIVE-17048 > Project: Hive > Issue Type: Improvement > Components: Hooks >Affects Versions: 2.1.1 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-17048.1.patch > > > Currently hive passes the following info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext (see > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553). > But the operation type (HiveOperation) is also needed in some cases, e.g., > when integrating with Sentry. > {noformat} > hookCtx.setConf(conf); > hookCtx.setUserName(userName); > hookCtx.setIpAddress(SessionState.get().getUserIpAddress()); > hookCtx.setCommand(command); > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext
[ https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-17048: Status: Patch Available (was: Open) patch-1: simple fix to pass HiveOperation to the context. > Pass HiveOperation info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext > --- > > Key: HIVE-17048 > URL: https://issues.apache.org/jira/browse/HIVE-17048 > Project: Hive > Issue Type: Improvement > Components: Hooks >Affects Versions: 2.1.1 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-17048.1.patch > > > Currently hive passes the following info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext (see > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553). > But the operation type (HiveOperation) is also needed in some cases, e.g., > when integrating with Sentry. > {noformat} > hookCtx.setConf(conf); > hookCtx.setUserName(userName); > hookCtx.setIpAddress(SessionState.get().getUserIpAddress()); > hookCtx.setCommand(command); > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext
[ https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-17048: Attachment: HIVE-17048.1.patch > Pass HiveOperation info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext > --- > > Key: HIVE-17048 > URL: https://issues.apache.org/jira/browse/HIVE-17048 > Project: Hive > Issue Type: Improvement > Components: Hooks >Affects Versions: 2.1.1 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-17048.1.patch > > > Currently hive passes the following info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext (see > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553). > But the operation type (HiveOperation) is also needed in some cases, e.g., > when integrating with Sentry. > {noformat} > hookCtx.setConf(conf); > hookCtx.setUserName(userName); > hookCtx.setIpAddress(SessionState.get().getUserIpAddress()); > hookCtx.setCommand(command); > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
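The change proposed in HIVE-17048 amounts to carrying one more field on the hook context alongside the setters Hive already calls. The sketch below is illustrative only: the class is a stand-in for HiveSemanticAnalyzerHookContext, and the setter name `setHiveOperation` is an assumption, not necessarily the committed API.

```java
// Illustrative sketch (hypothetical names): expose the operation type on the
// hook context next to the conf/user/command fields the Driver already sets,
// so hooks such as Sentry's can see which operation is being analyzed.
public class HookContextSketch {
    enum HiveOperation { CREATETABLE, DROPTABLE, QUERY }

    static class SemanticAnalyzerHookContext {
        String userName;
        String command;
        HiveOperation operation; // the new field proposed by HIVE-17048

        void setUserName(String u) { userName = u; }
        void setCommand(String c) { command = c; }
        void setHiveOperation(HiveOperation op) { operation = op; }
    }

    public static void main(String[] args) {
        SemanticAnalyzerHookContext ctx = new SemanticAnalyzerHookContext();
        ctx.setUserName("alice");
        ctx.setCommand("DROP TABLE t");
        ctx.setHiveOperation(HiveOperation.DROPTABLE); // now visible to hooks
        System.out.println(ctx.operation);
    }
}
```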
[jira] [Commented] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified
[ https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075688#comment-16075688 ] Gunther Hagleitner commented on HIVE-10616: --- Failures do not look related - [~jdere]? > TypeInfoUtils doesn't handle DECIMAL with just precision specified > -- > > Key: HIVE-10616 > URL: https://issues.apache.org/jira/browse/HIVE-10616 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 1.0.0 >Reporter: Thomas Friedrich >Assignee: Jason Dere >Priority: Minor > Attachments: HIVE-10616.1.patch, HIVE-10616.2.patch > > > The parseType method in TypeInfoUtils doesn't handle decimal types with just > precision specified although that's a valid type definition. > As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return > decimal(10,0) for any decimal() string. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16935) Hive should strip comments from input before choosing which CommandProcessor to run.
[ https://issues.apache.org/jira/browse/HIVE-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075687#comment-16075687 ] Hive QA commented on HIVE-16935: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875796/HIVE-16935.4.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10819 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=232) org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (batchId=101) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5897/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5897/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5897/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message 
is automatically generated. ATTACHMENT ID: 12875796 - PreCommit-HIVE-Build > Hive should strip comments from input before choosing which CommandProcessor > to run. > > > Key: HIVE-16935 > URL: https://issues.apache.org/jira/browse/HIVE-16935 > Project: Hive > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Andrew Sherman > Attachments: HIVE-16935.1.patch, HIVE-16935.2.patch, > HIVE-16935.3.patch, HIVE-16935.4.patch > > > While using Beeswax, Hue fails to execute statement with following error: > Error while compiling statement: FAILED: ParseException line 3:4 missing > KW_ROLE at 'a' near 'a' line 3:5 missing EOF at '=' near 'a' > {quote} > -- comment > SET a=1; > SELECT 1; > {quote} > The same code works in Beeline and in Impala. > The same code fails in CliDriver > > h2. Background > Hive deals with sql comments (“-- to end of line”) in different places. > Some clients attempt to strip comments. For example BeeLine was recently > enhanced in https://issues.apache.org/jira/browse/HIVE-13864 to strip > comments from multi-line commands before they are executed. > Other clients such as Hue or Jdbc do not strip comments before sending text. > Some tests such as TestCliDriver strip comments before running tests. > When Hive gets a command the CommandProcessorFactory looks at the text to > determine which CommandProcessor should handle the command. In the bug case > the correct CommandProcessor is SetProcessor, but the comments confuse the > CommandProcessorFactory and so the command is treated as sql. Hive’s sql > parser understands and ignores comments, but it does not understand the set > commands usually handled by SetProcessor and so we get the ParseException > shown above. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
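The comment-stripping idea described in the issue can be sketched as below. This is a deliberately naive version: it ignores the subtlety of "--" occurring inside quoted strings, which real stripping must handle, and the method name is illustrative, not Hive's.

```java
// Sketch of stripping "-- to end of line" comments before the
// CommandProcessorFactory inspects the first token, so that
// "-- comment\nSET a=1;" is routed to SetProcessor instead of the SQL parser.
public class CommentStrip {
    static String stripComments(String command) {
        StringBuilder sb = new StringBuilder();
        for (String line : command.split("\n")) {
            int i = line.indexOf("--");
            // Keep everything before the comment marker; drop the rest of the line.
            sb.append(i >= 0 ? line.substring(0, i) : line).append('\n');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String cmd = "-- comment\nSET a=1;";
        String stripped = stripComments(cmd);
        System.out.println(stripped);                   // SET a=1;
        System.out.println(stripped.startsWith("SET")); // true
    }
}
```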
[jira] [Updated] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set
[ https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Burkert updated HIVE-16993: --- Attachment: HIVE-17008.8.patch > ThriftHiveMetastore.create_database can fail if the locationUri is not set > -- > > Key: HIVE-16993 > URL: https://issues.apache.org/jira/browse/HIVE-16993 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Dan Burkert >Assignee: Dan Burkert > Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, > HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, > HIVE-16993.5.patch, HIVE-17008.6.patch, HIVE-17008.7.patch, HIVE-17008.8.patch > > > Calling > [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078] > with a database with an unset {{locationUri}} field through the C++ > implementation fails with: > {code} > MetaException(message=java.lang.IllegalArgumentException: Can not create a > Path from an empty string) > {code} > The > [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270] > Thrift field is 'default requiredness (implicit)', and Thrift [does not > specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] > whether unset default requiredness fields are encoded. Empirically, the Java > generated code [does not write the > {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942] > when the field is unset, while the C++ generated code > [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890]. 
> The MetaStore treats the field as optional, and [fills in a default > value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871] > if the field is unset. > The end result is that when the C++ implementation sends a {{Database}} > without the field set, it actually writes an empty string, and the MetaStore > treats it as a set field (non-null), and then calls a {{Path}} API which > rejects the empty string. The fix is simple: make the {{locationUri}} field > optional in metastore.thrift. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
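The issue's fix is on the Thrift side (marking {{locationUri}} optional), but the failure mode can be illustrated by the server-side default-fill logic: treating an empty string the same as an unset field would also have masked the C++ encoding difference. A hedged sketch of that guard, with illustrative names:

```java
// Sketch of the defensive view of the MetaStore's default-fill: an empty
// locationUri is treated like an unset one before any Path is constructed.
// This only illustrates the guard's logic; the actual fix in the issue is
// to mark the Thrift field optional in metastore.thrift.
public class LocationUriDefault {
    static String effectiveLocation(String locationUri, String warehouseDefault) {
        return (locationUri == null || locationUri.isEmpty())
            ? warehouseDefault
            : locationUri;
    }

    public static void main(String[] args) {
        // Empty string (what the C++ client ends up sending) falls back to the default.
        System.out.println(effectiveLocation("", "/apps/hive/warehouse/db1.db"));
        // An explicit location is kept as-is.
        System.out.println(effectiveLocation("/custom/loc", "/apps/hive/warehouse/db1.db"));
    }
}
```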
[jira] [Assigned] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext
[ https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reassigned HIVE-17048: --- > Pass HiveOperation info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext > --- > > Key: HIVE-17048 > URL: https://issues.apache.org/jira/browse/HIVE-17048 > Project: Hive > Issue Type: Improvement > Components: Hooks >Affects Versions: 2.1.1 >Reporter: Aihua Xu >Assignee: Aihua Xu > > Currently hive passes the following info to HiveSemanticAnalyzerHook through > HiveSemanticAnalyzerHookContext (see > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553). > But the operation type (HiveOperation) is also needed in some cases, e.g., > when integrating with Sentry. > {noformat} > hookCtx.setConf(conf); > hookCtx.setUserName(userName); > hookCtx.setIpAddress(SessionState.get().getUserIpAddress()); > hookCtx.setCommand(command); > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-10495) Hive index creation code throws NPE if index table is null
[ https://issues.apache.org/jira/browse/HIVE-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075657#comment-16075657 ] Ashutosh Chauhan commented on HIVE-10495: - This can still result in an NPE in startFunction if indexTable is null. Also, there is similar logic in endFunction(). We should refactor this null check so that it's useful for both functions. [~libing] Would you like to update your patch with that change? > Hive index creation code throws NPE if index table is null > -- > > Key: HIVE-10495 > URL: https://issues.apache.org/jira/browse/HIVE-10495 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0, 1.2.0, 1.2.1 >Reporter: Bing Li >Assignee: Bing Li > Attachments: HIVE-10495.1.patch, HIVE-10495.2.patch > > > The stack trace would be: > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_index(HiveMetaStore.java:2870) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) > at java.lang.reflect.Method.invoke(Method.java:611) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102) > at $Proxy9.add_index(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createIndex(HiveMetaStoreClient.java:962) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
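The refactor suggested in the comment above can be sketched as a single null-check helper shared by both startFunction and endFunction, so a missing index table surfaces as a clear error instead of an NPE. Names here are illustrative, not Hive's:

```java
// Hypothetical sketch of a shared guard for the add_index path: one helper
// validates the index table reference before either startFunction or
// endFunction logging touches it.
public class AddIndexGuard {
    static String tableNameOrThrow(Object indexTable, String indexName) {
        if (indexTable == null) {
            // Fail with a descriptive error rather than an NPE deep in the HMS.
            throw new IllegalArgumentException(
                "Index table for " + indexName + " is null");
        }
        return indexTable.toString();
    }

    public static void main(String[] args) {
        try {
            tableNameOrThrow(null, "idx1");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Index table for idx1 is null
        }
        System.out.println(tableNameOrThrow("t_idx", "idx1")); // t_idx
    }
}
```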
[jira] [Commented] (HIVE-17036) Lineage: Minor CPU/Mem optimization for lineage transform
[ https://issues.apache.org/jira/browse/HIVE-17036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075644#comment-16075644 ] Ashutosh Chauhan commented on HIVE-17036: - +1 > Lineage: Minor CPU/Mem optimization for lineage transform > - > > Key: HIVE-17036 > URL: https://issues.apache.org/jira/browse/HIVE-17036 > Project: Hive > Issue Type: Improvement > Components: lineage >Reporter: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-17036.1.patch, prof_1.png, prof_2.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table
[ https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16832: -- Attachment: HIVE-16832.18.patch > duplicate ROW__ID possible in multi insert into transactional table > --- > > Key: HIVE-16832 > URL: https://issues.apache.org/jira/browse/HIVE-16832 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16832.01.patch, HIVE-16832.03.patch, > HIVE-16832.04.patch, HIVE-16832.05.patch, HIVE-16832.06.patch, > HIVE-16832.08.patch, HIVE-16832.09.patch, HIVE-16832.10.patch, > HIVE-16832.11.patch, HIVE-16832.14.patch, HIVE-16832.15.patch, > HIVE-16832.16.patch, HIVE-16832.17.patch, HIVE-16832.18.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table
[ https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075626#comment-16075626 ] Hive QA commented on HIVE-16832: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875809/HIVE-16832.17.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5896/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5896/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5896/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-07-05 23:32:57.513 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-5896/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-07-05 23:32:57.516 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at c39b879 HIVE-16893: move replication dump related work in semantic analysis phase to execution phase using a task (Anishek Agarwal, reviewed by Sankar Hariappan, Daniel Dai) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at c39b879 HIVE-16893: move replication dump related work in semantic analysis phase to execution phase using a task (Anishek Agarwal, reviewed by Sankar Hariappan, Daniel Dai) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-07-05 23:33:02.202 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: patch -p0 patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java patching file hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolverImpl.java patching file hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorCoordinator.java patching file hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorImpl.java patching file hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/StreamingAssert.java patching file hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/TestMutations.java patching file hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java patching file 
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestMutatorImpl.java patching file ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java patching file ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java patching file ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java patching file ql/src/java/org/apache/hadoop/hive/ql/io/RecordIdentifier.java patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java patching file ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java patching file ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java patching file ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java patching file ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java patching file ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2WithSplitUpdate.java patching file ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2WithSplitUpdateAndVectorization.java patching file
[jira] [Commented] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set
[ https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075621#comment-16075621 ] Hive QA commented on HIVE-16993: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875800/HIVE-17008.7.patch {color:green}SUCCESS:{color} +1 due to 10 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 10817 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge9] (batchId=167) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (batchId=101) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importAll (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneDb (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneFunc (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneRole (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneTableNonPartitioned (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneTablePartitioned (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importSecurity (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importTablesWithConstraints (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.parallel (batchId=208) org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.parallelOdd (batchId=208) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) 
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=224) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5895/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5895/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5895/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 19 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12875800 - PreCommit-HIVE-Build > ThriftHiveMetastore.create_database can fail if the locationUri is not set > -- > > Key: HIVE-16993 > URL: https://issues.apache.org/jira/browse/HIVE-16993 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Dan Burkert >Assignee: Dan Burkert > Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, > HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, > HIVE-16993.5.patch, HIVE-17008.6.patch, HIVE-17008.7.patch > > > Calling > [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078] > with a database with an unset {{locationUri}} field through the C++ > implementation fails with: > {code} > MetaException(message=java.lang.IllegalArgumentException: Can not create a > Path from an empty string) > {code} > The > [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270] > Thrift field is 'default requiredness (implicit)', and Thrift [does not > 
specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] > whether unset default requiredness fields are encoded. Empirically, the Java > generated code [does not write the > {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942] > when the field is unset, while the C++ generated code > [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890]. > The MetaStore treats the field as optional, and [fills in a default > value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871] > if the field is
[jira] [Commented] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844
[ https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075593#comment-16075593 ] Mithun Radhakrishnan commented on HIVE-16908: - Yes, this is what I was afraid of. The intention of {{testTableSchemaPropagation()}} was to simulate table-propagation across different clusters/HCat instances, as Apache Falcon (or similar projects) do. I wonder if this change dilutes that intention. :/ I do recognize that the static state in {{ObjectStore}} makes this problematic. I'm trying to figure out an alternative. Question: If the target metastore instance were accessed through a different classloader, their states would be isolated, right? Would that be an acceptable solution? > Failures in TestHcatClient due to HIVE-16844 > > > Key: HIVE-16908 > URL: https://issues.apache.org/jira/browse/HIVE-16908 > Project: Hive > Issue Type: Bug >Reporter: Sunitha Beeram >Assignee: Sunitha Beeram > Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch > > > Some of the tests in TestHCatClient.java, for ex: > {noformat} > org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema > (batchId=177) > org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema > (batchId=177) > org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation > (batchId=177) > {noformat} > are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new > configuration object is set on the ObjectStore. TestHCatClient fires up a > second instance of metastore thread with a different conf object that results > in the PersistenceMangaerFactory closure and hence tests fail. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16844) Fix Connection leak in ObjectStore when new Conf object is used
[ https://issues.apache.org/jira/browse/HIVE-16844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075583#comment-16075583 ] Mithun Radhakrishnan commented on HIVE-16844: - Sorry to resurrect this discussion. I was pondering over the solution on HIVE-16908, and wondered whether the solution here is complete. Here's the code to {{ObjectStore::setConf()}}:
{code:java|title=ObjectStore.java}
@Override
@SuppressWarnings("nls")
public void setConf(Configuration conf) {
  // Although an instance of ObjectStore is accessed by one thread, there may
  // be many threads with ObjectStore instances. So the static variables
  // pmf and prop need to be protected with locks.
  pmfPropLock.lock();
  try {
    isInitialized = false;
    hiveConf = conf;
    configureSSL(conf);
    Properties propsFromConf = getDataSourceProps(conf);
    boolean propsChanged = !propsFromConf.equals(prop);
    if (propsChanged) {
      if (pmf != null) {
        clearOutPmfClassLoaderCache(pmf);
        // close the underlying connection pool to avoid leaks
        pmf.close();
      }
      pmf = null;
      prop = null;
    }
    ...
}
{code}
Note that {{pmfPropLock}} is locked before {{pmf.close()}} is called. But this is also the only place where {{pmfPropLock}} is used. So, if another thread is in the middle of accessing {{pmf}}, it is possible that the instance is messed up for that thread. Before this code change, resetting {{pmf}} would not affect any threads with an outstanding reference. > Fix Connection leak in ObjectStore when new Conf object is used > --- > > Key: HIVE-16844 > URL: https://issues.apache.org/jira/browse/HIVE-16844 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Sunitha Beeram >Assignee: Sunitha Beeram > Fix For: 3.0.0 > > Attachments: HIVE-16844.1.patch > > > The code path in ObjectStore.java currently leaks BoneCP (or Hikari) > connection pools when a new configuration object is passed in. The code needs > to ensure that the persistence-factory is closed before it is nullified.
> The relevant code is > [here|https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L290]. > Note that pmf is set to null, but the underlying connection pool is not > closed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
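The race described in the comment above can be sketched in isolation. This is a minimal illustration with hypothetical class names (a toy `Factory` standing in for the JDO PersistenceManagerFactory, not Hive's actual `ObjectStore`), assuming the remedy is to take the same lock on both the reconfiguration path and every use of the shared factory:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal sketch (hypothetical names, not Hive's ObjectStore): if
// reconfiguration closes a shared factory under a lock, every *use* of that
// factory must take the same lock (here as a read lock), or a concurrent
// caller can observe the factory after close() -- the race described above.
public class FactoryHolder {
    /** Toy stand-in for the PersistenceManagerFactory. */
    static class Factory {
        private volatile boolean closed;
        void close() { closed = true; }
        String use() {
            if (closed) throw new IllegalStateException("used after close()");
            return "ok";
        }
    }

    private static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock();
    private static Factory factory = new Factory();

    // Writer path: the only place the factory is closed and replaced.
    public static void reconfigure() {
        LOCK.writeLock().lock();
        try {
            factory.close();       // no reader can be inside use() here
            factory = new Factory();
        } finally {
            LOCK.writeLock().unlock();
        }
    }

    // Reader path: holding the read lock excludes reconfigure(), so the
    // factory cannot be closed while use() is in flight.
    public static String withFactory() {
        LOCK.readLock().lock();
        try {
            return factory.use();
        } finally {
            LOCK.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        reconfigure();
        System.out.println(withFactory()); // ok
    }
}
```

With only the writer path locked — as in the quoted `setConf()` — nothing stops a reader from entering `use()` between `close()` and the reassignment.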
[jira] [Commented] (HIVE-17022) Add mode in lock debug statements
[ https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075566#comment-16075566 ] Mohit Sabharwal commented on HIVE-17022: Test failures are unrelated. > Add mode in lock debug statements > - > > Key: HIVE-17022 > URL: https://issues.apache.org/jira/browse/HIVE-17022 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Trivial > Attachments: HIVE-17022.1.patch, HIVE-17022.patch > > > Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode, > whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful > when debugging. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work
[ https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiyuan Yang updated HIVE-17047: Fix Version/s: (was: 1.2.1) > Allow table property to be populated to jobConf to make > FixedLengthInputFormat work > --- > > Key: HIVE-17047 > URL: https://issues.apache.org/jira/browse/HIVE-17047 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Attachments: HIVE-17047.1.patch > > > To make FixedLengthInputFormat work in Hive, we need a table-specific value for > the configuration "fixedlengthinputformat.record.length". Right now the best > place would be a table property. Unfortunately, table properties are not always > populated to InputFormat configurations because of this in HiveInputFormat: > {code} > PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString()); > if ((part != null) && (part.getTableDesc() != null)) { > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
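A sketch of the propagation the issue asks for, using plain `java.util.Properties` as a stand-in for Hadoop's `JobConf` so the snippet is self-contained (the configuration key is the real Hadoop one; the helper class and method are hypothetical, not the actual HiveInputFormat change):

```java
import java.util.Properties;

// Sketch: copy a table-level property into the job configuration so an
// InputFormat such as FixedLengthInputFormat can read it. The key is the
// real Hadoop key; everything else here is illustrative.
public class TablePropPropagator {
    static final String RECORD_LENGTH_KEY = "fixedlengthinputformat.record.length";

    public static void propagate(Properties tableProps, Properties jobConf) {
        String len = tableProps.getProperty(RECORD_LENGTH_KEY);
        // Only set when the table actually declares a record length;
        // otherwise leave any job-level value untouched.
        if (len != null && !len.isEmpty()) {
            jobConf.setProperty(RECORD_LENGTH_KEY, len);
        }
    }

    public static void main(String[] args) {
        Properties table = new Properties();
        table.setProperty(RECORD_LENGTH_KEY, "100");
        Properties conf = new Properties();
        propagate(table, conf);
        System.out.println(conf.getProperty(RECORD_LENGTH_KEY)); // 100
    }
}
```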
[jira] [Updated] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work
[ https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiyuan Yang updated HIVE-17047: Attachment: HIVE-17047.1.patch > Allow table property to be populated to jobConf to make > FixedLengthInputFormat work > --- > > Key: HIVE-17047 > URL: https://issues.apache.org/jira/browse/HIVE-17047 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Attachments: HIVE-17047.1.patch > > > To make FixedLengthInputFormat work in Hive, we need a table-specific value for > the configuration "fixedlengthinputformat.record.length". Right now the best > place would be a table property. Unfortunately, table properties are not always > populated to InputFormat configurations because of this in HiveInputFormat: > {code} > PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString()); > if ((part != null) && (part.getTableDesc() != null)) { > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work
[ https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiyuan Yang updated HIVE-17047: Status: Patch Available (was: Open) > Allow table property to be populated to jobConf to make > FixedLengthInputFormat work > --- > > Key: HIVE-17047 > URL: https://issues.apache.org/jira/browse/HIVE-17047 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Attachments: HIVE-17047.1.patch > > > To make FixedLengthInputFormat work in Hive, we need a table-specific value for > the configuration "fixedlengthinputformat.record.length". Right now the best > place would be a table property. Unfortunately, table properties are not always > populated to InputFormat configurations because of this in HiveInputFormat: > {code} > PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString()); > if ((part != null) && (part.getTableDesc() != null)) { > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work
[ https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiyuan Yang updated HIVE-17047: Target Version/s: 1.2.1 > Allow table property to be populated to jobConf to make > FixedLengthInputFormat work > --- > > Key: HIVE-17047 > URL: https://issues.apache.org/jira/browse/HIVE-17047 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Attachments: HIVE-17047.1.patch > > > To make FixedLengthInputFormat work in Hive, we need a table-specific value for > the configuration "fixedlengthinputformat.record.length". Right now the best > place would be a table property. Unfortunately, table properties are not always > populated to InputFormat configurations because of this in HiveInputFormat: > {code} > PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString()); > if ((part != null) && (part.getTableDesc() != null)) { > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work
[ https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075556#comment-16075556 ] Zhiyuan Yang commented on HIVE-17047: - It turns out HIVE-15147 accidentally fixed this. HIVE-15147 was for Hive 2.2.0 LLAP but not for earlier versions. Uploading a partial patch from HIVE-15147 for earlier versions. > Allow table property to be populated to jobConf to make > FixedLengthInputFormat work > --- > > Key: HIVE-17047 > URL: https://issues.apache.org/jira/browse/HIVE-17047 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Fix For: 1.2.1 > > > To make FixedLengthInputFormat work in Hive, we need a table-specific value for > the configuration "fixedlengthinputformat.record.length". Right now the best > place would be a table property. Unfortunately, table properties are not always > populated to InputFormat configurations because of this in HiveInputFormat: > {code} > PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString()); > if ((part != null) && (part.getTableDesc() != null)) { > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work
[ https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiyuan Yang reassigned HIVE-17047: --- > Allow table property to be populated to jobConf to make > FixedLengthInputFormat work > --- > > Key: HIVE-17047 > URL: https://issues.apache.org/jira/browse/HIVE-17047 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Fix For: 1.2.1 > > > To make FixedLengthInputFormat work in Hive, we need a table-specific value for > the configuration "fixedlengthinputformat.record.length". Right now the best > place would be a table property. Unfortunately, table properties are not always > populated to InputFormat configurations because of this in HiveInputFormat: > {code} > PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString()); > if ((part != null) && (part.getTableDesc() != null)) { > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17022) Add mode in lock debug statements
[ https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075551#comment-16075551 ] Hive QA commented on HIVE-17022: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875801/HIVE-17022.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10831 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5894/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5894/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5894/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12875801 - PreCommit-HIVE-Build > Add mode in lock debug statements > - > > Key: HIVE-17022 > URL: https://issues.apache.org/jira/browse/HIVE-17022 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Trivial > Attachments: HIVE-17022.1.patch, HIVE-17022.patch > > > Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode, > whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful > when debugging. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
[ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075516#comment-16075516 ] Ashutosh Chauhan commented on HIVE-16996: - Yeah, we should do it in 2 steps: first the HLL UDAF, then the metastore changes later. Also, instead of adding a new UDAF you can overload the existing compute_stats() UDAF so that we can reuse the logic of its other stats. > Add HLL as an alternative to FM sketch to compute stats > --- > > Key: HIVE-16996 > URL: https://issues.apache.org/jira/browse/HIVE-16996 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16966.01.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17001) Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS
[ https://issues.apache.org/jira/browse/HIVE-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075421#comment-16075421 ] Naveen Gangam commented on HIVE-17001: -- [~zsombor.klara] Quick qs on the issue. I am a bit confused between the jira summary and the reproducer. Summary says "insert overwrite" but the reproducer does not use "insert overwrite". So I am wondering if the reproducer is intended to be the same as written. I am not sure if this is a bug. Say, you execute the following INSERT INTO test PARTITION(ds='p1') values ('a'); INSERT INTO test PARTITION(ds='p1') values ('a'); The resultant partition directory should contain 2 data files and a select * on the table should return 2 rows. This is by design. The testcase in this jira is semantically similar to the case above, where you have some existing data in a partition and you are inserting additional data. Would you agree? Normally, step 4 of the reproducer should have deleted the data for the partition, had it existed. But I think it is legal to manage some or all of the partition data externally, as well. Am I making sense? Thanks > Insert overwrite table doesn't clean partition directory on HDFS if partition > is missing from HMS > - > > Key: HIVE-17001 > URL: https://issues.apache.org/jira/browse/HIVE-17001 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Metastore >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Attachments: HIVE-17001.01.patch > > > Insert overwrite table should clear existing data before creating the new > data files. > For a partitioned table we will clean any folder of existing partitions on > HDFS, however if the partition folder exists only on HDFS and the partition > definition is missing in HMS, the folder is not cleared. > Reproduction steps: > 1. CREATE TABLE test( col1 string) PARTITIONED BY (ds string); > 2. INSERT INTO test PARTITION(ds='p1') values ('a'); > 3. Copy the data to a different folder with different name. 
> 4. ALTER TABLE test DROP PARTITION (ds='p1'); > 5. Recreate the partition directory, copy and rename the data file back > 6. INSERT INTO test PARTITION(ds='p1') values ('b'); > 7. SELECT * from test; > will result in 2 records being returned instead of 1. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17001) Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS
[ https://issues.apache.org/jira/browse/HIVE-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075406#comment-16075406 ] Sergio Peña commented on HIVE-17001: [~zsombor.klara] I didn't understand the test case. {noformat} # One partition dt='p1' with row ("a",1) is added insert into test_part partition(dt = 'p1') values ("a", 1); # Partition metadata is removed only (no data because it is an external table) alter table test_part drop partition (dt='p1'); # Data is moved dfs -mv ${system:test.tmp.dir}/test/dt=p1/00_0 ${system:test.tmp.dir}/test/dt=p1/00_1; # Partition is re-created with dt='p1' with row ("b",2) insert overwrite table test_part partition(dt = 'p1') values ("b", 2); # This is correct, only one row is seen because the row ("a",1) was moved to another location manually. # Where is the issue here? select * from test_part; {noformat} > Insert overwrite table doesn't clean partition directory on HDFS if partition > is missing from HMS > - > > Key: HIVE-17001 > URL: https://issues.apache.org/jira/browse/HIVE-17001 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Metastore >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Attachments: HIVE-17001.01.patch > > > Insert overwrite table should clear existing data before creating the new > data files. > For a partitioned table we will clean any folder of existing partitions on > HDFS, however if the partition folder exists only on HDFS and the partition > definition is missing in HMS, the folder is not cleared. > Reproduction steps: > 1. CREATE TABLE test( col1 string) PARTITIONED BY (ds string); > 2. INSERT INTO test PARTITION(ds='p1') values ('a'); > 3. Copy the data to a different folder with different name. > 4. ALTER TABLE test DROP PARTITION (ds='p1'); > 5. Recreate the partition directory, copy and rename the data file back > 6. INSERT INTO test PARTITION(ds='p1') values ('b'); > 7. 
SELECT * from test; > will result in 2 records being returned instead of 1. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
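The "clear before write" behavior the issue asks for can be sketched with `java.nio.file` standing in for HDFS so the snippet is runnable (the real fix would use Hadoop's `FileSystem` API, e.g. `fs.exists`/`fs.delete`; this helper is purely illustrative):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch: before an INSERT OVERWRITE writes new files, remove the target
// partition directory if it exists on the filesystem -- regardless of
// whether the metastore knows about the partition. (java.nio.file stands in
// for HDFS here; the real code would use Hadoop's FileSystem API.)
public class OverwriteHelper {
    public static void clearPartitionDir(Path partitionDir) throws IOException {
        if (!Files.exists(partitionDir)) {
            return; // nothing to clean
        }
        try (Stream<Path> walk = Files.walk(partitionDir)) {
            // Delete children before parents.
            walk.sorted(Comparator.reverseOrder())
                .forEach(p -> {
                    try { Files.delete(p); }
                    catch (IOException e) { throw new UncheckedIOException(e); }
                });
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("ds_p1");
        Files.write(dir.resolve("data_0"), new byte[]{'a'});
        clearPartitionDir(dir);
        System.out.println(Files.exists(dir)); // false
    }
}
```

The key point, matching the reproduction steps above: the check is against the filesystem, not the HMS partition list, so a directory recreated behind the metastore's back is still cleared.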
[jira] [Updated] (HIVE-16974) Change the sort key for the schema tool validator to be
[ https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-16974: - Status: Patch Available (was: Open) > Change the sort key for the schema tool validator to be > > > Key: HIVE-16974 > URL: https://issues.apache.org/jira/browse/HIVE-16974 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-16974.patch, HIVE-16974.patch > > > In HIVE-16729, we introduced ordering of results/failures returned by > schematool's validators. This allows fault injection testing to expect > results that can be verified. However, they were sorted on NAME values which > in the HMS schema can be NULL. So if the introduced fault has a NULL/BLANK > name column value, the result could be different depending on the backend > database (if it sorts NULLs first or last). > So I think it is better to sort on a non-null column value. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16974) Change the sort key for the schema tool validator to be
[ https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-16974: - Attachment: HIVE-16974.patch > Change the sort key for the schema tool validator to be > > > Key: HIVE-16974 > URL: https://issues.apache.org/jira/browse/HIVE-16974 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-16974.patch, HIVE-16974.patch > > > In HIVE-16729, we introduced ordering of results/failures returned by > schematool's validators. This allows fault injection testing to expect > results that can be verified. However, they were sorted on NAME values which > in the HMS schema can be NULL. So if the introduced fault has a NULL/BLANK > name column value, the result could be different depending on the backend > database (if it sorts NULLs first or last). > So I think it is better to sort on a non-null column value. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16974) Change the sort key for the schema tool validator to be
[ https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-16974: - Status: Open (was: Patch Available) The pre-commits haven't been kicked off for some reason. Will re-attach the patch. > Change the sort key for the schema tool validator to be > > > Key: HIVE-16974 > URL: https://issues.apache.org/jira/browse/HIVE-16974 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-16974.patch, HIVE-16974.patch > > > In HIVE-16729, we introduced ordering of results/failures returned by > schematool's validators. This allows fault injection testing to expect > results that can be verified. However, they were sorted on NAME values which > in the HMS schema can be NULL. So if the introduced fault has a NULL/BLANK > name column value, the result could be different depending on the backend > database (if it sorts NULLs first or last). > So I think it is better to sort on a non-null column value. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table
[ https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16832: -- Attachment: HIVE-16832.17.patch > duplicate ROW__ID possible in multi insert into transactional table > --- > > Key: HIVE-16832 > URL: https://issues.apache.org/jira/browse/HIVE-16832 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16832.01.patch, HIVE-16832.03.patch, > HIVE-16832.04.patch, HIVE-16832.05.patch, HIVE-16832.06.patch, > HIVE-16832.08.patch, HIVE-16832.09.patch, HIVE-16832.10.patch, > HIVE-16832.11.patch, HIVE-16832.14.patch, HIVE-16832.15.patch, > HIVE-16832.16.patch, HIVE-16832.17.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism
[ https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075268#comment-16075268 ] Chao Sun commented on HIVE-17010: - bq. We use Long type and it overflows when the data is too big. I don't understand. How could it overflow with the long type? How large is the dataset you used for testing? > Fix the overflow problem of Long type in SetSparkReducerParallelism > --- > > Key: HIVE-17010 > URL: https://issues.apache.org/jira/browse/HIVE-17010 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-17010.1.patch > > > We use > [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129] > to collect the numberOfBytes of the siblings of a specified RS. We use the Long type, > and it overflows when the data is too big. When this > happens, the parallelism is decided by > [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]. > If spark.dynamic.allocation.enabled is true, sparkMemoryAndCores.getSecond > is a dynamic value which is decided by the Spark runtime. For example, the value > of sparkMemoryAndCores.getSecond is 5 or 15 randomly. There is a possibility > that the value may be 1. The main problem here is the overflow of the addition of > Long values. You can reproduce the overflow problem with the following code > {code} > public static void main(String[] args) { > long a1= 9223372036854775807L; > long a2=1022672; > long res = a1+a2; > System.out.println(res); //-9223372036853753137 > BigInteger b1= BigInteger.valueOf(a1); > BigInteger b2 = BigInteger.valueOf(a2); > BigInteger bigRes = b1.add(b2); > System.out.println(bigRes); //9223372036855798479 > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
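One overflow-safe variant of the accumulation, sketched with `Math.addExact` and saturation at `Long.MAX_VALUE`. This is an assumption about how the fix could look — the issue description above suggests `BigInteger` as another option — not the actual HIVE-17010 patch:

```java
// Sketch of an overflow-safe accumulation for byte counts (one possible fix;
// the actual patch may instead use BigInteger as the issue suggests).
public class SaturatingSum {
    /** Adds two byte counts, saturating at Long.MAX_VALUE on overflow. */
    public static long addSaturating(long a, long b) {
        try {
            return Math.addExact(a, b);      // throws on signed overflow
        } catch (ArithmeticException overflow) {
            return Long.MAX_VALUE;           // "too big to represent" is still "huge"
        }
    }

    public static void main(String[] args) {
        // The exact values from the reproduction in the issue description:
        System.out.println(addSaturating(9223372036854775807L, 1022672L)); // 9223372036854775807
        System.out.println(addSaturating(1L, 2L)); // 3
    }
}
```

For deciding reducer parallelism, saturation is sufficient: once the total exceeds `Long.MAX_VALUE` bytes, any larger value would pick the same (maximum) parallelism anyway.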
[jira] [Comment Edited] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844
[ https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075241#comment-16075241 ] Mithun Radhakrishnan edited comment on HIVE-16908 at 7/5/17 6:53 PM: - I'll need a little time to review. On the face of it, this change is disconcerting, since it looks like this changes the intention of the tests added in HIVE-7341. :/ Let me take a closer look. was (Author: mithun): I'll need a little time to review. On the face of it, this change is disconcerting, since it looks like changes the intention of the tests added in HIVE-7341. :/ Let me take a closer look. > Failures in TestHcatClient due to HIVE-16844 > > > Key: HIVE-16908 > URL: https://issues.apache.org/jira/browse/HIVE-16908 > Project: Hive > Issue Type: Bug >Reporter: Sunitha Beeram >Assignee: Sunitha Beeram > Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch > > > Some of the tests in TestHCatClient.java, for ex: > {noformat} > org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema > (batchId=177) > org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema > (batchId=177) > org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation > (batchId=177) > {noformat} > are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new > configuration object is set on the ObjectStore. TestHCatClient fires up a > second metastore thread with a different conf object, which results > in the PersistenceManagerFactory closure, and hence the tests fail. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844
[ https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075241#comment-16075241 ] Mithun Radhakrishnan commented on HIVE-16908: - I'll need a little time to review. On the face of it, this change is disconcerting, since it looks like it changes the intention of the tests added in HIVE-7341. :/ Let me take a closer look. > Failures in TestHcatClient due to HIVE-16844 > > > Key: HIVE-16908 > URL: https://issues.apache.org/jira/browse/HIVE-16908 > Project: Hive > Issue Type: Bug >Reporter: Sunitha Beeram >Assignee: Sunitha Beeram > Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch > > > Some of the tests in TestHCatClient.java, for ex: > {noformat} > org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema > (batchId=177) > org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema > (batchId=177) > org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation > (batchId=177) > {noformat} > are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new > configuration object is set on the ObjectStore. TestHCatClient fires up a > second metastore thread with a different conf object, which results > in the PersistenceManagerFactory closure, and hence the tests fail. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17022) Add mode in lock debug statements
[ https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075209#comment-16075209 ] Naveen Gangam commented on HIVE-17022: -- I only looked at the {{public}} access for the lock() but did not realize it wasn't being called from outside. Makes sense to make it private in this case. Thanks for the changes. The patch looks good to me. +1 pending tests > Add mode in lock debug statements > - > > Key: HIVE-17022 > URL: https://issues.apache.org/jira/browse/HIVE-17022 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Trivial > Attachments: HIVE-17022.1.patch, HIVE-17022.patch > > > Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode, > whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful > when debugging. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch
[ https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075206#comment-16075206 ] Ashutosh Chauhan commented on HIVE-17020: - [~lirui] If you have a test case for it, can you please share it? It would be good to add it as part of the HIVE-16100 fix. > Aggressive RS dedup can incorrectly remove OP tree branch > - > > Key: HIVE-17020 > URL: https://issues.apache.org/jira/browse/HIVE-17020 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li > > Suppose we have an OP tree like this: > {noformat} > ... > | > RS[1] > | > SEL[2] > /\ > SEL[3] SEL[4] > | | > RS[5] FS[6] > | > ... > {noformat} > When doing aggressive RS dedup, we'll remove all the operators between RS5 > and RS1, and thus the branch containing FS6 is lost. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
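One plausible guard against the bug described above is to refuse dedup whenever an operator between the two ReduceSinks has more than one child. The sketch below uses a toy operator class, not Hive's actual `Operator` hierarchy, and the guard is an assumption about the shape of a fix, not the committed one:

```java
import java.util.ArrayList;
import java.util.List;

// Toy operator tree (not Hive's Operator hierarchy) illustrating a guard:
// RS dedup may only remove the chain between the child RS and the parent RS
// when no intermediate operator has a second child -- otherwise a branch
// (e.g. the FS[6] sink in the issue's diagram) would be silently dropped.
public class DedupGuard {
    static class Op {
        final String name;
        final List<Op> children = new ArrayList<>();
        Op parent;
        Op(String name) { this.name = name; }
        Op addChild(Op c) { children.add(c); c.parent = this; return c; }
    }

    /** Walk up from childRS to parentRS; true iff the chain is branch-free. */
    static boolean safeToDedup(Op childRS, Op parentRS) {
        for (Op cur = childRS.parent; cur != parentRS; cur = cur.parent) {
            if (cur == null) return false;             // parentRS not an ancestor
            if (cur.children.size() > 1) return false; // removal would drop a branch
        }
        return true;
    }

    public static void main(String[] args) {
        // The shape from the description: RS1 -> SEL2 -> {SEL3 -> RS5, SEL4 -> FS6}
        Op rs1 = new Op("RS1");
        Op sel2 = rs1.addChild(new Op("SEL2"));
        Op sel3 = sel2.addChild(new Op("SEL3"));
        Op sel4 = sel2.addChild(new Op("SEL4"));
        Op rs5 = sel3.addChild(new Op("RS5"));
        sel4.addChild(new Op("FS6"));
        System.out.println(safeToDedup(rs5, rs1)); // false: SEL2 branches
    }
}
```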
[jira] [Assigned] (HIVE-16998) Add config to enable HoS DPP only for map-joins
[ https://issues.apache.org/jira/browse/HIVE-16998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-16998: --- Assignee: Janaki Lahorani (was: Sahil Takiar) > Add config to enable HoS DPP only for map-joins > --- > > Key: HIVE-16998 > URL: https://issues.apache.org/jira/browse/HIVE-16998 > Project: Hive > Issue Type: Sub-task > Components: Logical Optimizer, Spark >Reporter: Sahil Takiar >Assignee: Janaki Lahorani > > HoS DPP will split a given operator tree in two under the following > conditions: it has detected that the query can benefit from DPP, and the > filter is not a map-join (see SplitOpTreeForDPP). > This can hurt performance if the non-partitioned side of the join > involves a complex operator tree - e.g. the query {{select count(*) from > srcpart where srcpart.ds in (select max(srcpart.ds) from srcpart union all > select min(srcpart.ds) from srcpart)}} will require running the subquery > twice, once in each Spark job. > Queries with map-joins don't get split into two operator trees and thus don't > suffer from this drawback. Thus, it would be nice to have a config key that > just enables DPP on HoS for map-joins. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17022) Add mode in lock debug statements
[ https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-17022: --- Attachment: HIVE-17022.1.patch > Add mode in lock debug statements > - > > Key: HIVE-17022 > URL: https://issues.apache.org/jira/browse/HIVE-17022 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Trivial > Attachments: HIVE-17022.1.patch, HIVE-17022.patch > > > Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode, > whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful > when debugging. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17022) Add mode in lock debug statements
[ https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075199#comment-16075199 ] Mohit Sabharwal commented on HIVE-17022: Thanks. Note that Logger is from slf4j not log4j, so there is no extra cost of string formatting. But it's probably better to add the conditional for readability. The other lock is really just a helper, so making it private. Printing sorted locks is a great idea. Updating patch. > Add mode in lock debug statements > - > > Key: HIVE-17022 > URL: https://issues.apache.org/jira/browse/HIVE-17022 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Trivial > Attachments: HIVE-17022.1.patch, HIVE-17022.patch > > > Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode, > whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful > when debugging. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
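The pattern discussed in the comment above — guard the debug statement, and log the lock *mode* (SHARED/EXCLUSIVE/SEMI_SHARED) rather than just IMPLICIT/EXPLICIT — sketched with `java.util.logging` for self-containment (the Hive code uses slf4j, whose `{}` parameterized messages already defer string construction; the lock name and helper here are hypothetical):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the logging pattern discussed above, with java.util.logging
// standing in for slf4j: guard the debug statement so the message -- which
// includes the lock mode, not just IMPLICIT/EXPLICIT -- is only built when
// debug-level logging is enabled.
public class LockLogging {
    private static final Logger LOG = Logger.getLogger(LockLogging.class.getName());

    enum Mode { SHARED, EXCLUSIVE, SEMI_SHARED }

    static String describe(String lockName, Mode mode) {
        return "Acquired " + lockName + " in mode " + mode;
    }

    static void logAcquire(String lockName, Mode mode) {
        if (LOG.isLoggable(Level.FINE)) {   // cf. slf4j's isDebugEnabled()
            LOG.fine(describe(lockName, mode));
        }
    }

    public static void main(String[] args) {
        logAcquire("default.tbl", Mode.SHARED);
        System.out.println(describe("default.tbl", Mode.SHARED));
    }
}
```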
[jira] [Updated] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set
[ https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Burkert updated HIVE-16993: --- Attachment: HIVE-17008.7.patch > ThriftHiveMetastore.create_database can fail if the locationUri is not set > -- > > Key: HIVE-16993 > URL: https://issues.apache.org/jira/browse/HIVE-16993 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Dan Burkert >Assignee: Dan Burkert > Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, > HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, > HIVE-16993.5.patch, HIVE-17008.6.patch, HIVE-17008.7.patch > > > Calling > [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078] > with a database with an unset {{locationUri}} field through the C++ > implementation fails with: > {code} > MetaException(message=java.lang.IllegalArgumentException: Can not create a > Path from an empty string) > {code} > The > [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270] > Thrift field is 'default requiredness (implicit)', and Thrift [does not > specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] > whether unset default requiredness fields are encoded. Empirically, the Java > generated code [does not write the > {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942] > when the field is unset, while the C++ generated code > [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890]. 
> The MetaStore treats the field as optional, and [fills in a default > value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871] > if the field is unset. > The end result is that when the C++ implementation sends a {{Database}} > without the field set, it actually writes an empty string, and the MetaStore > treats it as a set field (non-null), and then calls a {{Path}} API which > rejects the empty string. The fix is simple: make the {{locationUri}} field > optional in metastore.thrift. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
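The description points directly at the fix: with default requiredness, Thrift generators disagree on whether an unset field is written, while marking it `optional` makes "unset" explicit on the wire for all generators. A sketch of the change in Thrift IDL follows; the field numbers and neighboring fields shown here are illustrative, not copied verbatim from metastore.thrift.

```thrift
struct Database {
  1: string name,
  2: string description,
  // was "3: string locationUri" (default requiredness): the Java generator
  // skips the field when unset, while the C++ generator writes "".
  3: optional string locationUri,
  4: map<string, string> parameters,
}
```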
[jira] [Updated] (HIVE-16935) Hive should strip comments from input before choosing which CommandProcessor to run.
[ https://issues.apache.org/jira/browse/HIVE-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Sherman updated HIVE-16935: -- Attachment: HIVE-16935.4.patch > Hive should strip comments from input before choosing which CommandProcessor > to run. > > > Key: HIVE-16935 > URL: https://issues.apache.org/jira/browse/HIVE-16935 > Project: Hive > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Andrew Sherman > Attachments: HIVE-16935.1.patch, HIVE-16935.2.patch, > HIVE-16935.3.patch, HIVE-16935.4.patch > > > While using Beeswax, Hue fails to execute a statement with the following error: > Error while compiling statement: FAILED: ParseException line 3:4 missing > KW_ROLE at 'a' near 'a' line 3:5 missing EOF at '=' near 'a' > {quote} > -- comment > SET a=1; > SELECT 1; > {quote} > The same code works in Beeline and in Impala. > The same code fails in CliDriver. > > h2. Background > Hive deals with SQL comments (“-- to end of line”) in different places. > Some clients attempt to strip comments. For example, BeeLine was recently > enhanced in https://issues.apache.org/jira/browse/HIVE-13864 to strip > comments from multi-line commands before they are executed. > Other clients such as Hue or Jdbc do not strip comments before sending text. > Some tests such as TestCliDriver strip comments before running tests. > When Hive gets a command, the CommandProcessorFactory looks at the text to > determine which CommandProcessor should handle the command. In the bug case > the correct CommandProcessor is SetProcessor, but the comments confuse the > CommandProcessorFactory and so the command is treated as SQL. Hive’s SQL > parser understands and ignores comments, but it does not understand the set > commands usually handled by SetProcessor, and so we get the ParseException > shown above. > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
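The fix direction in the description (strip "-- to end of line" comments before the CommandProcessorFactory inspects the text) can be sketched as a standalone helper. This is a simplified illustration, not Hive's actual implementation; it only handles the common case of `--` appearing inside quoted literals.

```java
// Simplified sketch of stripping "-- to end of line" comments before command
// dispatch. Not Hive's actual implementation; it handles the common case of
// "--" appearing inside single- or double-quoted literals, nothing more.
class CommentStripper {
    static String stripComments(String command) {
        StringBuilder out = new StringBuilder();
        boolean first = true;
        for (String line : command.split("\n", -1)) {
            boolean inSingle = false, inDouble = false;
            int cut = line.length();
            for (int i = 0; i < line.length(); i++) {
                char c = line.charAt(i);
                if (c == '\'' && !inDouble) {
                    inSingle = !inSingle;
                } else if (c == '"' && !inSingle) {
                    inDouble = !inDouble;
                } else if (c == '-' && !inSingle && !inDouble
                        && i + 1 < line.length() && line.charAt(i + 1) == '-') {
                    cut = i;                  // comment starts here; drop the rest
                    break;
                }
            }
            if (!first) out.append('\n');
            first = false;
            out.append(line, 0, cut);
        }
        return out.toString();
    }
}
```

With the comment removed from the bug's input, the first meaningful token is `SET`, so the factory would dispatch to SetProcessor instead of handing the text to the SQL parser.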
[jira] [Commented] (HIVE-17022) Add mode in lock debug statements
[ https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075167#comment-16075167 ] Naveen Gangam commented on HIVE-17022: -- Thanks for the explanation. The fix in the patch looks good to me. Just a couple of nits. 1) I think we should make the code above conditional, so it runs only when DEBUG is enabled. So perhaps something like this: {code} if (LOG.isDebugEnabled()) { for (HiveLockObj obj : objs) { LOG.debug("Acquiring lock for {} with mode {}", obj.getObj().getName(), obj.getMode()); } } {code} 2) Is there a reason the above code is not in the {{public List lock(List objs, int numRetriesForLock, long sleepTime)}} method but at a higher level? Would it be better if we log these after the {{sortLocks}} call so we print the sorted list? Thanks > Add mode in lock debug statements > - > > Key: HIVE-17022 > URL: https://issues.apache.org/jira/browse/HIVE-17022 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Trivial > Attachments: HIVE-17022.patch > > > Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode, > whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful > when debugging. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16893) move replication dump related work in semantic analysis phase to execution phase using a task
[ https://issues.apache.org/jira/browse/HIVE-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-16893: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) HIVE-16893.4.patch pushed to master. Thanks Anishek, Sankar! > move replication dump related work in semantic analysis phase to execution > phase using a task > - > > Key: HIVE-16893 > URL: https://issues.apache.org/jira/browse/HIVE-16893 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek > Fix For: 3.0.0 > > Attachments: HIVE-16893.2.patch, HIVE-16893.3.patch, > HIVE-16893.4.patch > > > Since we run into the possibility of creating a large number of tasks during > a replication bootstrap dump: > * we may not be able to hold all of them in memory for really large > databases, which might not hold true once we complete HIVE-16892 > * Also, a compile-time lock is taken such that only one query runs in this > phase, which in the replication bootstrap scenario is going to be a very > long-running task; moving it to the execution phase will limit the lock > period in the compile phase. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17007) NPE introduced by HIVE-16871
[ https://issues.apache.org/jira/browse/HIVE-17007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-17007: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Patch pushed to master. Thanks Sushanth for review! > NPE introduced by HIVE-16871 > > > Key: HIVE-17007 > URL: https://issues.apache.org/jira/browse/HIVE-17007 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 3.0.0 > > Attachments: HIVE-17007.1.patch > > > Stack: > {code} > 2017-06-30T02:39:43,739 ERROR [HiveServer2-Background-Pool: Thread-2873]: > metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(200)) - > MetaException(message:java.lang.NullPointerException) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6066) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3993) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3944) > at sun.reflect.GeneratedMethodAccessor142.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) > at > com.sun.proxy.$Proxy32.alter_table_with_environment_context(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:397) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:325) > at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) > at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown > Source) > at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2306) > at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown > Source) > at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:624) > at > org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3490) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:383) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1905) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1607) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1354) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1123) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242) > at > org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91) > at > org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > at > 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:348) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NullPointerException > at >
[jira] [Commented] (HIVE-17007) NPE introduced by HIVE-16871
[ https://issues.apache.org/jira/browse/HIVE-17007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075136#comment-16075136 ] Sushanth Sowmyan commented on HIVE-17007: - +1 > NPE introduced by HIVE-16871 > > > Key: HIVE-17007 > URL: https://issues.apache.org/jira/browse/HIVE-17007 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-17007.1.patch > > > Stack: > {code} > 2017-06-30T02:39:43,739 ERROR [HiveServer2-Background-Pool: Thread-2873]: > metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(200)) - > MetaException(message:java.lang.NullPointerException) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6066) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3993) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3944) > at sun.reflect.GeneratedMethodAccessor142.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) > at > com.sun.proxy.$Proxy32.alter_table_with_environment_context(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:397) > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:325) > at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) > at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown > Source) > at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2306) > at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown > Source) > at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:624) > at > org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3490) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:383) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1905) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1607) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1354) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1123) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242) > at > org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91) > at > org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > at > org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:348) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.cache.SharedCache.getCachedTableColStats(SharedCache.java:140) > at >
[jira] [Commented] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set
[ https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075137#comment-16075137 ] Hive QA commented on HIVE-16993: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875788/HIVE-17008.6.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5893/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5893/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5893/ Messages: {noformat} This message was trimmed, see log for full details (actual and formal argument lists differ in length) constructor org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map) is not applicable (actual and formal argument lists differ in length) constructor org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database) is not applicable (actual and formal argument lists differ in length) [ERROR] /data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseStoreIntegration.java:[1311,21] no suitable constructor found for Database(java.lang.String,java.lang.String,java.lang.String,java.util.Map ) constructor org.apache.hadoop.hive.metastore.api.Database.Database() is not applicable (actual and formal argument lists differ in length) constructor org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map ) is not applicable (actual and formal argument lists differ in length) constructor org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database) is not applicable (actual and formal argument lists differ in length) [ERROR] 
/data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseImport.java:[277,9] no suitable constructor found for Database(java.lang.String,java.lang.String,java.lang.String,java.util.Map ) constructor org.apache.hadoop.hive.metastore.api.Database.Database() is not applicable (actual and formal argument lists differ in length) constructor org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map ) is not applicable (actual and formal argument lists differ in length) constructor org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database) is not applicable (actual and formal argument lists differ in length) [ERROR] /data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseImport.java:[324,9] no suitable constructor found for Database(java.lang.String,java.lang.String,java.lang.String,java.util.Map ) constructor org.apache.hadoop.hive.metastore.api.Database.Database() is not applicable (actual and formal argument lists differ in length) constructor org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map ) is not applicable (actual and formal argument lists differ in length) constructor org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database) is not applicable (actual and formal argument lists differ in length) [ERROR] /data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseImport.java:[365,9] no suitable constructor found for Database(java.lang.String,java.lang.String,java.lang.String,java.util.Map ) constructor org.apache.hadoop.hive.metastore.api.Database.Database() is not applicable (actual and formal argument lists differ in length) constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map ) is not applicable (actual and formal argument lists differ in length) constructor org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database) is not applicable (actual and formal argument lists differ in length) [ERROR] /data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseImport.java:[433,9] no suitable constructor found for
[jira] [Commented] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
[ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075128#comment-16075128 ] Pengcheng Xiong commented on HIVE-16996: [~ashutoshc] and [~hagleitn], it seems that if we shift from FM to HLL, we will already have lots of plan changes. Let alone the aggregation of partition stats. Do you want to try it in a single step (i.e., replace FM with HLL and do the aggr) or split into 2 steps? Thanks. > Add HLL as an alternative to FM sketch to compute stats > --- > > Key: HIVE-16996 > URL: https://issues.apache.org/jira/browse/HIVE-16996 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16966.01.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
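For context on what the FM-to-HLL switch buys when computing NDV statistics, a minimal HyperLogLog sketch is shown below. This is a textbook toy (murmur-style 64-bit finalizer hash, 2^p registers, basic small-range correction only), not the implementation in the attached patch.

```java
// Toy HyperLogLog for intuition only: not the implementation in the patch.
class TinyHll {
    private final int p;        // precision; m = 2^p registers
    private final int m;
    private final byte[] reg;

    TinyHll(int p) {
        this.p = p;
        this.m = 1 << p;
        this.reg = new byte[m];
    }

    private static long hash64(long x) {      // MurmurHash3 fmix64 finalizer
        x ^= x >>> 33; x *= 0xff51afd7ed558ccdL;
        x ^= x >>> 33; x *= 0xc4ceb9fe1a85ec53L;
        x ^= x >>> 33;
        return x;
    }

    void add(long value) {
        long h = hash64(value);
        int idx = (int) (h >>> (64 - p));      // top p bits choose a register
        long rest = h << p;                    // remaining 64-p bits
        int rank = Math.min(64 - p, Long.numberOfLeadingZeros(rest)) + 1;
        if (rank > reg[idx]) reg[idx] = (byte) rank;
    }

    double estimate() {
        double alpha = 0.7213 / (1 + 1.079 / m);   // constant valid for m >= 128
        double sum = 0;
        int zeros = 0;
        for (byte r : reg) {
            sum += Math.pow(2, -r);
            if (r == 0) zeros++;
        }
        double raw = alpha * m * (double) m / sum;
        if (raw <= 2.5 * m && zeros > 0) {
            return m * Math.log((double) m / zeros);   // small-range correction
        }
        return raw;
    }
}
```

Unlike an FM sketch, the registers merge by a simple per-register max, which is what makes aggregating per-partition stats attractive.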
[jira] [Updated] (HIVE-17045) Add HyperLogLog as an UDAF
[ https://issues.apache.org/jira/browse/HIVE-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-17045: --- Attachment: HIVE-17045.01.patch > Add HyperLogLog as an UDAF > -- > > Key: HIVE-17045 > URL: https://issues.apache.org/jira/browse/HIVE-17045 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-17045.01.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17045) Add HyperLogLog as an UDAF
[ https://issues.apache.org/jira/browse/HIVE-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-17045: --- Status: Patch Available (was: Open) > Add HyperLogLog as an UDAF > -- > > Key: HIVE-17045 > URL: https://issues.apache.org/jira/browse/HIVE-17045 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-17045.01.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17045) Add HyperLogLog as an UDAF
[ https://issues.apache.org/jira/browse/HIVE-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-17045: -- > Add HyperLogLog as an UDAF > -- > > Key: HIVE-17045 > URL: https://issues.apache.org/jira/browse/HIVE-17045 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16961) Hive on Spark leaks spark application in case user cancels query and closes session
[ https://issues.apache.org/jira/browse/HIVE-16961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-16961: --- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master. Thanks to Rui for the review. > Hive on Spark leaks spark application in case user cancels query and closes > session > --- > > Key: HIVE-16961 > URL: https://issues.apache.org/jira/browse/HIVE-16961 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 3.0.0 > > Attachments: HIVE-16961.patch, HIVE-16961.patch > > > It's found that a Spark application is leaked when the user cancels a query and > closes the session while Hive is waiting for the remote driver to connect back. > This is found for asynchronous query execution, but is seemingly equally > applicable to synchronous submission when the session is abruptly closed. The > leaked Spark application that runs the Spark driver connects back to Hive > successfully and runs forever (until HS2 restarts), but receives no job > submission because the session is already closed. Ideally, Hive should > reject the connection from the driver so the driver will exit. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-16962) Better error msg for Hive on Spark in case user cancels query and closes session
[ https://issues.apache.org/jira/browse/HIVE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074240#comment-16074240 ] Xuefu Zhang edited comment on HIVE-16962 at 7/5/17 5:23 PM: I updated the fix version to 3.0.0. There are too many pending releases. It's very confusing. was (Author: xuefuz): @lefty, thanks for pointing it out. I lost track of the releases. I committed it to master and have no plan to commit to other branches. What's the right fix version then? > Better error msg for Hive on Spark in case user cancels query and closes > session > > > Key: HIVE-16962 > URL: https://issues.apache.org/jira/browse/HIVE-16962 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 3.0.0 > > Attachments: HIVE-16962.2.patch, HIVE-16962.patch, HIVE-16962.patch > > > In case the user cancels a query and closes the session, Hive marks the query as > failed. However, the error message is a little confusing. It still says: > {quote} > org.apache.hive.service.cli.HiveSQLException: Error while processing > statement: FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create spark > client. This is likely because the queue you assigned to does not have free > resource at the moment to start the job. Please check your queue usage and > try the query again later. > {quote} > followed by some InterruptedException. > Ideally, the error should clearly indicate that the user canceled the > execution. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16962) Better error msg for Hive on Spark in case user cancels query and closes session
[ https://issues.apache.org/jira/browse/HIVE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-16962: --- Fix Version/s: (was: 2.2.0) 3.0.0 > Better error msg for Hive on Spark in case user cancels query and closes > session > > > Key: HIVE-16962 > URL: https://issues.apache.org/jira/browse/HIVE-16962 > Project: Hive > Issue Type: Improvement > Components: Spark >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Fix For: 3.0.0 > > Attachments: HIVE-16962.2.patch, HIVE-16962.patch, HIVE-16962.patch > > > In case the user cancels a query and closes the session, Hive marks the query as > failed. However, the error message is a little confusing. It still says: > {quote} > org.apache.hive.service.cli.HiveSQLException: Error while processing > statement: FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create spark > client. This is likely because the queue you assigned to does not have free > resource at the moment to start the job. Please check your queue usage and > try the query again later. > {quote} > followed by some InterruptedException. > Ideally, the error should clearly indicate that the user canceled the > execution. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17018) Small table is converted to map join even the total size of small tables exceeds the threshold(hive.auto.convert.join.noconditionaltask.size)
[ https://issues.apache.org/jira/browse/HIVE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075106#comment-16075106 ] Xuefu Zhang commented on HIVE-17018: Ping [~csun] for comments. > Small table is converted to map join even the total size of small tables > exceeds the threshold(hive.auto.convert.join.noconditionaltask.size) > - > > Key: HIVE-17018 > URL: https://issues.apache.org/jira/browse/HIVE-17018 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > > We use "hive.auto.convert.join.noconditionaltask.size" as the threshold: if > the sum of the sizes of n-1 of the tables/partitions in an n-way join is > smaller than it, the join is converted to a map join. For example, take A join B > join C join D join E. The big table is A(100M), the small tables are > B(10M), C(10M), D(10M), E(10M), and we set > hive.auto.convert.join.noconditionaltask.size=20M. In the current code, E, D and B > will be converted to map joins but C will not. In my > understanding, because hive.auto.convert.join.noconditionaltask.size can only > contain E and D, C and B should not be converted to map joins. > Let's explain more about why E can be converted to a map join. > In the current code, > [SparkMapJoinOptimizer#getConnectedMapJoinSize|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L364] > calculates all the mapjoins in the parent path and child path. The search > stops when encountering [UnionOperator or > ReduceOperator|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L381]. > C is not converted to a map join because {{(connectedMapJoinSize + > totalSize) > maxSize}} [see > code|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L330]. The > RS before the join of C remains. 
When calculating whether B will be > converted to a map join, {{getConnectedMapJoinSize}} returns 0 on encountering an > [RS > |https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#409] > which causes {{(connectedMapJoinSize + totalSize) < maxSize}} to match. > [~xuefuz] or [~jxiang]: can you help check whether this is a bug or not, as you > are more familiar with SparkJoinOptimizer? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
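The reporter's expected behavior amounts to a cumulative-size check: admit small tables, in order, only while the running total of already-admitted map-join inputs stays within the threshold. The toy below models just that expectation in plain Java; it is unrelated to Hive's actual SparkMapJoinOptimizer operator-tree traversal. Under the described scenario (threshold 20M, 10M small tables considered in the order E, D, C, B), only E and D fit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model of the reporter's expected threshold behavior, not Hive's code.
class MapJoinBudget {
    static List<String> admit(Map<String, Long> smallTableSizes, long thresholdBytes) {
        List<String> admitted = new ArrayList<>();
        long used = 0;
        for (Map.Entry<String, Long> e : smallTableSizes.entrySet()) {
            // Admit a table only if it fits in the remaining budget.
            if (used + e.getValue() <= thresholdBytes) {
                used += e.getValue();
                admitted.add(e.getKey());
            }
        }
        return admitted;
    }
}
```

The reported bug is that B is also admitted because the traversal loses track of the already-admitted sizes after a ReduceSink, so the running total effectively resets to 0.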
[jira] [Updated] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set
[ https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Burkert updated HIVE-16993: --- Attachment: HIVE-17008.6.patch > ThriftHiveMetastore.create_database can fail if the locationUri is not set > -- > > Key: HIVE-16993 > URL: https://issues.apache.org/jira/browse/HIVE-16993 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Dan Burkert >Assignee: Dan Burkert > Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, > HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, > HIVE-16993.5.patch, HIVE-17008.6.patch > > > Calling > [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078] > with a database with an unset {{locationUri}} field through the C++ > implementation fails with: > {code} > MetaException(message=java.lang.IllegalArgumentException: Can not create a > Path from an empty string) > {code} > The > [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270] > Thrift field is 'default requiredness (implicit)', and Thrift [does not > specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] > whether unset default requiredness fields are encoded. Empirically, the Java > generated code [does not write the > {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942] > when the field is unset, while the C++ generated code > [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890]. 
> The MetaStore treats the field as optional, and [fills in a default > value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871] > if the field is unset. > The end result is that when the C++ implementation sends a {{Database}} > without the field set, it actually writes an empty string, and the MetaStore > treats it as a set field (non-null), and then calls a {{Path}} API which > rejects the empty string. The fix is simple: make the {{locationUri}} field > optional in metastore.thrift. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17008) HiveMetastore.drop_database can return NPE if database does not exist
[ https://issues.apache.org/jira/browse/HIVE-17008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075068#comment-16075068 ] Dan Burkert commented on HIVE-17008: Hi [~mohitsabharwal], here is a better diff of the patch: https://github.com/danburkert/hive/commit/ec115a584c4b2b715f339458afc2bcbf353d1e47?w=1. Since filing this bug / uploading the patch I've found that the HMS can fire event listeners on almost any type of failed DDL operation: drop database, create table, partitions, functions, indices, etc. The patch only fixes the drop database case, but the fix is pretty much the same. It's not clear to me what the designed behavior is, though. Are these just copy/pasted bugs, or is it by design that the HMS notifies listeners for failed DDL operations? > HiveMetastore.drop_database can return NPE if database does not exist > - > > Key: HIVE-17008 > URL: https://issues.apache.org/jira/browse/HIVE-17008 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Dan Burkert >Assignee: Dan Burkert > Attachments: HIVE-17008.0.patch > > > When dropping a non-existent database, the HMS will still fire registered > {{DROP_DATABASE}} event listeners. This results in an NPE when the listeners > attempt to deref the {{null}} database parameter. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
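One way to avoid this class of NPE is to fire listeners only after the lookup and drop actually succeed, so a listener never sees a null database. The sketch below uses made-up listener and event names to illustrate the guard; it is not Hive's MetaStoreEventListener API, and whether failed DDL operations should notify listeners at all is exactly the open question in the comment above.

```java
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical listener interface, made up for illustration.
interface DropListener {
    void onDropDatabase(String dbName);
}

class DropGuardDemo {
    // Fire listeners only after the database was found and removed, so a
    // listener can never dereference a missing (null) database.
    static boolean dropDatabase(Set<String> catalog, String name, List<DropListener> listeners) {
        if (!catalog.remove(name)) {
            throw new NoSuchElementException("database " + name + " does not exist");
        }
        for (DropListener l : listeners) {
            l.onDropDatabase(name);
        }
        return true;
    }
}
```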
[jira] [Commented] (HIVE-16355) Service: embedded mode should only be available if service is loaded onto the classpath
[ https://issues.apache.org/jira/browse/HIVE-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075036#comment-16075036 ] Zoltan Haindrich commented on HIVE-16355: - [~thejas], [~vgumashta], [~hagleitn] could you please take a look? > Service: embedded mode should only be available if service is loaded onto the > classpath > --- > > Key: HIVE-16355 > URL: https://issues.apache.org/jira/browse/HIVE-16355 > Project: Hive > Issue Type: Sub-task > Components: Metastore, Server Infrastructure >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-16355.1.patch, HIVE-16355.2.patch, > HIVE-16355.2.patch, HIVE-16355.3.patch, HIVE-16355.4.patch > > > I would like to relax the hard reference to > {{EmbeddedThriftBinaryCLIService}} to be only used in case {{service}} module > is loaded onto the classpath. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17022) Add mode in lock debug statements
[ https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075030#comment-16075030 ] Mohit Sabharwal commented on HIVE-17022: For a table with a large number of partitions, it will indeed print a lot of log statements. But these are only emitted in debug mode, when you are debugging and can use this info. For a complex query involving many write entities, it is otherwise hard to tell which lock is being taken for which entity. Earlier, we had these statements at INFO, which we changed to DEBUG in HIVE-12966 to avoid noise. We already have the debug statement for ZooKeeperHiveLockManager, but did not print the actual lock mode in that statement, so I added that to this patch. For EmbeddedLockManager, which is useful when debugging locally, we had no debug statements whatsoever, so I added that to this patch. > Add mode in lock debug statements > - > > Key: HIVE-17022 > URL: https://issues.apache.org/jira/browse/HIVE-17022 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Trivial > Attachments: HIVE-17022.patch > > > Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode, > whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful > when debugging. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17022) Add mode in lock debug statements
[ https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075017#comment-16075017 ] Naveen Gangam commented on HIVE-17022: -- [~mohitsabharwal] I am a bit worried that the following code generates a lot of noise in the logs, given the frequency at which the lock() method is called and the number of locks that can be held at any particular point in time. Have you seen otherwise?
{code}
for (HiveLockObj obj : objs) {
  LOG.debug("Acquiring lock for {} with mode {}", obj.getObj().getName(), obj.getMode());
}
{code}
Thanks > Add mode in lock debug statements > - > > Key: HIVE-17022 > URL: https://issues.apache.org/jira/browse/HIVE-17022 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Trivial > Attachments: HIVE-17022.patch > > > Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode, > whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful > when debugging. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
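The usual mitigation for the noise concern above can be sketched as follows. This is a toy model, not the actual patch: gate the whole per-lock loop on the debug flag, so when debug logging is off no per-lock message is ever built, however many locks are held.

```java
import java.util.Arrays;
import java.util.List;

public class LockDebugLogging {
    // Counts messages actually built, to make the gating observable.
    static int messagesBuilt = 0;

    static void logAcquisitions(List<String[]> locks, boolean debugEnabled) {
        if (!debugEnabled) {
            return; // debug off: skip the loop entirely, zero cost per lock
        }
        for (String[] lock : locks) {
            messagesBuilt++;
            System.out.println("Acquiring lock for " + lock[0]
                    + " with mode " + lock[1]);
        }
    }

    public static void main(String[] args) {
        List<String[]> locks = Arrays.asList(
                new String[]{"default.tbl/p=1", "SHARED"},
                new String[]{"default.tbl/p=2", "EXCLUSIVE"});
        logAcquisitions(locks, false); // no output, nothing built
        logAcquisitions(locks, true);  // one line per lock
    }
}
```

SLF4J's parameterized {{LOG.debug("... {}", arg)}} already defers string formatting, so the residual cost with debug off is just iterating the lock list; an explicit {{isDebugEnabled()}}-style guard removes even that.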
[jira] [Commented] (HIVE-17008) HiveMetastore.drop_database can return NPE if database does not exist
[ https://issues.apache.org/jira/browse/HIVE-17008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074983#comment-16074983 ] Mohit Sabharwal commented on HIVE-17008: [~dan_impala_9180], the changes aren't clear to me; they look like indentation-only changes. Could you add a review board link? Also, could you add the NPE stacktrace you are seeing to the description? You also need a unit test here, perhaps in TestDbNotificationListener > HiveMetastore.drop_database can return NPE if database does not exist > - > > Key: HIVE-17008 > URL: https://issues.apache.org/jira/browse/HIVE-17008 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Dan Burkert >Assignee: Dan Burkert > Attachments: HIVE-17008.0.patch > > > When dropping a non-existent database, the HMS will still fire registered > {{DROP_DATABASE}} event listeners. This results in an NPE when the listeners > attempt to deref the {{null}} database parameter. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-16883) HBaseStorageHandler Ignores Case for HBase Table Name
[ https://issues.apache.org/jira/browse/HIVE-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li reassigned HIVE-16883: -- Assignee: Bing Li > HBaseStorageHandler Ignores Case for HBase Table Name > - > > Key: HIVE-16883 > URL: https://issues.apache.org/jira/browse/HIVE-16883 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 1.2.1 > Environment: Hortonworks HDP 2.6.0.3, CentOS 7.0, VMWare ESXI >Reporter: Shawn Weeks >Assignee: Bing Li >Priority: Minor > > Currently the HBaseStorageHandler is lower casing the HBase Table name. This > prevents use of the storage handler with existing HBase tables that are not > all lower case. Looking at the source this was done intentionally but I > haven't found any documentation about why on the wiki. To prevent a change in > the default behavior I'd suggest adding an additional property to the serde. > {code} > create 'TestTable', 'd' > create external table `TestTable` ( > id bigint, > hash String, > location String, > name String > ) > stored by "org.apache.hadoop.hive.hbase.HBaseStorageHandler" > with serdeproperties ( > "hbase.columns.mapping" = ":key,d:hash,d:location,d:name", > "hbase.table.name" = "TestTable" > ); > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
[ https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li reassigned HIVE-16922: -- Assignee: Bing Li > Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim" > --- > > Key: HIVE-16922 > URL: https://issues.apache.org/jira/browse/HIVE-16922 > Project: Hive > Issue Type: Bug > Components: Thrift API >Reporter: Dudu Markovitz >Assignee: Bing Li > > https://github.com/apache/hive/blob/master/serde/if/serde.thrift > Typo in serde.thrift: > COLLECTION_DELIM = "colelction.delim" > (*colelction* instead of *collection*) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-1938) Cost Based Query optimization for Joins in Hive
[ https://issues.apache.org/jira/browse/HIVE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo resolved HIVE-1938. --- Resolution: Duplicate > Cost Based Query optimization for Joins in Hive > --- > > Key: HIVE-1938 > URL: https://issues.apache.org/jira/browse/HIVE-1938 > Project: Hive > Issue Type: Improvement > Components: Query Processor, Statistics > Environment: *nix,java >Reporter: bharath v >Assignee: bharath v > > Current optimization in Hive is just rule-based and involves applying a set > of rules on the Plan tree. This depends on hints given by the user (which may > or may-not be correct) and might result in execution of costlier plans. So > this jira aims at building a cost-model which can give a good estimate of > various plans beforehand (using some meta-data already collected) and we can > choose the best plan which incurs the least cost. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-33) [Hive]: Add optimizer statistics in Hive
[ https://issues.apache.org/jira/browse/HIVE-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo resolved HIVE-33. - Resolution: Duplicate > [Hive]: Add optimizer statistics in Hive > > > Key: HIVE-33 > URL: https://issues.apache.org/jira/browse/HIVE-33 > Project: Hive > Issue Type: New Feature > Components: Query Processor, Statistics >Reporter: Ashish Thusoo > Labels: statistics > > Add commands to collect partition and column level statistics in hive. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17022) Add mode in lock debug statements
[ https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074828#comment-16074828 ] Mohit Sabharwal commented on HIVE-17022: Test failures are unrelated. > Add mode in lock debug statements > - > > Key: HIVE-17022 > URL: https://issues.apache.org/jira/browse/HIVE-17022 > Project: Hive > Issue Type: Improvement > Components: Locking >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Trivial > Attachments: HIVE-17022.patch > > > Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode, > whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful > when debugging. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-787) Hive Freeway - support near-realtime data processing
[ https://issues.apache.org/jira/browse/HIVE-787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo resolved HIVE-787. -- Resolution: Won't Fix The rise of n million stream processing solutions makes it unlikely anyone would attempt to implement this directly. It looks like people are using calcite in real time platforms like Samza so in effect I would say this was done in another way. Reopen if you feel differently. > Hive Freeway - support near-realtime data processing > > > Key: HIVE-787 > URL: https://issues.apache.org/jira/browse/HIVE-787 > Project: Hive > Issue Type: New Feature >Reporter: Zheng Shao > > Most people are using Hive for daily (or at most hourly) data processing. > We want to explore what are the obstacles for using Hive for 15 minutes, 5 > minutes or even 1 minute data processing intervals, and remove these > obstacles. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17026) HPL/SQL Example for hplsql.conn.hiveconn doesn't work on master
[ https://issues.apache.org/jira/browse/HIVE-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074762#comment-16074762 ] Fei Hui commented on HIVE-17026: Hi [~cartershanklin], I checked the source code; Hive {{Version = 3.0.0}} only provides the hive2 driver, hence the error 'java.sql.SQLException: No suitable driver found for jdbc:hive://' > HPL/SQL Example for hplsql.conn.hiveconn doesn't work on master > --- > > Key: HIVE-17026 > URL: https://issues.apache.org/jira/browse/HIVE-17026 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin >Assignee: Fei Hui > > This bug is part of a series of issues and surprising behavior I encountered > writing a reporting script that would aggregate values and give rows > different classifications based on the aggregate. Addressing some or all > of these issues would make HPL/SQL more accessible to newcomers. > The docs at http://www.hplsql.org/configuration#hplsqlconnhiveconn state: > > hplsql.conn.hiveconn > org.apache.hadoop.hive.jdbc.HiveDriver;jdbc:hive:// > > If you use that on current master you get: > java.sql.SQLException: No suitable driver found for jdbc:hive:// > If you use hive2 it works fine. It's not clear to me if that's a change from > prior versions or not. > Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
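The failure is ordinary JDBC driver lookup: {{DriverManager}} rejects any URL whose scheme no registered driver accepts, which can be reproduced with the bare JDK and no Hive jars at all:

```java
import java.sql.DriverManager;
import java.sql.SQLException;

public class DriverLookup {
    public static void main(String[] args) {
        // With no driver on the classpath accepting the legacy jdbc:hive://
        // scheme, DriverManager fails exactly as reported above.
        try {
            DriverManager.getConnection("jdbc:hive://");
        } catch (SQLException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the hive-jdbc jar on the classpath, the bundled {{org.apache.hive.jdbc.HiveDriver}} registers for the {{jdbc:hive2://}} scheme, which is why the hive2 form of the URL works.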
[jira] [Assigned] (HIVE-17026) HPL/SQL Example for hplsql.conn.hiveconn doesn't work on master
[ https://issues.apache.org/jira/browse/HIVE-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui reassigned HIVE-17026: -- Assignee: Fei Hui > HPL/SQL Example for hplsql.conn.hiveconn doesn't work on master > --- > > Key: HIVE-17026 > URL: https://issues.apache.org/jira/browse/HIVE-17026 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin >Assignee: Fei Hui > > This bug is part of a series of issues and surprising behavior I encountered > writing a reporting script that would aggregate values and give rows > different classifications based on an the aggregate. Addressing some or all > of these issues would make HPL/SQL more accessible to newcomers. > The docs at http://www.hplsql.org/configuration#hplsqlconnhiveconn state: > > hplsql.conn.hiveconn > org.apache.hadoop.hive.jdbc.HiveDriver;jdbc:hive:// > > If you use that on current master you get: > java.sql.SQLException: No suitable driver found for jdbc:hive:// > If you use hive2 it works fine. It's not clear to me if that's a change from > prior versions or not. > Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17013) Delete request with a subquery based on select over a view
[ https://issues.apache.org/jira/browse/HIVE-17013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074743#comment-16074743 ] Frédéric ESCANDELL commented on HIVE-17013: --- I would like to complete the information given in the description of the ticket: I'm using Hive 1.2.1000.2.6.0.3-8. I think this bug could come from the patch of this ticket https://issues.apache.org/jira/browse/HIVE-15970 and more particularly the snippet of code below: throw new IllegalStateException("Expected '" + getMatchedText(curNode) + "' to be in sub-query or set operation."); [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Context.java]
{code:java}
/**
 * The suffix is always relative to a given ASTNode
 */
public DestClausePrefix getDestNamePrefix(ASTNode curNode) {
  assert curNode != null : "must supply curNode";
  if(curNode.getType() != HiveParser.TOK_INSERT_INTO) {
    //select statement
    assert curNode.getType() == HiveParser.TOK_DESTINATION;
    if(operation == Operation.OTHER) {
      //not an 'interesting' op
      return DestClausePrefix.INSERT;
    }
    //if it is an 'interesting' op but it's a select it must be a sub-query or a derived table
    //it doesn't require a special Acid code path - the reset of the code here is to ensure
    //the tree structure is what we expect
    boolean thisIsInASubquery = false;
    parentLoop: while(curNode.getParent() != null) {
      curNode = (ASTNode) curNode.getParent();
      switch (curNode.getType()) {
        case HiveParser.TOK_SUBQUERY_EXPR: //this is a real subquery (foo IN (select ...))
        case HiveParser.TOK_SUBQUERY: //this is a Derived Table Select * from (select a from ...))
        //strictly speaking SetOps should have a TOK_SUBQUERY parent so next 6 items are redundant
        case HiveParser.TOK_UNIONALL:
        case HiveParser.TOK_UNIONDISTINCT:
        case HiveParser.TOK_EXCEPTALL:
        case HiveParser.TOK_EXCEPTDISTINCT:
        case HiveParser.TOK_INTERSECTALL:
        case HiveParser.TOK_INTERSECTDISTINCT:
          thisIsInASubquery = true;
          break parentLoop;
      }
    }
    if(!thisIsInASubquery) {
      throw new IllegalStateException("Expected '" + getMatchedText(curNode) +
        "' to be in sub-query or set operation.");
    }
    return DestClausePrefix.INSERT;
  }
{code}
> Delete request with a subquery based on select over a view > -- > > Key: HIVE-17013 > URL: https://issues.apache.org/jira/browse/HIVE-17013 > Project: Hive > Issue Type: Bug >Reporter: Frédéric ESCANDELL >Priority: Blocker > > Hi, > I based my DDL on this exemple > https://fr.hortonworks.com/tutorial/using-hive-acid-transactions-to-insert-update-and-delete-data/. > In a delete request, the use of a view in a subquery throw an exception : > FAILED: IllegalStateException Expected 'insert into table default.mydim > select ROW__ID from default.mydim sort by ROW__ID' to be in sub-query or set > operation. > {code} > {code:sql} > drop table if exists mydim; > create table mydim (key int, name string, zip string, is_current boolean) > clustered by(key) into 3 buckets > stored as orc tblproperties ('transactional'='true'); > insert into mydim values > (1, 'bob', '95136', true), > (2, 'joe', '70068', true), > (3, 'steve', '22150', true); > drop table if exists updates_staging_table; > create table updates_staging_table (key int, newzip string); > insert into updates_staging_table values (1, 87102), (3, 45220); > drop view if exists updates_staging_view; > create view updates_staging_view (key, newzip) as select key, newzip from > updates_staging_table; > delete from mydim > where mydim.key in (select key from updates_staging_view); > FAILED: IllegalStateException Expected 'insert into table default.mydim > select ROW__ID from default.mydim sort by ROW__ID' to be in sub-query or set > operation. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17013) Delete request with a subquery based on select over a view
[ https://issues.apache.org/jira/browse/HIVE-17013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074743#comment-16074743 ] Frédéric ESCANDELL edited comment on HIVE-17013 at 7/5/17 1:24 PM: --- I would like to complete the information given in the description of the ticket: I'm using Hive 1.2.1000.2.6.0.3-8. I think this bug could come from the patch of this ticket https://issues.apache.org/jira/browse/HIVE-15970 and more particularly the snippet of code below:
{code:java}
throw new IllegalStateException("Expected '" + getMatchedText(curNode) + "' to be in sub-query or set operation.");
{code}
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Context.java]
{code:java}
/**
 * The suffix is always relative to a given ASTNode
 */
public DestClausePrefix getDestNamePrefix(ASTNode curNode) {
  assert curNode != null : "must supply curNode";
  if(curNode.getType() != HiveParser.TOK_INSERT_INTO) {
    //select statement
    assert curNode.getType() == HiveParser.TOK_DESTINATION;
    if(operation == Operation.OTHER) {
      //not an 'interesting' op
      return DestClausePrefix.INSERT;
    }
    //if it is an 'interesting' op but it's a select it must be a sub-query or a derived table
    //it doesn't require a special Acid code path - the reset of the code here is to ensure
    //the tree structure is what we expect
    boolean thisIsInASubquery = false;
    parentLoop: while(curNode.getParent() != null) {
      curNode = (ASTNode) curNode.getParent();
      switch (curNode.getType()) {
        case HiveParser.TOK_SUBQUERY_EXPR: //this is a real subquery (foo IN (select ...))
        case HiveParser.TOK_SUBQUERY: //this is a Derived Table Select * from (select a from ...))
        //strictly speaking SetOps should have a TOK_SUBQUERY parent so next 6 items are redundant
        case HiveParser.TOK_UNIONALL:
        case HiveParser.TOK_UNIONDISTINCT:
        case HiveParser.TOK_EXCEPTALL:
        case HiveParser.TOK_EXCEPTDISTINCT:
        case HiveParser.TOK_INTERSECTALL:
        case HiveParser.TOK_INTERSECTDISTINCT:
          thisIsInASubquery = true;
          break parentLoop;
      }
    }
    if(!thisIsInASubquery) {
      throw new IllegalStateException("Expected '" + getMatchedText(curNode) +
        "' to be in sub-query or set operation.");
    }
    return DestClausePrefix.INSERT;
  }
{code}
was (Author: fescandell): I would like to complete information given in the description of the ticket : I'm using Hive 1.2.1000.2.6.0.3-8. I think this bug could come from the patch of this ticket https://issues.apache.org/jira/browse/HIVE-15970 et more particulary the snippet of code below : throw new IllegalStateException("Expected '" + getMatchedText(curNode) + "' to be in sub-query or set operation."); [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Context.java] {code:java} /** * The suffix is always relative to a given ASTNode */ public DestClausePrefix getDestNamePrefix(ASTNode curNode) { assert curNode != null : "must supply curNode"; if(curNode.getType() != HiveParser.TOK_INSERT_INTO) { //select statement assert curNode.getType() == HiveParser.TOK_DESTINATION; if(operation == Operation.OTHER) { //not an 'interesting' op return DestClausePrefix.INSERT; } //if it is an 'interesting' op but it's a select it must be a sub-query or a derived table //it doesn't require a special Acid code path - the reset of the code here is to ensure //the tree structure is what we expect boolean thisIsInASubquery = false; parentLoop: while(curNode.getParent() != null) { curNode = (ASTNode) curNode.getParent(); switch (curNode.getType()) { case HiveParser.TOK_SUBQUERY_EXPR: //this is a real subquery (foo IN (select ...)) case HiveParser.TOK_SUBQUERY: //this is a Derived Table Select * from (select a from ...)) //strictly speaking SetOps should have a TOK_SUBQUERY parent so next 6 items are redundant case HiveParser.TOK_UNIONALL: case HiveParser.TOK_UNIONDISTINCT: case HiveParser.TOK_EXCEPTALL: case HiveParser.TOK_EXCEPTDISTINCT: case HiveParser.TOK_INTERSECTALL: case HiveParser.TOK_INTERSECTDISTINCT: thisIsInASubquery = true; break parentLoop; } } if(!thisIsInASubquery) { throw new IllegalStateException("Expected '" + getMatchedText(curNode) + "' to be in sub-query or set operation."); } return DestClausePrefix.INSERT; } {code} > Delete request with a subquery based on select over a view > -- > > Key:
[jira] [Updated] (HIVE-17038) invalid result when CAST-ing to DATE
[ https://issues.apache.org/jira/browse/HIVE-17038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Hopper updated HIVE-17038: -- Description: when casting incorrect date literals to DATE data type hive returns wrong values instead of NULL. {code} SELECT CAST('2017-02-31' AS DATE); SELECT CAST('2017-04-31' AS DATE); {code} was: when casting incorrect date literals to DATE data type hive returns wrong values instead of NULL. {code} SELECT CAST('2017-02-31' AS DATE); SELECT CAST('2017-05-31' AS DATE); {code} > invalid result when CAST-ing to DATE > > > Key: HIVE-17038 > URL: https://issues.apache.org/jira/browse/HIVE-17038 > Project: Hive > Issue Type: Bug > Components: CLI, Hive >Affects Versions: 1.2.1 >Reporter: Jim Hopper > > when casting incorrect date literals to DATE data type hive returns wrong > values instead of NULL. > {code} > SELECT CAST('2017-02-31' AS DATE); > SELECT CAST('2017-04-31' AS DATE); > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
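The "wrong value" here is presumably a rolled-over date such as 2017-03-03 for '2017-02-31' (an assumption about the cast's internals, not stated in the ticket). Lenient {{java.text.SimpleDateFormat}} parsing, the classic source of this behavior in Java date handling, reproduces the effect, and strict parsing shows the NULL-style rejection the reporter expects:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class LenientDateParse {
    public static void main(String[] args) throws ParseException {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");

        fmt.setLenient(true);               // lenient: out-of-range fields roll over
        Date d = fmt.parse("2017-02-31");   // Feb 31 rolls over to Mar 3
        System.out.println(fmt.format(d));  // 2017-03-03

        fmt.setLenient(false);              // strict: invalid dates are rejected
        try {
            fmt.parse("2017-02-31");
        } catch (ParseException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A fix along these lines would validate (or parse strictly) before producing a date, returning NULL for literals like '2017-02-31' instead of a silently normalized value.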
[jira] [Commented] (HIVE-17037) Extend join algorithm selection to avoid unnecessary input data shuffle
[ https://issues.apache.org/jira/browse/HIVE-17037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074723#comment-16074723 ] Hive QA commented on HIVE-17037: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12875736/HIVE-17037.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 10833 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=237) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=237) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partialsmbjoin] (batchId=28) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_smb_mapjoin_14] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_gby] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_gby] (batchId=157) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_subq_exists] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_subq_not_in] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer2] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer3] (batchId=160) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer6] (batchId=154) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cross_product_check_1] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_partition_pruning] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[jdbc_handler] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mrr] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[special_character_in_tabnames_1] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_exists] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_nested_subquery] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin] (batchId=157) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_null_agg] (batchId=157) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_select] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_views] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[table_access_keys_stats] (batchId=157) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_1] (batchId=158) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union_group_by] (batchId=159) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets4] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_partition_pruning] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=98) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: