[jira] [Commented] (HIVE-13730) hybridgrace_hashjoin_1.q test gets stuck
[ https://issues.apache.org/jira/browse/HIVE-13730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282431#comment-15282431 ]

Wei Zheng commented on HIVE-13730:
----------------------------------

Here's a todo item for after HIVE-13755 is fixed. Right now the memory manager doesn't guarantee that enough memory is allocated for each table in the n-way join case. After fixing that issue, the assert below can be put into HybridHashTableContainer's constructor once the variables have been determined.
{code}
assert writeBufferSize * (numPartitions - numPartitionsSpilledOnCreation) <= memoryThreshold :
    "hive.auto.convert.join.noconditionaltask.size is set too low. It's not enough to " +
    "allocate " + (numPartitions - numPartitionsSpilledOnCreation) + " partitions (each " +
    "of size " + writeBufferSize + ")";
{code}

> hybridgrace_hashjoin_1.q test gets stuck
> ----------------------------------------
>
>                 Key: HIVE-13730
>                 URL: https://issues.apache.org/jira/browse/HIVE-13730
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 2.1.0
>            Reporter: Vikram Dixit K
>            Assignee: Wei Zheng
>            Priority: Blocker
>         Attachments: HIVE-13730.1.patch
>
> I am seeing hybridgrace_hashjoin_1.q getting stuck on master.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
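The proposed invariant can be sketched as a standalone check: the write buffers for all partitions kept in memory must fit within the configured memory threshold. This is an illustrative helper, not Hive's actual code; the class and method names are made up.

```java
// Hypothetical standalone version of the proposed assert: the
// (numPartitions - spilled) in-memory write buffers must fit in the budget.
public class HashTableMemoryCheck {
    // Returns true when the in-memory partitions' write buffers fit.
    public static boolean fitsInMemory(long writeBufferSize, int numPartitions,
                                       int numPartitionsSpilledOnCreation,
                                       long memoryThreshold) {
        long inMemoryPartitions = numPartitions - numPartitionsSpilledOnCreation;
        return writeBufferSize * inMemoryPartitions <= memoryThreshold;
    }

    public static void main(String[] args) {
        // 16 partitions, none spilled, 512 KB write buffers => needs 8 MB.
        System.out.println(fitsInMemory(512 * 1024, 16, 0, 8L * 1024 * 1024)); // true
        System.out.println(fitsInMemory(512 * 1024, 16, 0, 4L * 1024 * 1024)); // false
    }
}
```

Until the memory manager guarantees a per-table allocation in the n-way case, this condition can legitimately fail, which is why the comment defers the assert to after HIVE-13755.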
[jira] [Updated] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size
[ https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-13751:
-----------------------------------------
    Attachment: HIVE-13751.2.patch

[~jdere] Addressed your review comments in this patch.

> LlapOutputFormatService should have a configurable send buffer size
> -------------------------------------------------------------------
>
>                 Key: HIVE-13751
>                 URL: https://issues.apache.org/jira/browse/HIVE-13751
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-13751.1.patch, HIVE-13751.2.patch
>
> The Netty channel buffer size is hard-coded to 128KB now. It should be made configurable.
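The shape of the requested change is simple: read the buffer size from configuration, falling back to the current 128KB value. A minimal sketch, with a made-up property name (the actual Hive config key may differ):

```java
// Hypothetical sketch: configurable send buffer size with the current
// hard-coded 128KB as the default. The property name is illustrative only.
import java.util.Properties;

public class SendBufferConfig {
    static final int DEFAULT_SEND_BUFFER_SIZE = 128 * 1024; // current hard-coded value

    public static int sendBufferSize(Properties conf) {
        String v = conf.getProperty("llap.daemon.output.service.send.buffer.size");
        return v == null ? DEFAULT_SEND_BUFFER_SIZE : Integer.parseInt(v);
    }
}
```

The resolved value would then be applied where the Netty channel is set up, rather than a literal constant.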
[jira] [Commented] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282397#comment-15282397 ] Ashutosh Chauhan commented on HIVE-13743: - Thanks [~rajesh.balamohan] for verification. [~spena] can you take a quick look at the patch? > Data move codepath is broken with hive (2.1.0-SNAPSHOT) > --- > > Key: HIVE-13743 > URL: https://issues.apache.org/jira/browse/HIVE-13743 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Ashutosh Chauhan > Attachments: HIVE-13743.patch > > > Data move codepath is broken with hive 2.1.0-SNAPSHOT with hadoop > 2.8.0-snapshot. > {noformat} > Caused by: > org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path > not found: /apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > at > org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getEZForPath(FSDirEncryptionZoneOp.java:178) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:7336) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1973) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1376) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:645) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2339) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2335) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711) > at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2333) > at org.apache.hadoop.ipc.Client.call(Client.java:1448) > at org.apache.hadoop.ipc.Client.call(Client.java:1385) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy30.getEZForPath(Unknown > Source)/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/ > ... > ... > ... > 2016-05-11T09:40:43,760 ERROR [main]: ql.Driver (:()) - FAILED: Execution > Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to > move source > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/.hive-staging_hive_2016-05-11_09-40-42_489_5056654133706433454-1/-ext-10002 > to destination > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > {noformat} > https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2836 > hdfsEncryptionShim.isPathEncrypted(destf) in Hive could end up throwing > FileNotFoundException as the destf is not present yet. This causes moveFile > to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282359#comment-15282359 ]

Naveen Gangam commented on HIVE-13749:
--------------------------------------

I am still analyzing the heap, but it appears they are all stashed away in a HashMap, perhaps in a ThreadLocal. I do not have the allocation stacks for these objects, so I cannot tell what part of the code creates these instances. Just by running a simple query iteratively via beeline, where it connects and disconnects on every iteration, I can observe the leak. I am not sure if that is the same workload as in the environment the heap dump was generated from. I am currently running some tests with a change to remove it from the threadlocal at
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L811 and
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L483

> Memory leak in Hive Metastore
> -----------------------------
>
>                 Key: HIVE-13749
>                 URL: https://issues.apache.org/jira/browse/HIVE-13749
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 1.1.0
>            Reporter: Naveen Gangam
>            Assignee: Naveen Gangam
>         Attachments: Top_Consumers7.html
>
> Looking at a 10GB heap dump, a large number of Configuration objects (> 66k instances) are being retained. These objects along with their retained set occupy about 95% of the heap space. This leads to HMS crashes every few days.
> I will attach an exported snapshot from the Eclipse MAT.
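The leak pattern being tested here is generic: a `ThreadLocal` on a long-lived (often pooled) thread pins its value for the thread's lifetime unless `remove()` is called when the request or connection finishes. An illustrative sketch, not Hive's actual code:

```java
// Illustrative sketch (not HiveMetaStore code): a thread-local holding a
// Configuration-like object. Without remove(), each long-lived thread
// retains its instance indefinitely; remove() drops the reference.
import java.util.HashMap;
import java.util.Map;

public class ThreadLocalLeakSketch {
    static class Config { final Map<String, String> props = new HashMap<>(); }

    private static final ThreadLocal<Config> LOCAL =
        ThreadLocal.withInitial(Config::new);

    public static Config get() { return LOCAL.get(); }

    // The kind of fix under test in the comment: explicitly clear the
    // thread-local when the connection/request is done.
    public static void cleanup() { LOCAL.remove(); }

    public static void main(String[] args) {
        Config before = get();
        cleanup();
        // After remove(), the next get() builds a fresh instance.
        System.out.println(before != get()); // true
    }
}
```

With one such object retained per pooled thread per connection cycle, the > 66k retained Configuration instances in the heap dump are consistent with this failure mode.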
[jira] [Commented] (HIVE-13728) TestHBaseSchemaTool fails on master
[ https://issues.apache.org/jira/browse/HIVE-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282352#comment-15282352 ] Hive QA commented on HIVE-13728: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12803287/HIVE-13728.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 49 failed/errored test(s), 9916 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_interval_2.q-schema_evol_text_nonvec_mapwork_part_all_primitive.q-tez_fsstat.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby2_noskew_multi_distinct.q-vectorization_10.q-list_bucket_dml_2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_grouping_id2.q-vectorization_13.q-auto_sortmerge_join_13.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_transform.q-union_remove_7.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_gby_empty org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby11 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby3_map org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby4_noskew 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_insert_into1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join37 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_unqual2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_test_outer org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt8 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_transform_ppr1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_udf_max org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_nested_udf org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_short_regress org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorWithinDagPriority org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks org.apache.hadoop.hive.metastore.TestHiveMetaStoreStatsMerge.testStatsMerge org.apache.hadoop.hive.metastore.TestMetaStoreEventListenerOnlyOnCommit.testEventStatus org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAddPartitionWithValidPartVal org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithCommas org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithUnicode 
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithValidCharacters org.apache.hadoop.hive.metastore.TestRemoteUGIHiveMetaStoreIpAddress.testIpAddress org.apache.hadoop.hive.ql.exec.tez.TestHostAffinitySplitLocationProvider.testOrcSplitsLocationAffinity org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testShowLocksFilterOptions org.apache.hadoop.hive.ql.security.TestExtendedAcls.org.apache.hadoop.hive.ql.security.TestExtendedAcls org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener.org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener org.apache.hive.hcatalog.api.TestHCatClient.org.apache.hive.hcatalog.api.TestHCatClient org.apache.hive.hcatalog.api.repl.commands.TestCommands.org.apache.hive.hcatalog.api.repl.commands.TestCommands org.apache.hive.service.cli.session.TestHiveSessionImpl.testLeakOperationHandle
[jira] [Updated] (HIVE-13754) Fix resource leak in HiveClientCache
[ https://issues.apache.org/jira/browse/HIVE-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Drome updated HIVE-13754:
-------------------------------
    Attachment: HIVE-13754.patch
                HIVE-13754-branch-1.patch

Attached patches for branch-1 and master.

> Fix resource leak in HiveClientCache
> ------------------------------------
>
>                 Key: HIVE-13754
>                 URL: https://issues.apache.org/jira/browse/HIVE-13754
>             Project: Hive
>          Issue Type: Bug
>          Components: Clients
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>         Attachments: HIVE-13754-branch-1.patch, HIVE-13754.patch
>
> Found that the {{users}} reference count can go into negative values, which prevents {{tearDownIfUnused}} from closing the client connection when called.
> This leads to a build-up of clients which have been evicted from the cache, are no longer in use, but have not been shut down.
> GC will eventually call {{finalize}}, which forcibly closes the connection and cleans up the client, but I have seen as many as several hundred open client connections as a result.
> The main source of this is RetryingMetaStoreClient, which calls {{reconnect}} on acquire, which calls {{close}}. This decrements {{users}} to -1 on the reconnect; acquire then increases it to 0 while using it, and back to -1 when it releases it.
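The bug class is worth spelling out: once a stray close drives the count to -1, the "count == 0" condition that {{tearDownIfUnused}} relies on is never observed again. One way to make the count robust is a release that clamps at zero, sketched below with names borrowed from the description; this is an illustration of the idea, not the actual patch:

```java
// Sketch of the failure mode and a guard (names mirror the JIRA description,
// logic is illustrative): a stray close() from reconnect() must not drive
// the reference count below zero, or tearDownIfUnused() never fires.
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountedClient {
    private final AtomicInteger users = new AtomicInteger(0);
    private boolean closed = false;

    public void acquire() { users.incrementAndGet(); }

    // Guarded release: clamps at zero so an unmatched release cannot
    // poison later tearDownIfUnused() calls.
    public void release() { users.updateAndGet(u -> Math.max(0, u - 1)); }

    // Closes the client only when no user holds a reference.
    public boolean tearDownIfUnused() {
        if (users.get() == 0 && !closed) { closed = true; }
        return closed;
    }

    public int users() { return users.get(); }
}
```

With the clamp, the sequence from the description (reconnect's close, then acquire, then release) ends at a count of 0 instead of -1, and teardown proceeds.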
[jira] [Comment Edited] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282303#comment-15282303 ]

Thejas M Nair edited comment on HIVE-13708 at 5/13/16 12:59 AM:
----------------------------------------------------------------

This change in the current patch to CSVSerde breaks backward compatibility for anyone who had a scripted create table command with a non-string column. Those statements would fail now.
If we consider CSVSerde in isolation, the best thing to do about it is to address HIVE-13709, i.e. support other types as supported by LazySimpleSerde. That would lead to correct results and also be backward compatible.
Regarding the generic change applicable to any such serde - it is a difficult choice between allowing logically incorrect results and backward compatibility. I think if we also make the changes in HIVE-13709, only users who use a custom serde with the same limitations (but without error checks) and also use unsupported types for that serde would be affected. That set is likely to be very small. I would vote for making this incompatible change and fixing the logical correctness issue.

was (Author: thejas):
This change to CSVSerde breaks backward compatibility for anyone who had a scripted create table command with a non-string column. Those statements would fail now.
If we consider CSVSerde in isolation, the best thing to do about it is to address HIVE-13709, i.e. support other types as supported by LazySimpleSerde. That would lead to correct results and also be backward compatible.
Regarding the generic change applicable to any such serde - it is a difficult choice between allowing logically incorrect results and backward compatibility. I think if we also make the changes in HIVE-13709, only users who use a custom serde with the same limitations (but without error checks) and also use unsupported types for that serde would be affected. That set is likely to be very small. I would vote for making this incompatible change and fixing the logical correctness issue.
> Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282303#comment-15282303 ]

Thejas M Nair commented on HIVE-13708:
--------------------------------------

This change to CSVSerde breaks backward compatibility for anyone who had a scripted create table command with a non-string column. Those statements would fail now.
If we consider CSVSerde in isolation, the best thing to do about it is to address HIVE-13709, i.e. support other types as supported by LazySimpleSerde. That would lead to correct results and also be backward compatible.
Regarding the generic change applicable to any such serde - it is a difficult choice between allowing logically incorrect results and backward compatibility. I think if we also make the changes in HIVE-13709, only users who use a custom serde with the same limitations (but without error checks) and also use unsupported types for that serde would be affected. That set is likely to be very small. I would vote for making this incompatible change and fixing the logical correctness issue.

> Create table should verify datatypes supported by the serde
> -----------------------------------------------------------
>
>                 Key: HIVE-13708
>                 URL: https://issues.apache.org/jira/browse/HIVE-13708
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Thejas M Nair
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Critical
>         Attachments: HIVE-13708.1.patch
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows for creation of tables with columns of arbitrary types. But 'describe table' would still return string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting intended results.
> The create table ideally should disallow the creation of such tables with unsupported types.
> Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282299#comment-15282299 ] Rajesh Balamohan commented on HIVE-13743: - [~ashutoshc] - Checked the patch on a Hadoop 2.8 cluster and it works as expected. No longer seeing this issue. > Data move codepath is broken with hive (2.1.0-SNAPSHOT) > --- > > Key: HIVE-13743 > URL: https://issues.apache.org/jira/browse/HIVE-13743 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Ashutosh Chauhan > Attachments: HIVE-13743.patch > > > Data move codepath is broken with hive 2.1.0-SNAPSHOT on hadoop > 2.8.0-snapshot. > {noformat} > Caused by: > org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path > not found: /apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > at > org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getEZForPath(FSDirEncryptionZoneOp.java:178) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:7336) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1973) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1376) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:645) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2339) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2335) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711) > at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2333) > at org.apache.hadoop.ipc.Client.call(Client.java:1448) > at org.apache.hadoop.ipc.Client.call(Client.java:1385) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy30.getEZForPath(Unknown > Source)/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/ > ... > ... > ... > 2016-05-11T09:40:43,760 ERROR [main]: ql.Driver (:()) - FAILED: Execution > Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to > move source > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/.hive-staging_hive_2016-05-11_09-40-42_489_5056654133706433454-1/-ext-10002 > to destination > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > {noformat} > https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2836 > hdfsEncryptionShim.isPathEncrypted(destf) in Hive could end up throwing > FileNotFoundException as the destf is not present yet. This causes moveFile > to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
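The failure mode above suggests an obvious fix direction: when the destination does not exist yet, resolve the encryption-zone check against the nearest existing ancestor instead of letting getEZForPath throw FileNotFoundException. The sketch below is purely illustrative; the lookup map stands in for HDFS and all names are hypothetical, not the actual patch:

```java
// Hypothetical illustration: walk up to the nearest existing ancestor when
// checking encryption for a path that has not been created yet. The map
// stands in for the filesystem; only existing paths have entries.
import java.util.HashMap;
import java.util.Map;

public class EncryptionCheckSketch {
    // path -> encrypted? (entries exist only for paths that exist)
    private final Map<String, Boolean> existing = new HashMap<>();

    public void addPath(String path, boolean encrypted) {
        existing.put(path, encrypted);
    }

    public boolean isPathEncrypted(String path) {
        String p = path;
        while (p != null) {
            Boolean enc = existing.get(p);
            if (enc != null) return enc;                  // found an existing ancestor
            int slash = p.lastIndexOf('/');
            p = slash > 0 ? p.substring(0, slash) : null; // walk up one level
        }
        return false; // reached the root: treat as unencrypted
    }
}
```

Since encryption zones apply to directory subtrees, a not-yet-created destination inherits the zone of its nearest existing ancestor, which is why this fallback gives the right answer for the move-destination case.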
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282291#comment-15282291 ]

Hari Sankar Sivarama Subramaniyan commented on HIVE-13708:
----------------------------------------------------------

[~thejas] I checked whether we could do this in a generic way. As you mentioned, we can perform a deep check of the object inspector after initialize() and see if the types match the column types in the table definition. My concern here is whether it is backward compatible, or whether it will break things that used to work previously. If we haven't enforced this rule previously, how can we expect custom serde developers to know henceforth that this is an enforced rule in Hive? Also, it looks cleaner to implement this check in the actual serde itself (e.g. RegexSerDe does a similar check in initialize()), since it is the serde's responsibility to interpret the data correctly, not the query processor's. Let me know your feedback.
Thanks
Hari

> Create table should verify datatypes supported by the serde
> -----------------------------------------------------------
>
>                 Key: HIVE-13708
>                 URL: https://issues.apache.org/jira/browse/HIVE-13708
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Thejas M Nair
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Critical
>         Attachments: HIVE-13708.1.patch
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows for creation of tables with columns of arbitrary types. But 'describe table' would still return string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting intended results.
> The create table ideally should disallow the creation of such tables with unsupported types.
> Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager
[ https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei Zheng updated HIVE-13753:
-----------------------------
    Attachment: HIVE-13753.3.patch

Makes sense. Patch 3 makes the client field final. Thanks [~vgumashta] for the review!

> Make metastore client thread safe in DbTxnManager
> -------------------------------------------------
>
>                 Key: HIVE-13753
>                 URL: https://issues.apache.org/jira/browse/HIVE-13753
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.3.0, 2.1.0
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>         Attachments: HIVE-13753.1.patch, HIVE-13753.2.patch, HIVE-13753.3.patch
>
> Multiple threads share the same metastore client, which is used for RPC to Thrift, and the client is not thread safe.
> A race condition has occurred when one sees an "out of sequence response" error message from the Thrift server. That means the response from the Thrift server was for a different request (by a different thread).
> The solution is to synchronize the client's methods on the client side.
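Client-side synchronization of this kind is usually a wrapper that serializes every call on one lock, so two threads can never interleave their Thrift request/response pairs. A minimal sketch of the pattern; the interface and names here are illustrative, not Hive's actual IMetaStoreClient or the patch:

```java
// Minimal sketch of the client-side synchronization pattern: all calls go
// through one synchronized method, so request/response pairs on the shared
// connection cannot interleave ("out of sequence response"). Names are
// illustrative, not Hive's actual API.
interface Client {
    String call(String request);
}

public class SynchronizedClientSketch implements Client {
    private final Client delegate; // final, as suggested in the review

    public SynchronizedClientSketch(Client delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized String call(String request) {
        return delegate.call(request); // one caller at a time on the wire
    }

    public static void main(String[] args) {
        Client safe = new SynchronizedClientSketch(req -> "resp:" + req);
        System.out.println(safe.call("lock")); // resp:lock
    }
}
```

The cost is that calls from different threads serialize on the wrapper, but that matches the underlying constraint: a single Thrift connection can only service one in-flight request anyway.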
[jira] [Commented] (HIVE-13753) Make metastore client thread safe in DbTxnManager
[ https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282277#comment-15282277 ]

Vaibhav Gumashta commented on HIVE-13753:
-----------------------------------------

+1 pending tests. I would probably make the IMetaStoreClient member within the SynchronizedMetaStoreClient final too.

> Make metastore client thread safe in DbTxnManager
> -------------------------------------------------
>
>                 Key: HIVE-13753
>                 URL: https://issues.apache.org/jira/browse/HIVE-13753
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.3.0, 2.1.0
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>         Attachments: HIVE-13753.1.patch, HIVE-13753.2.patch
>
> Multiple threads share the same metastore client, which is used for RPC to Thrift, and the client is not thread safe.
> A race condition has occurred when one sees an "out of sequence response" error message from the Thrift server. That means the response from the Thrift server was for a different request (by a different thread).
> The solution is to synchronize the client's methods on the client side.
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282269#comment-15282269 ]

Thejas M Nair commented on HIVE-13708:
--------------------------------------

[~hsubramaniyan] Can we do this in a generic way, so that it applies to any serde that doesn't support the types specified in create-table? There are possibly other user-created serdes that could also have this issue.
I haven't looked deeper into the object inspectors. How about checking the object inspector after initialize()? It seems like the types the serde will return can be determined from that.
cc [~ashutoshc]
I didn't mean this to be specifically about CSVSerde, but more generally about the Hive serde interaction. The specific change to that serde that I would like to see is in HIVE-13709.

> Create table should verify datatypes supported by the serde
> -----------------------------------------------------------
>
>                 Key: HIVE-13708
>                 URL: https://issues.apache.org/jira/browse/HIVE-13708
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Thejas M Nair
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Critical
>         Attachments: HIVE-13708.1.patch
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows for creation of tables with columns of arbitrary types. But 'describe table' would still return string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting intended results.
> The create table ideally should disallow the creation of such tables with unsupported types.
> Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
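The generic check discussed in this thread reduces to comparing the declared column types against the set of types the serde can actually produce after initialize(), and rejecting the DDL on any mismatch. A hedged, Hive-free sketch of that comparison (all names hypothetical; the real check would read types from the serde's ObjectInspector):

```java
// Generic sketch of the proposed verification (not Hive's API): given the
// declared column types from the CREATE TABLE and the set of types the
// serde supports, report the declared types the serde cannot honor.
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SerdeTypeCheck {
    // Returns the declared types the serde does not support (empty = OK).
    public static List<String> unsupportedTypes(List<String> declared,
                                                Set<String> supported) {
        List<String> bad = new ArrayList<>();
        for (String t : declared) {
            if (!supported.contains(t.toLowerCase())) {
                bad.add(t);
            }
        }
        return bad;
    }
}
```

For the OpenCSVSerde example above, the supported set would contain only string, so a declared `decimal(38,10)` column would be flagged at create-table time instead of silently reading back as a string.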
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282266#comment-15282266 ]

Ashutosh Chauhan commented on HIVE-13708:
-----------------------------------------

The patch addresses a different issue than the description. Would you like to update the description of the jira to reflect your patch?

> Create table should verify datatypes supported by the serde
> -----------------------------------------------------------
>
>                 Key: HIVE-13708
>                 URL: https://issues.apache.org/jira/browse/HIVE-13708
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Thejas M Nair
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Critical
>         Attachments: HIVE-13708.1.patch
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows for creation of tables with columns of arbitrary types. But 'describe table' would still return string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting intended results.
> The create table ideally should disallow the creation of such tables with unsupported types.
> Example posted by [~Goldshuv] in HIVE- -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10))
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\")
> STORED AS TEXTFILE
> LOCATION ''
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this sql:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the actual result became 11.57 (as it sorts first according to the byte ordering of a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice string from deserializer
> ...
[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager
[ https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13753: - Attachment: HIVE-13753.2.patch Patch 2. > Make metastore client thread safe in DbTxnManager > - > > Key: HIVE-13753 > URL: https://issues.apache.org/jira/browse/HIVE-13753 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13753.1.patch, HIVE-13753.2.patch > > > The fact that multiple threads sharing the same metastore client which is > used for RPC to Thrift is not thread safe. > Race condition can happen when one sees "out of sequence response" error > message from Thrift server. That means the response from the Thrift server is > for a different request (by a different thread). > Solution will be to synchronize methods from the client side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13753) Make metastore client thread safe in DbTxnManager
[ https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282263#comment-15282263 ] Wei Zheng commented on HIVE-13753: -- Right, I should have removed that getter. Thanks for catching that. > Make metastore client thread safe in DbTxnManager > - > > Key: HIVE-13753 > URL: https://issues.apache.org/jira/browse/HIVE-13753 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13753.1.patch > > > The fact that multiple threads sharing the same metastore client which is > used for RPC to Thrift is not thread safe. > Race condition can happen when one sees "out of sequence response" error > message from Thrift server. That means the response from the Thrift server is > for a different request (by a different thread). > Solution will be to synchronize methods from the client side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13753) Make metastore client thread safe in DbTxnManager
[ https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282262#comment-15282262 ] Vaibhav Gumashta commented on HIVE-13753: - [~wzheng] Is there a need to expose the underlying IMetaStoreClient object via SynchronizedMetaStoreClient? > Make metastore client thread safe in DbTxnManager > - > > Key: HIVE-13753 > URL: https://issues.apache.org/jira/browse/HIVE-13753 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13753.1.patch > > > The fact that multiple threads sharing the same metastore client which is > used for RPC to Thrift is not thread safe. > Race condition can happen when one sees "out of sequence response" error > message from Thrift server. That means the response from the Thrift server is > for a different request (by a different thread). > Solution will be to synchronize methods from the client side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager
[ https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13753: - Attachment: HIVE-13753.1.patch Previous patch has wrong name. Corrected it. > Make metastore client thread safe in DbTxnManager > - > > Key: HIVE-13753 > URL: https://issues.apache.org/jira/browse/HIVE-13753 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13753.1.patch > > > The fact that multiple threads sharing the same metastore client which is > used for RPC to Thrift is not thread safe. > Race condition can happen when one sees "out of sequence response" error > message from Thrift server. That means the response from the Thrift server is > for a different request (by a different thread). > Solution will be to synchronize methods from the client side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager
[ https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13753: - Attachment: (was: HIVE-13725.1.patch) > Make metastore client thread safe in DbTxnManager > - > > Key: HIVE-13753 > URL: https://issues.apache.org/jira/browse/HIVE-13753 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > > The fact that multiple threads sharing the same metastore client which is > used for RPC to Thrift is not thread safe. > Race condition can happen when one sees "out of sequence response" error > message from Thrift server. That means the response from the Thrift server is > for a different request (by a different thread). > Solution will be to synchronize methods from the client side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
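The client-side fix proposed in the HIVE-13753 thread above — synchronize every method of the shared metastore client so only one Thrift RPC is in flight at a time — can be sketched with a dynamic proxy. This is an illustrative sketch, not the actual SynchronizedMetaStoreClient from the patch; the `Client` interface here is a hypothetical stand-in for IMetaStoreClient.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class SynchronizedClientSketch {

  // Hypothetical stand-in for IMetaStoreClient; the real interface has many more methods.
  interface Client {
    long openTxn(String user);
  }

  // Wrap every interface method in a single lock, so concurrent callers cannot
  // interleave requests on the shared Thrift connection and trigger
  // "out of sequence response" errors.
  static Client synchronizedClient(Client delegate) {
    InvocationHandler handler = new InvocationHandler() {
      private final Object lock = new Object();
      @Override
      public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
        synchronized (lock) { // one RPC in flight at a time
          return m.invoke(delegate, args);
        }
      }
    };
    return (Client) Proxy.newProxyInstance(
        Client.class.getClassLoader(), new Class<?>[] {Client.class}, handler);
  }

  public static void main(String[] args) {
    Client raw = user -> 42L; // dummy delegate for demonstration
    Client safe = synchronizedClient(raw);
    System.out.println(safe.openTxn("hive")); // prints 42
  }
}
```

A proxy keeps the wrapper in sync automatically as methods are added to the interface; the actual patch may instead synchronize hand-written delegating methods, which avoids reflection overhead on every call.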
[jira] [Updated] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13708: - Status: Patch Available (was: Open) > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size
[ https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282246#comment-15282246 ] Prasanth Jayachandran commented on HIVE-13751: -- [~jdere] I have tested this patch locally and this conf seems to work fine. Although I am seeing a different set of errors, possibly caused by unrelated issues. I will create bugs for them later. Can you please review this patch? > LlapOutputFormatService should have a configurable send buffer size > --- > > Key: HIVE-13751 > URL: https://issues.apache.org/jira/browse/HIVE-13751 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13751.1.patch > > > Netty channel buffer size is hard-coded 128KB now. It should be made > configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size
[ https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-13751: - Status: Patch Available (was: Open) > LlapOutputFormatService should have a configurable send buffer size > --- > > Key: HIVE-13751 > URL: https://issues.apache.org/jira/browse/HIVE-13751 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13751.1.patch > > > Netty channel buffer size is hard-coded 128KB now. It should be made > configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
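The change HIVE-13751 describes — replacing a hard-coded 128KB Netty send buffer with a configurable value — follows a common shape: look up a config key, fall back to the old constant as the default. A minimal sketch, assuming a hypothetical property name (the actual key the patch introduces is not shown in this thread):

```java
import java.util.Properties;

public class SendBufferConfigSketch {
  // The previously hard-coded value becomes the default.
  static final int DEFAULT_SEND_BUFFER = 128 * 1024;

  // Hypothetical property name, for illustration only.
  static final String KEY = "llap.daemon.output.service.send.buffer.size";

  static int sendBufferSize(Properties conf) {
    String v = conf.getProperty(KEY);
    return v == null ? DEFAULT_SEND_BUFFER : Integer.parseInt(v);
    // In the LLAP output service, the resulting value would then be applied
    // when bootstrapping the Netty server, e.g. via
    // bootstrap.childOption(ChannelOption.SO_SNDBUF, size).
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(sendBufferSize(conf)); // 131072 (the old hard-coded default)
    conf.setProperty(KEY, "262144");
    System.out.println(sendBufferSize(conf)); // 262144 (configured override)
  }
}
```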
[jira] [Updated] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13708: - Attachment: HIVE-13708.1.patch cc [~ashutoshc] for review. > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > Attachments: HIVE-13708.1.patch > > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager
[ https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13753: - Attachment: HIVE-13725.1.patch Upload patch 1. [~ekoifman] Can you review please? > Make metastore client thread safe in DbTxnManager > - > > Key: HIVE-13753 > URL: https://issues.apache.org/jira/browse/HIVE-13753 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13725.1.patch > > > The fact that multiple threads sharing the same metastore client which is > used for RPC to Thrift is not thread safe. > Race condition can happen when one sees "out of sequence response" error > message from Thrift server. That means the response from the Thrift server is > for a different request (by a different thread). > Solution will be to synchronize methods from the client side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager
[ https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13753: - Status: Patch Available (was: Open) > Make metastore client thread safe in DbTxnManager > - > > Key: HIVE-13753 > URL: https://issues.apache.org/jira/browse/HIVE-13753 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13725.1.patch > > > The fact that multiple threads sharing the same metastore client which is > used for RPC to Thrift is not thread safe. > Race condition can happen when one sees "out of sequence response" error > message from Thrift server. That means the response from the Thrift server is > for a different request (by a different thread). > Solution will be to synchronize methods from the client side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11550) ACID queries pollute HiveConf
[ https://issues.apache.org/jira/browse/HIVE-11550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282233#comment-15282233 ] Alan Gates commented on HIVE-11550: --- +1 > ACID queries pollute HiveConf > - > > Key: HIVE-11550 > URL: https://issues.apache.org/jira/browse/HIVE-11550 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-11550.1.patch, HIVE-11550.patch > > > HiveConf is a SessionState level object. Some ACID related logic makes > changes to it (which are meant to be per query) but become per SessionState. > See SemanticAnalyzer.checkAcidConstraints() > Also note HiveConf.setVar(conf, > HiveConf.ConfVars.DYNAMICPARTITIONINGMODE, "nonstrict"); > in UpdateDeleteSemancitAnalzyer > [~alangates], do you know of other cases or ideas on how to deal with this > differently? > _SortedDynPartitionOptimizer.process()_ is the place to have the logic to do > _conf.setBoolVar(ConfVars.HIVEOPTSORTDYNAMICPARTITION, false);_ on per query > basis -- This message was sent by Atlassian JIRA (v6.3.4#6332)
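The pollution HIVE-11550 describes — per-query conf changes like setting DYNAMICPARTITIONINGMODE to "nonstrict" leaking into the SessionState-level HiveConf — suggests compiling each query against its own copy of the configuration. A minimal sketch of that idea using plain maps in place of HiveConf (the key name mirrors the one in the thread; the copy-per-query approach is one possible remedy, not necessarily the one the patch takes):

```java
import java.util.HashMap;
import java.util.Map;

public class PerQueryConfSketch {
  public static void main(String[] args) {
    // Session-level configuration, shared across queries.
    Map<String, String> sessionConf = new HashMap<>();
    sessionConf.put("hive.exec.dynamic.partition.mode", "strict");

    // Per-query copy: mutations made during compilation stay local to this
    // query and vanish when it finishes, instead of persisting in the session.
    Map<String, String> queryConf = new HashMap<>(sessionConf);
    queryConf.put("hive.exec.dynamic.partition.mode", "nonstrict");

    System.out.println(queryConf.get("hive.exec.dynamic.partition.mode"));   // nonstrict
    System.out.println(sessionConf.get("hive.exec.dynamic.partition.mode")); // strict, unpolluted
  }
}
```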
[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282227#comment-15282227 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13708: -- A couple of points: 1. The original document says: CREATE TABLE my_table(a string, b string, ...) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' I believe this supports strictly string columns and not anything else (not even variants like varchar). Please correct me if this is wrong. 2. The description for this jira says: CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with There is not much we can do for 'com.bizo.hive.serde.csv.CSVSerde' in Hive. I will upload a patch with a fix for 'org.apache.hadoop.hive.serde2.OpenCSVSerde'. Thanks Hari > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. 
> Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
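The check discussed in the HIVE-13708 thread above — reject non-string column types at CREATE TABLE time when the serde only materializes strings — could look something like the following. Class and method names here are illustrative assumptions, not the actual code in HIVE-13708.1.patch:

```java
import java.util.Arrays;
import java.util.List;

public class SerdeTypeCheckSketch {
  static final String OPEN_CSV_SERDE = "org.apache.hadoop.hive.serde2.OpenCSVSerde";

  // Fail fast at table-creation time instead of silently returning wrong
  // results later (e.g. min() computed by byte ordering on strings).
  static void validateColumnTypes(String serdeClass, List<String> columnTypes) {
    if (OPEN_CSV_SERDE.equals(serdeClass)) {
      for (String type : columnTypes) {
        if (!"string".equalsIgnoreCase(type)) {
          throw new IllegalArgumentException(
              serdeClass + " only supports string columns, found: " + type);
        }
      }
    }
  }

  public static void main(String[] args) {
    validateColumnTypes(OPEN_CSV_SERDE, Arrays.asList("string", "string")); // passes
    try {
      validateColumnTypes(OPEN_CSV_SERDE, Arrays.asList("decimal(38,10)"));
      System.out.println("no error");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected"); // the DECIMAL(38,10) column is refused
    }
  }
}
```

As the comment notes, this kind of validation can only cover serdes Hive knows about, such as the built-in OpenCSVSerde; third-party serdes like com.bizo's CSVSerde are outside Hive's control.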
[jira] [Commented] (HIVE-13742) Hive ptest has many failures due to metastore connection refused
[ https://issues.apache.org/jira/browse/HIVE-13742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282220#comment-15282220 ] Szehon Ho commented on HIVE-13742: -- There are potentially several tests running concurrently on the same ptest slave. I haven't taken a close look, so I am not sure whether that can cause corruption, but it's just a thought. > Hive ptest has many failures due to metastore connection refused > > > Key: HIVE-13742 > URL: https://issues.apache.org/jira/browse/HIVE-13742 > Project: Hive > Issue Type: Bug >Reporter: Sergio Peña > Attachments: hive.log > > > The following exception is thrown on the Hive ptest with many tests, and it > is due to some Derby database issues: > {noformat} > 2016-05-11T15:46:25,123 INFO [Thread-2[]]: metastore.HiveMetaStore > (HiveMetaStore.java:newRawStore(563)) - 0: Opening raw store with > implementation class:org.apache.hadoop.hive.metastore.ObjectStore > 2016-05-11T15:46:25,175 INFO [Thread-2[]]: metastore.ObjectStore > (ObjectStore.java:initialize(324)) - ObjectStore, initialize called > 2016-05-11T15:46:25,966 DEBUG [Thread-2[]]: bonecp.BoneCPDataSource > (BoneCPDataSource.java:getConnection(119)) - JDBC URL = > jdbc:derby:;databaseName=/home/hiveptest/54.177.132.113-hiveptest-1/apache-github-source-source/itests/hive-unit/target/tmpTestFilterHooksmetastore_db;create=true, > Username = APP, partitions = 1, max (per partition) = 10, min (per > partition) = 0, idle max age = 60 min, idle test period = 240 min, strategy = > DEFAULT > 2016-05-11T15:46:26,003 ERROR [Thread-2[]]: Datastore.Schema > (Log4JLogger.java:error(125)) - Failed initialising database. > org.datanucleus.exceptions.NucleusDataStoreException: Unable to open a test > connection to the given database. JDBC url = > jdbc:derby:;databaseName=/home/hiveptest/54.177.132.113-hiveptest-1/apache-github-source-source/itests/hive-unit/target/tmpTestFilterHooksmetastore_db;create=true, > username = APP. 
Terminating connection pool (set lazyInit to true if you > expect to start your database after your app). Original Exception: -- > java.sql.SQLException: Failed to create database > '/home/hiveptest/54.177.132.113-hiveptest-1/apache-github-source-source/itests/hive-unit/target/tmpTestFilterHooksmetastore_db', > see the next exception for details. > at > org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown > Source) > at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source) > at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.(Unknown Source) > at org.apache.derby.impl.jdbc.EmbedConnection40.(Unknown Source) > at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source) > at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source) > at org.apache.derby.jdbc.Driver20.connect(Unknown Source) > at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source) > at java.sql.DriverManager.getConnection(DriverManager.java:664) > at java.sql.DriverManager.getConnection(DriverManager.java:208) > at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361) > at com.jolbox.bonecp.BoneCP.(BoneCP.java:416) > at > com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:120) > at > org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:483) > at > org.datanucleus.store.rdbms.RDBMSStoreManager.(RDBMSStoreManager.java:296) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:408) > at > 
org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606) > at > org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301) > at > org.datanucleus.NucleusContextHelper.createStoreManagerForProperties(NucleusContextHelper.java:133) > at > org.datanucleus.PersistenceNucleusContextImpl.initialise(PersistenceNucleusContextImpl.java:420) > at > org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:821) > at > org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:338) > at >
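If the concurrent-test theory in the comment above holds, one defensive remedy is to make sure no two test runs ever point at the same embedded Derby directory. A sketch of deriving a unique databaseName per run — the URL shape mirrors the log above, but the uniqueness scheme is purely an assumption, not a fix from this thread:

```java
public class UniqueDerbyUrlSketch {
  static String derbyUrl(String baseDir, String testName) {
    // A random suffix keeps two concurrent test runs from opening (and
    // possibly corrupting) the same metastore_db directory.
    String unique = testName + "-" + java.util.UUID.randomUUID();
    return "jdbc:derby:;databaseName=" + baseDir + "/" + unique
        + "/metastore_db;create=true";
  }

  public static void main(String[] args) {
    String a = derbyUrl("/tmp/ptest", "TestFilterHooks");
    String b = derbyUrl("/tmp/ptest", "TestFilterHooks");
    System.out.println(a.equals(b)); // false: each run gets its own directory
  }
}
```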
[jira] [Assigned] (HIVE-13708) Create table should verify datatypes supported by the serde
[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan reassigned HIVE-13708: Assignee: Hari Sankar Sivarama Subramaniyan > Create table should verify datatypes supported by the serde > --- > > Key: HIVE-13708 > URL: https://issues.apache.org/jira/browse/HIVE-13708 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Thejas M Nair >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Critical > > As [~Goldshuv] mentioned in HIVE-. > Create table with serde such as OpenCSVSerde allows for creation of table > with columns of arbitrary types. But 'describe table' would still return > string datatypes, and so does selects on the table. > This is misleading and would result in users not getting intended results. > The create table ideally should disallow the creation of such tables with > unsupported types. > Example posted by [~Goldshuv] in HIVE- - > {noformat} > CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) > ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with > serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") > STORED AS TEXTFILE > LOCATION '' > tblproperties ("skip.header.line.count"="1"); > {noformat} > Now consider this sql: > hive> select min(totalprice) from test; > in this case given my data, the result should have been 874.89, but the > actual result became 11.57 (as it is first according to byte ordering of > a string type). this is a wrong result. > hive> desc extended test; > OK > o_totalprice string from deserializer > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13562) Enable vector bridge for all non-vectorized udfs
[ https://issues.apache.org/jira/browse/HIVE-13562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-13562: --- Assignee: Matt McCline > Enable vector bridge for all non-vectorized udfs > > > Key: HIVE-13562 > URL: https://issues.apache.org/jira/browse/HIVE-13562 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Reporter: Ashutosh Chauhan >Assignee: Matt McCline > > Mechanism already exists for this via {{VectorUDFAdaptor}} but we have > arbitrarily hand picked few udfs to go through it. I think we should enable > this by default for all udfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13752) Mini HDFS Cluster fails to start on trunk
[ https://issues.apache.org/jira/browse/HIVE-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou resolved HIVE-13752. -- Resolution: Invalid Resolved this since it went to the wrong place due to JIRA maintenance. > Mini HDFS Cluster fails to start on trunk > - > > Key: HIVE-13752 > URL: https://issues.apache.org/jira/browse/HIVE-13752 > Project: Hive > Issue Type: Bug >Reporter: Xiaobing Zhou > > It's been noticed that Mini HDFS Cluster fails to start on trunk, blocking > unit tests and Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13682) EOFException with fast hashtable
[ https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13682: Attachment: HIVE-13682.01.patch > EOFException with fast hashtable > > > Key: HIVE-13682 > URL: https://issues.apache.org/jira/browse/HIVE-13682 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Matt McCline > Attachments: HIVE-13682.01.patch > > > While testing something else on recent master, w/Tez 0.8.3, this happened > (TPCDS q27) > {noformat} > Caused by: java.util.concurrent.ExecutionException: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399) > ... 20 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131) > ... 4 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104) > ... 
5 more > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54) > at > org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98) > ... 9 more > {noformat} > There's no error if fast hashtable is disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13682) EOFException with fast hashtable
[ https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13682: Attachment: (was: HIVE-13682.01.patch) > EOFException with fast hashtable > > > Key: HIVE-13682 > URL: https://issues.apache.org/jira/browse/HIVE-13682 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Matt McCline > > While testing something else on recent master, w/Tez 0.8.3, this happened > (TPCDS q27) > {noformat} > Caused by: java.util.concurrent.ExecutionException: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399) > ... 20 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131) > ... 4 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104) > ... 
5 more > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54) > at > org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98) > ... 9 more > {noformat} > There's no error if fast hashtable is disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13752) Mini HDFS Cluster fails to start on trunk
[ https://issues.apache.org/jira/browse/HIVE-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282113#comment-15282113 ] Xiaobing Zhou commented on HIVE-13752: -- Here's the exception: {noformat} Running org.apache.hadoop.hdfs.TestAsyncDFSRename Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 15.756 sec <<< FAILURE! - in org.apache.hadoop.hdfs.TestAsyncDFSRename testAsyncRenameWithOverwrite(org.apache.hadoop.hdfs.TestAsyncDFSRename) Time elapsed: 15.58 sec <<< ERROR! java.io.IOException: Timed out waiting for Mini HDFS Cluster to start at org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1345) at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:848) at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:482) at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441) at org.apache.hadoop.hdfs.TestAsyncDFSRename.testAsyncRenameWithOverwrite(TestAsyncDFSRename.java:69) {noformat} > Mini HDFS Cluster fails to start on trunk > - > > Key: HIVE-13752 > URL: https://issues.apache.org/jira/browse/HIVE-13752 > Project: Hive > Issue Type: Bug >Reporter: Xiaobing Zhou > > It's been noticed that Mini HDFS Cluster fails to start on trunk, blocking > unit tests and Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13727) Getting error Failed rule: 'orderByClause clusterByClause distributeByClause sortByClause limitClause can only be applied to the whole union.' in subquery
[ https://issues.apache.org/jira/browse/HIVE-13727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282087#comment-15282087 ] Ashutosh Chauhan commented on HIVE-13727: - [~prongs] Was this query used to work prior to HIVE-9039 commit ? > Getting error Failed rule: 'orderByClause clusterByClause distributeByClause > sortByClause limitClause can only be applied to the whole union.' in subquery > --- > > Key: HIVE-13727 > URL: https://issues.apache.org/jira/browse/HIVE-13727 > Project: Hive > Issue Type: Bug >Reporter: Rajat Khandelwal > > The error comes in the following query: > {noformat} > SELECT * > FROM > (SELECT * >FROM srcpart a >WHERE a.ds = '2008-04-08' > AND a.hr = '11' >ORDER BY a.key LIMIT 5 >UNION ALL >SELECT * >FROM srcpart b >WHERE b.ds = '2008-04-08' > AND b.hr = '14' >ORDER BY b.key LIMIT 5) subq > ORDER BY KEY LIMIT 5 > {noformat} > But the following query works: > {noformat} > SELECT * > FROM > (SELECT * >FROM > (SELECT * > FROM srcpart a > WHERE a.ds = '2008-04-08' > AND a.hr = '11' > ORDER BY a.key LIMIT 5) pa >UNION ALL SELECT * >FROM > (SELECT * > FROM srcpart b > WHERE b.ds = '2008-04-08' > AND b.hr = '14' > ORDER BY b.key LIMIT 5) pb) subq > ORDER BY KEY LIMIT 5 > {noformat} > The queries are logically identical, the query that's rejected has dummy > select * clauses around the sub-queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11947) mssql upgrade scripts contains invalid character
[ https://issues.apache.org/jira/browse/HIVE-11947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282060#comment-15282060 ] Pengcheng Xiong commented on HIVE-11947: +1 > mssql upgrade scripts contains invalid character > > > Key: HIVE-11947 > URL: https://issues.apache.org/jira/browse/HIVE-11947 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.2.0, 1.1.0 >Reporter: Huan Huang >Assignee: Huan Huang > Attachments: HIVE-11947.patch > > > As a result, the upgrade scripts don't execute. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13608) We should provide better error message while constraints with duplicate names are created
[ https://issues.apache.org/jira/browse/HIVE-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282045#comment-15282045 ] Ashutosh Chauhan commented on HIVE-13608: - +1 pending tests. > We should provide better error message while constraints with duplicate names > are created > - > > Key: HIVE-13608 > URL: https://issues.apache.org/jira/browse/HIVE-13608 > Project: Hive > Issue Type: Bug > Components: Diagnosability, Metastore >Affects Versions: 2.0.0 >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13608.1.patch, HIVE-13608.2.patch, > HIVE-13608.3.patch > > > {code} > PREHOOK: query: create table t1(x int, constraint pk1 primary key (x) disable > novalidate) > PREHOOK: type: CREATETABLE > PREHOOK: Output: database:default > PREHOOK: Output: default@t1 > POSTHOOK: query: create table t1(x int, constraint pk1 primary key (x) > disable novalidate) > POSTHOOK: type: CREATETABLE > POSTHOOK: Output: database:default > POSTHOOK: Output: default@t1 > PREHOOK: query: create table t2(x int, constraint pk1 primary key (x) disable > novalidate) > PREHOOK: type: CREATETABLE > PREHOOK: Output: database:default > PREHOOK: Output: default@t2 > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct > MetaStore DB connections, we don't support retries at the client level.) > {code} > In the above case, it seems like useful error message is lost. It looks like > a generic problem with metastore server/client exception handling and > message propagation. Seems like exception parsing logic of > RetryingMetaStoreClient::invoke() needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13608) We should provide better error message while constraints with duplicate names are created
[ https://issues.apache.org/jira/browse/HIVE-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13608: Affects Version/s: 2.0.0 > We should provide better error message while constraints with duplicate names > are created > - > > Key: HIVE-13608 > URL: https://issues.apache.org/jira/browse/HIVE-13608 > Project: Hive > Issue Type: Bug > Components: Diagnosability, Metastore >Affects Versions: 2.0.0 >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13608.1.patch, HIVE-13608.2.patch, > HIVE-13608.3.patch > > > {code} > PREHOOK: query: create table t1(x int, constraint pk1 primary key (x) disable > novalidate) > PREHOOK: type: CREATETABLE > PREHOOK: Output: database:default > PREHOOK: Output: default@t1 > POSTHOOK: query: create table t1(x int, constraint pk1 primary key (x) > disable novalidate) > POSTHOOK: type: CREATETABLE > POSTHOOK: Output: database:default > POSTHOOK: Output: default@t1 > PREHOOK: query: create table t2(x int, constraint pk1 primary key (x) disable > novalidate) > PREHOOK: type: CREATETABLE > PREHOOK: Output: database:default > PREHOOK: Output: default@t2 > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct > MetaStore DB connections, we don't support retries at the client level.) > {code} > In the above case, it seems like useful error message is lost. It looks like > a generic problem with metastore server/client exception handling and > message propagation. Seems like exception parsing logic of > RetryingMetaStoreClient::invoke() needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13608) We should provide better error message while constraints with duplicate names are created
[ https://issues.apache.org/jira/browse/HIVE-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13608: Target Version/s: 2.1.0 > We should provide better error message while constraints with duplicate names > are created > - > > Key: HIVE-13608 > URL: https://issues.apache.org/jira/browse/HIVE-13608 > Project: Hive > Issue Type: Bug > Components: Diagnosability, Metastore >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13608.1.patch, HIVE-13608.2.patch, > HIVE-13608.3.patch > > > {code} > PREHOOK: query: create table t1(x int, constraint pk1 primary key (x) disable > novalidate) > PREHOOK: type: CREATETABLE > PREHOOK: Output: database:default > PREHOOK: Output: default@t1 > POSTHOOK: query: create table t1(x int, constraint pk1 primary key (x) > disable novalidate) > POSTHOOK: type: CREATETABLE > POSTHOOK: Output: database:default > POSTHOOK: Output: default@t1 > PREHOOK: query: create table t2(x int, constraint pk1 primary key (x) disable > novalidate) > PREHOOK: type: CREATETABLE > PREHOOK: Output: database:default > PREHOOK: Output: default@t2 > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct > MetaStore DB connections, we don't support retries at the client level.) > {code} > In the above case, it seems like useful error message is lost. It looks like > a generic problem with metastore server/client exception handling and > message propagation. Seems like exception parsing logic of > RetryingMetaStoreClient::invoke() needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats
[ https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13269: Target Version/s: 2.1.0 > Simplify comparison expressions using column stats > -- > > Key: HIVE-13269 > URL: https://issues.apache.org/jira/browse/HIVE-13269 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, > HIVE-13269.patch, HIVE-13269.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13068) Disable Hive ConstantPropagate optimizer when CBO has optimized the plan II
[ https://issues.apache.org/jira/browse/HIVE-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13068: Target Version/s: 2.1.0 > Disable Hive ConstantPropagate optimizer when CBO has optimized the plan II > --- > > Key: HIVE-13068 > URL: https://issues.apache.org/jira/browse/HIVE-13068 > Project: Hive > Issue Type: Sub-task > Components: CBO, Logical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13068.01.patch, HIVE-13068.01.patch, > HIVE-13068.02.patch, HIVE-13068.03.patch, HIVE-13068.patch > > > After HIVE-12543 went in, we need follow-up work to disable the last call to > ConstantPropagate in Hive. This probably implies work on extending the > constant folding logic in Calcite. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13608) We should provide better error message while constraints with duplicate names are created
[ https://issues.apache.org/jira/browse/HIVE-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13608: Component/s: Metastore Diagnosability > We should provide better error message while constraints with duplicate names > are created > - > > Key: HIVE-13608 > URL: https://issues.apache.org/jira/browse/HIVE-13608 > Project: Hive > Issue Type: Bug > Components: Diagnosability, Metastore >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13608.1.patch, HIVE-13608.2.patch, > HIVE-13608.3.patch > > > {code} > PREHOOK: query: create table t1(x int, constraint pk1 primary key (x) disable > novalidate) > PREHOOK: type: CREATETABLE > PREHOOK: Output: database:default > PREHOOK: Output: default@t1 > POSTHOOK: query: create table t1(x int, constraint pk1 primary key (x) > disable novalidate) > POSTHOOK: type: CREATETABLE > POSTHOOK: Output: database:default > POSTHOOK: Output: default@t1 > PREHOOK: query: create table t2(x int, constraint pk1 primary key (x) disable > novalidate) > PREHOOK: type: CREATETABLE > PREHOOK: Output: database:default > PREHOOK: Output: default@t2 > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct > MetaStore DB connections, we don't support retries at the client level.) > {code} > In the above case, it seems like useful error message is lost. It looks like > a generic problem with metastore server/client exception handling and > message propagation. Seems like exception parsing logic of > RetryingMetaStoreClient::invoke() needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13750: Target Version/s: 2.1.0 > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13602) TPCH q16 return wrong result when CBO is on
[ https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13602: Affects Version/s: (was: 1.3.0) Target Version/s: 2.1.0 > TPCH q16 return wrong result when CBO is on > --- > > Key: HIVE-13602 > URL: https://issues.apache.org/jira/browse/HIVE-13602 > Project: Hive > Issue Type: Bug > Components: CBO, Logical Optimizer >Affects Versions: 2.0.0, 1.2.2 >Reporter: Nemon Lou >Assignee: Pengcheng Xiong > Attachments: HIVE-13602.01.patch, HIVE-13602.03.patch, > HIVE-13602.04.patch, HIVE-13602.05.patch, calcite_cbo_bad.out, > calcite_cbo_good.out, explain_cbo_bad_part1.out, explain_cbo_bad_part2.out, > explain_cbo_bad_part3.out, explain_cbo_good(rewrite)_part1.out, > explain_cbo_good(rewrite)_part2.out, explain_cbo_good(rewrite)_part3.out > > > Running tpch with factor 2, > q16 returns 1,160 rows when CBO is on, > while it returns 24,581 rows when CBO is off. > See attachments for details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11160) Auto-gather column stats
[ https://issues.apache.org/jira/browse/HIVE-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11160: Target Version/s: 2.1.0 > Auto-gather column stats > > > Key: HIVE-11160 > URL: https://issues.apache.org/jira/browse/HIVE-11160 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11160.01.patch, HIVE-11160.02.patch, > HIVE-11160.03.patch, HIVE-11160.04.patch, HIVE-11160.05.patch, > HIVE-11160.06.patch, HIVE-11160.07.patch, HIVE-11160.08.patch, > HIVE-11160.09.patch > > > Hive will collect table stats when set hive.stats.autogather=true during the > INSERT OVERWRITE command. And then the users need to collect the column stats > themselves using "Analyze" command. In this patch, the column stats will also > be collected automatically. More specifically, INSERT OVERWRITE will > automatically create new column stats. INSERT INTO will automatically merge > new column stats with existing ones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13567) create ColumnStatsAutoGatherContext
[ https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13567: Target Version/s: 2.1.0 > create ColumnStatsAutoGatherContext > --- > > Key: HIVE-13567 > URL: https://issues.apache.org/jira/browse/HIVE-13567 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13566) enable merging of bit vectors for insert into
[ https://issues.apache.org/jira/browse/HIVE-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13566: Target Version/s: 2.1.0 > enable merging of bit vectors for insert into > - > > Key: HIVE-13566 > URL: https://issues.apache.org/jira/browse/HIVE-13566 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13249: - Attachment: HIVE-13249.10.patch patch 10 for review > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch, HIVE-13249.10.patch, > HIVE-13249.2.patch, HIVE-13249.3.patch, HIVE-13249.4.patch, > HIVE-13249.5.patch, HIVE-13249.6.patch, HIVE-13249.7.patch, > HIVE-13249.8.patch, HIVE-13249.9.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13662) Set file permission and ACL in file sink operator
[ https://issues.apache.org/jira/browse/HIVE-13662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-13662: -- Assignee: Pengcheng Xiong > Set file permission and ACL in file sink operator > - > > Key: HIVE-13662 > URL: https://issues.apache.org/jira/browse/HIVE-13662 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Pengcheng Xiong > > As suggested > [here|https://issues.apache.org/jira/browse/HIVE-13572?focusedCommentId=15254438=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15254438]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13662) Set file permission and ACL in file sink operator
[ https://issues.apache.org/jira/browse/HIVE-13662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281997#comment-15281997 ] Pengcheng Xiong commented on HIVE-13662: assigned to myself as per [~ashutoshc]'s request. :) > Set file permission and ACL in file sink operator > - > > Key: HIVE-13662 > URL: https://issues.apache.org/jira/browse/HIVE-13662 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Pengcheng Xiong > > As suggested > [here|https://issues.apache.org/jira/browse/HIVE-13572?focusedCommentId=15254438=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15254438]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281995#comment-15281995 ] Wei Zheng commented on HIVE-13249: -- The Exception being caught from startHouseKeeperService is thrown by the class instantiation: {code} openTxnsCounter = (HouseKeeperService)c.newInstance(); {code} But I think I can move this line into the try block, and make this method not throw an exception, so that we can make a best effort. I don't think we want to fail all txns just because we cannot start the counter service. > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, > HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, > HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch, HIVE-13249.9.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
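A minimal sketch of the "best effort" change described above: the reflective instantiation moves inside the try block, so a failure to start the counter service is logged and swallowed instead of failing every transaction. The class and method names here are illustrative stand-ins, not Hive's actual code.

```java
// Hedged sketch: best-effort startup of a housekeeping service.
// Assumption: class names and structure are simplified, not Hive's real API.
public class BestEffortStartup {
    interface HouseKeeperService { void start(); }

    static HouseKeeperService openTxnsCounter;

    static void startHouseKeeperService(String className) {
        try {
            Class<?> c = Class.forName(className);
            // Keeping the instantiation inside the try means a failure here
            // is logged rather than propagated to every open-txn request.
            openTxnsCounter = (HouseKeeperService) c.getDeclaredConstructor().newInstance();
            openTxnsCounter.start();
        } catch (Exception e) {
            System.err.println("Could not start open-txn counter service, continuing: " + e);
        }
    }

    public static void main(String[] args) {
        // A bogus class name fails to load, but the method does not throw.
        startHouseKeeperService("no.such.Class");
        System.out.println(openTxnsCounter == null ? "degraded" : "running");
    }
}
```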
[jira] [Commented] (HIVE-13453) Support ORDER BY and windowing clause in partitioning clause with distinct function
[ https://issues.apache.org/jira/browse/HIVE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281948#comment-15281948 ] Hive QA commented on HIVE-13453: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12803278/HIVE-13453.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 104 failed/errored test(s), 9195 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file TestMiniTezCliDriver-bucket_map_join_tez1.q-auto_sortmerge_join_16.q-skewjoin.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-schema_evol_orc_nonvec_mapwork_part.q-mapjoin_decimal.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more - did not produce a TEST-*.xml file TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more - did not produce a TEST-*.xml file TestSparkCliDriver-load_dyn_part5.q-load_dyn_part2.q-skewjoinopt16.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_gby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_limit org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udf_udaf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udf_udaf_stats_opt org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_windowing 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_udaf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_windowing org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_leadlag org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_char org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_decimal org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_varchar org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_createas1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_special_character_in_tabnames_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union36 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_type_chk org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_timestamp_funcs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_gby2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_order_null org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_storage_queries org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_limit org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_udf_udaf org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_windowing org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_decimal org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union_type_chk org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join16 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_filters org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_nulls org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_7
[jira] [Updated] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size
[ https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-13751: - Attachment: HIVE-13751.1.patch > LlapOutputFormatService should have a configurable send buffer size > --- > > Key: HIVE-13751 > URL: https://issues.apache.org/jira/browse/HIVE-13751 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13751.1.patch > > > Netty channel buffer size is hard-coded 128KB now. It should be made > configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Attachment: HIVE-13750.patch > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13750.patch > > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281918#comment-15281918 ] Thejas M Nair edited comment on HIVE-13749 at 5/12/16 6:59 PM: --- What is retaining them as per MAT ? was (Author: thejas): Any ideas as to what is causing them to be retained ? > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: Top_Consumers7.html > > > Looking at a 10GB heap dump, a large number of Configuration objects (> 66k > instances) are being retained. These objects along with their retained set are > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the Eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13750: --- Status: Patch Available (was: In Progress) > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible
[ https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-13750 started by Jesus Camacho Rodriguez. -- > Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer > when possible > -- > > Key: HIVE-13750 > URL: https://issues.apache.org/jira/browse/HIVE-13750 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Extend ReduceDedup to remove additional shuffle stage created by sorted > dynamic partition optimizer when possible, thus avoiding unnecessary work. > By [~ashutoshc]: > {quote} > Currently, if config is on Sorted Dynamic Partition Optimizer (SDPO) > unconditionally adds an extra shuffle stage. If sort columns of previous > shuffle and partitioning columns of table match, reduce sink deduplication > optimizer removes extra shuffle stage, thus bringing down overhead to zero. > However, if they don’t match, we end up doing extra shuffle. This can be > improved since we can add table partition columns as a sort columns on > earlier shuffle and avoid this extra shuffle. This ensures that in cases > query already has a shuffle stage, we are not shuffling data again. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281918#comment-15281918 ] Thejas M Nair commented on HIVE-13749: -- Any ideas as to what is causing them to be retained ? > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: Top_Consumers7.html > > > Looking at a 10GB heap dump, a large number of Configuration objects (> 66k > instances) are being retained. These objects along with their retained set are > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the Eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13743: Attachment: HIVE-13743.patch This is because of a change in behavior of HDFS from 2.6 to 2.8: the API hdfsAdmin.getEncryptionZoneForPath(path) used to return null for a non-existent path in 2.6, but now throws FNFE. [~rajesh.balamohan] Can you test this out on a 2.8 cluster? Can't write a unit test for this since Hive currently uses Hadoop 2.6. > Data move codepath is broken with hive (2.1.0-SNAPSHOT) > --- > > Key: HIVE-13743 > URL: https://issues.apache.org/jira/browse/HIVE-13743 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan > Attachments: HIVE-13743.patch > > > Data move codepath is broken with hive 2.1.0-SNAPSHOT with hadoop > 2.8.0-snapshot. > {noformat} > Caused by: > org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path > not found: /apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > at > org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getEZForPath(FSDirEncryptionZoneOp.java:178) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:7336) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1973) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1376) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:645) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2339) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2335) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2333) > at org.apache.hadoop.ipc.Client.call(Client.java:1448) > at org.apache.hadoop.ipc.Client.call(Client.java:1385) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy30.getEZForPath(Unknown > Source)/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/ > ... > ... > ... > 2016-05-11T09:40:43,760 ERROR [main]: ql.Driver (:()) - FAILED: Execution > Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to > move source > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/.hive-staging_hive_2016-05-11_09-40-42_489_5056654133706433454-1/-ext-10002 > to destination > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > {noformat} > https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2836 > hdfsEncryptionShim.isPathEncrypted(destf) in Hive could end up throwing > FileNotFoundException as the destf is not present yet. This causes moveFile > to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
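The behavior change described above can be modeled in isolation: a lookup that returned null for a missing path in Hadoop 2.6 now throws FileNotFoundException in 2.8, and a thin shim can restore the old null-returning contract before the destination path exists. The sketch below is illustrative only — the class and method names are hypothetical stand-ins, not the actual HIVE-13743 patch or the real HdfsAdmin API:

```java
import java.io.FileNotFoundException;

public class EncryptionZoneShimSketch {
    // Hypothetical stand-in for hdfsAdmin.getEncryptionZoneForPath: models the
    // Hadoop 2.8 behavior of throwing FileNotFoundException for a missing path.
    static String getEncryptionZoneForPath(String path) throws FileNotFoundException {
        if (!path.startsWith("/warehouse/")) {
            throw new FileNotFoundException("Path not found: " + path);
        }
        return "ez1";
    }

    // Shim restoring the Hadoop 2.6 contract (null for a non-existent path), so
    // a caller like isPathEncrypted(destf) keeps working before destf is created.
    static String getEncryptionZoneOrNull(String path) {
        try {
            return getEncryptionZoneForPath(path);
        } catch (FileNotFoundException e) {
            return null; // destination not created yet -> not in any encryption zone
        }
    }

    public static void main(String[] args) {
        System.out.println(getEncryptionZoneOrNull("/warehouse/t1"));   // ez1
        System.out.println(getEncryptionZoneOrNull("/tmp/date_dim1"));  // null
    }
}
```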
[jira] [Updated] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13743: Assignee: Ashutosh Chauhan Status: Patch Available (was: Open) > Data move codepath is broken with hive (2.1.0-SNAPSHOT) > --- > > Key: HIVE-13743 > URL: https://issues.apache.org/jira/browse/HIVE-13743 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Ashutosh Chauhan > Attachments: HIVE-13743.patch > > > Data move codepath is broken with hive 2.1.0-SNAPSHOT with hadoop > 2.8.0-snapshot. > {noformat} > Caused by: > org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path > not found: /apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > at > org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getEZForPath(FSDirEncryptionZoneOp.java:178) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:7336) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1973) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1376) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:645) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2339) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2335) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2333) > at org.apache.hadoop.ipc.Client.call(Client.java:1448) 
> at org.apache.hadoop.ipc.Client.call(Client.java:1385) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy30.getEZForPath(Unknown > Source)/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/ > ... > ... > ... > 2016-05-11T09:40:43,760 ERROR [main]: ql.Driver (:()) - FAILED: Execution > Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to > move source > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/.hive-staging_hive_2016-05-11_09-40-42_489_5056654133706433454-1/-ext-10002 > to destination > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > {noformat} > https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2836 > hdfsEncryptionShim.isPathEncrypted(destf) in Hive could end up throwing > FileNotFoundException as the destf is not present yet. This causes moveFile > to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281908#comment-15281908 ] Ashutosh Chauhan edited comment on HIVE-13743 at 5/12/16 6:50 PM: -- This is because of change in behavior of HDFS from 2.6 to 2.8 wherein api hdfsAdmin.getEncryptionZoneForPath(path) used to return null for non-existent path in 2.6, now throws FNFE. [~rajesh.balamohan] Can you test this out in 2.8 cluster? Can't write unit test for this since Hive currently uses 2.6 hadoop was (Author: ashutoshc): This is because of change in behavior of HDFS from 2.6 to 2.8 wherein api hdfsAdmin.getEncryptionZoneForPath(path) used to return null for non-existent path in 2.6, now throws FNFE. [~rajesh.balamohan] Can you test this out in 2.8 cluster? Can't return unit test for this since Hive currently uses 2.6 hadoop/ > Data move codepath is broken with hive (2.1.0-SNAPSHOT) > --- > > Key: HIVE-13743 > URL: https://issues.apache.org/jira/browse/HIVE-13743 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Ashutosh Chauhan > Attachments: HIVE-13743.patch > > > Data move codepath is broken with hive 2.1.0-SNAPSHOT with hadoop > 2.8.0-snapshot. 
> {noformat} > Caused by: > org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path > not found: /apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > at > org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getEZForPath(FSDirEncryptionZoneOp.java:178) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:7336) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1973) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1376) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:645) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2339) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2335) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2333) > at org.apache.hadoop.ipc.Client.call(Client.java:1448) > at org.apache.hadoop.ipc.Client.call(Client.java:1385) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy30.getEZForPath(Unknown > Source)/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/ > ... > ... > ... > 2016-05-11T09:40:43,760 ERROR [main]: ql.Driver (:()) - FAILED: Execution > Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. 
Unable to > move source > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/.hive-staging_hive_2016-05-11_09-40-42_489_5056654133706433454-1/-ext-10002 > to destination > hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1 > {noformat} > https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2836 > hdfsEncryptionShim.isPathEncrypted(destf) in Hive could end up throwing > FileNotFoundException as the destf is not present yet. This causes moveFile > to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281880#comment-15281880 ] Eugene Koifman commented on HIVE-13249: --- what I meant is
{noformat}
public OpenTxnsResponse openTxns(OpenTxnRequest rqst) throws MetaException {
  if (openTxnsCounter == null) {
    synchronized (TxnHandler.class) {
      try {
        if (openTxnsCounter == null) {
          startHouseKeeperService(conf, Class.forName("org.apache.hadoop.hive.ql.txn.AcidOpenTxnsCounterService"));
        }
      } catch (Exception e) {
        throw new MetaException(e.getMessage());
      }
    }
  }
{noformat}
http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html One more thing: startHouseKeeperService() catches Exception and logs, but when openTxns() calls startHouseKeeperService() it catches and rethrows. That seems contradictory. Did you want to fail all txns if this service is not available, or make a best effort? > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, > HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, > HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch, HIVE-13249.9.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
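The double-checked locking pattern Eugene references can be illustrated with a self-contained sketch. The class and field names below are hypothetical stand-ins for TxnHandler's counter service, not the actual Hive code: the lazily initialized field must be volatile, and the null check must be repeated inside the synchronized block, or concurrent callers can still start the service twice.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyServiceHolder {
    static final AtomicInteger starts = new AtomicInteger();  // counts service startups
    private static volatile Object service;  // volatile is required for safe DCL

    static Object getService() {
        Object local = service;
        if (local == null) {                        // first check, no lock taken
            synchronized (LazyServiceHolder.class) {
                local = service;
                if (local == null) {                // second check, under the lock
                    starts.incrementAndGet();
                    local = new Object();           // stand-in for startHouseKeeperService(...)
                    service = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable r = LazyServiceHolder::getService;
        Thread t1 = new Thread(r), t2 = new Thread(r);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(starts.get());  // 1: the service is started exactly once
    }
}
```

Without the volatile modifier, the Java memory model permits a thread to observe a partially constructed object through the unsynchronized first check, which is the hazard the linked Pugh page describes.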
[jira] [Updated] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13249: - Attachment: HIVE-13249.9.patch Thanks for catching that. Patch 9 fixed it. > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, > HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, > HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch, HIVE-13249.9.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281868#comment-15281868 ] Eugene Koifman commented on HIVE-13249: ---
{noformat}
public OpenTxnsResponse openTxns(OpenTxnRequest rqst) throws MetaException {
  if (openTxnsCounter == null) {
    try {
      startHouseKeeperService(conf, Class.forName("org.apache.hadoop.hive.ql.txn.AcidOpenTxnsCounterService"));
    } catch (Exception e) {
      throw new MetaException(e.getMessage());
    }
  }
{noformat}
this is not thread safe. Concurrent openTxns() calls can create multiple instances of AcidOpenTxnsCounterService.
{noformat}
public OpenTxnsResponse openTxns(OpenTxnRequest rqst) throws MetaException {
  if (openTxnsCounter == null) {
    synchronized (TxnHandler.class) {
      try {
        startHouseKeeperService(conf, Class.forName("org.apache.hadoop.hive.ql.txn.AcidOpenTxnsCounterService"));
      } catch (Exception e) {
        throw new MetaException(e.getMessage());
      }
    }
  }
{noformat}
would work > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, > HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, > HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11417) Create shims for the row by row read path that is backed by VectorizedRowBatch
[ https://issues.apache.org/jira/browse/HIVE-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281850#comment-15281850 ] Prasanth Jayachandran commented on HIVE-11417: -- Changes lgtm, +1 > Create shims for the row by row read path that is backed by VectorizedRowBatch > -- > > Key: HIVE-11417 > URL: https://issues.apache.org/jira/browse/HIVE-11417 > Project: Hive > Issue Type: Sub-task >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.1.0 > > Attachments: HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch, > HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch > > > I'd like to make the default path for reading and writing ORC files to be > vectorized. To ensure that Hive can still read row by row, we'll need shims > to support the old API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13249: - Attachment: HIVE-13249.8.patch Upload patch 8. Made maxOpenTxns, numOpenTxns, tooManyOpenTxns volatile; Changed LOG.warn to LOG.error; Removed OpenTxnsCounter from MUTEX_KEY; Moved OpenTxnsCounter housekeeper service startup logic from HiveMetaStore to TxnHandler.openTxns. [~ekoifman] Could you please review? > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, > HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, > HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13621) compute stats in certain cases fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13621: --- Fix Version/s: 2.1.0 > compute stats in certain cases fails with NPE > - > > Key: HIVE-13621 > URL: https://issues.apache.org/jira/browse/HIVE-13621 > Project: Hive > Issue Type: Bug > Components: HBase Metastore, Metastore >Affects Versions: 2.1.0, 2.0.1 >Reporter: Vikram Dixit K >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-13621.1.patch, HIVE-13621.2.patch > > > {code} > FAILED: NullPointerException null > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:693) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:739) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:728) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:183) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13621) compute stats in certain cases fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13621: --- Resolution: Fixed Status: Resolved (was: Patch Available) > compute stats in certain cases fails with NPE > - > > Key: HIVE-13621 > URL: https://issues.apache.org/jira/browse/HIVE-13621 > Project: Hive > Issue Type: Bug > Components: HBase Metastore, Metastore >Affects Versions: 2.1.0, 2.0.1 >Reporter: Vikram Dixit K >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-13621.1.patch, HIVE-13621.2.patch > > > {code} > FAILED: NullPointerException null > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:693) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:739) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:728) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:183) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13621) compute stats in certain cases fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281785#comment-15281785 ] Pengcheng Xiong commented on HIVE-13621: Reran all those Spark tests; none of them fail. Also checked the other failures; they are unrelated. Pushed to master. Thanks [~vikram.dixit] and [~hagleitn] for the review and comments! > compute stats in certain cases fails with NPE > - > > Key: HIVE-13621 > URL: https://issues.apache.org/jira/browse/HIVE-13621 > Project: Hive > Issue Type: Bug > Components: HBase Metastore, Metastore >Affects Versions: 2.1.0, 2.0.1 >Reporter: Vikram Dixit K >Assignee: Pengcheng Xiong > Attachments: HIVE-13621.1.patch, HIVE-13621.2.patch > > > {code} > FAILED: NullPointerException null > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:693) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:739) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:728) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:183) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11417) Create shims for the row by row read path that is backed by VectorizedRowBatch
[ https://issues.apache.org/jira/browse/HIVE-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-11417: - Attachment: HIVE-11417.patch This patch:
* addresses the review comments from Prasanth
* fixes a test failure where the schema evolution code from HIVE-13178 didn't work properly for vectorized binary -> string conversion.
Note that jenkins doesn't seem to be handling the binary file orc-file-11-format.orc even though git included it in the patch as a binary diff, which explains the test failures that mention the version 11 ORC file. > Create shims for the row by row read path that is backed by VectorizedRowBatch > -- > > Key: HIVE-11417 > URL: https://issues.apache.org/jira/browse/HIVE-11417 > Project: Hive > Issue Type: Sub-task >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.1.0 > > Attachments: HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch, > HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch > > > I'd like to make the default path for reading and writing ORC files to be > vectorized. To ensure that Hive can still read row by row, we'll need shims > to support the old API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13563) Hive Streaming does not honor orc.compress.size and orc.stripe.size table properties
[ https://issues.apache.org/jira/browse/HIVE-13563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281758#comment-15281758 ] Owen O'Malley commented on HIVE-13563: -- I think the ratio is better at 8, since we have minor compaction set to run when there are 10 deltas, so 8 is a better match to the increase in workload. How about:
ratio: 8
base: compression 128k, stripe 128mb
delta: compression 16k, stripe 16mb
> Hive Streaming does not honor orc.compress.size and orc.stripe.size table > properties > > > Key: HIVE-13563 > URL: https://issues.apache.org/jira/browse/HIVE-13563 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Labels: TODOC2.1 > Attachments: HIVE-13563.1.patch > > > According to the doc: > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-HiveQLSyntax > One should be able to specify tblproperties for many ORC options. > But the settings for orc.compress.size and orc.stripe.size don't take effect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
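For reference, these are the table-level ORC properties the issue says Hive Streaming ignores. The DDL below is illustrative only — the table name and columns are invented, and the byte values correspond to the delta-file sizing Owen proposes (16k compression buffer, 16mb stripes):

```sql
-- Illustrative sketch; not the reporter's actual table.
CREATE TABLE streamed_events (id INT, msg STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES (
  'transactional'     = 'true',
  'orc.compress.size' = '16384',     -- 16 KB compression chunk
  'orc.stripe.size'   = '16777216'   -- 16 MB stripes
);
```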
[jira] [Updated] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-13749: - Attachment: Top_Consumers7.html > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: Top_Consumers7.html > > > Looking at a heap dump of 10GB, a large number of Configuration objects (> 66k > instances) are being retained. These objects, along with their retained sets, are occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13697) ListBucketing feature does not support uppercase string.
[ https://issues.apache.org/jira/browse/HIVE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleksiy Sayankin updated HIVE-13697: Status: Patch Available (was: In Progress) ROOT-CAUSE: a toLowerCase() call while getting skewed values from the AST node in BaseSemanticAnalyzer. Hence skewed values are stored lower-case only.
{code}
hive> desc formatted testskew2;
OK
# col_name              data_type       comment
id                      int
a                       string

# Detailed Table Information
Database:               default
Owner:                  hdfs
CreateTime:             Thu May 12 18:37:20 EEST 2016
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs:/user/hive/warehouse/testskew2
Table Type:             MANAGED_TABLE
Table Parameters:
        transient_lastDdlTime   1463067440

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Stored As SubDirectories:       Yes
Skewed Columns:         [a]
Skewed Values:          [[aus], [us]]   < !!! ERROR !!!
Storage Desc Params:
        serialization.format    1
{code}
SOLUTION: Remove the unnecessary toLowerCase() call. > ListBucketing feature does not support uppercase string. > > > Key: HIVE-13697 > URL: https://issues.apache.org/jira/browse/HIVE-13697 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Affects Versions: 1.2.1 > Environment: 1.2.1 >Reporter: Hao Zhu >Assignee: Oleksiy Sayankin >Priority: Critical > Attachments: HIVE-13697.1.patch > > > This is the feature: > https://cwiki.apache.org/confluence/display/Hive/ListBucketing > 1. 
Good example: > {code} > CREATE TABLE testskew (id INT, a STRING) > SKEWED BY (a) ON ('abc', 'xyz') STORED AS DIRECTORIES; > set hive.mapred.supports.subdirectories=true; > set mapred.input.dir.recursive=true; > INSERT OVERWRITE TABLE testskew > SELECT 123,'abc' FROM dual > union all > SELECT 123,'xyz' FROM dual > union all > SELECT 123,'others' FROM dual; > {code} > {code} > # hadoop fs -ls /user/hive/warehouse/testskew > Found 3 items > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/a=abc > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/a=xyz > {code} > This is good, because both "abc" and "xyz" directories got created. > 2. Bad example -- This is the issue > {code} > CREATE TABLE testskew2 (id INT, a STRING) > SKEWED BY (a) ON ('aus', 'US') STORED AS DIRECTORIES; > set hive.mapred.supports.subdirectories=true; > set mapred.input.dir.recursive=true; > INSERT OVERWRITE TABLE testskew2 > SELECT 123, 'aus' FROM dual > union all > SELECT 123, 'US' FROM dual > union all > SELECT 123, 'others' FROM dual; > {code} > You can see, only "aus" directory got created... > {code} > # hadoop fs -ls /user/hive/warehouse/testskew2 > Found 2 items > drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 > /user/hive/warehouse/testskew2/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 > /user/hive/warehouse/testskew2/a=aus > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
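The root cause Oleksiy describes can be modeled without Hive: if skewed values are lower-cased at analysis time but later compared against the original literals, any mixed-case value like 'US' no longer matches, so its directory is never created. A minimal, self-contained illustration — the method and class names are hypothetical, not the actual BaseSemanticAnalyzer code:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SkewedValueCaseSketch {
    // Models the buggy analysis step: skewed values are lower-cased when read
    // from the AST, so 'US' is stored as 'us'.
    static Set<String> analyzeSkewedValues(List<String> declared) {
        Set<String> stored = new HashSet<>();
        for (String v : declared) {
            stored.add(v.toLowerCase());
        }
        return stored;
    }

    // Models the write path: a row gets its own skewed directory only when its
    // value matches a stored skewed value exactly.
    static boolean getsOwnDirectory(Set<String> skewed, String rowValue) {
        return skewed.contains(rowValue);
    }

    public static void main(String[] args) {
        Set<String> skewed = analyzeSkewedValues(Arrays.asList("aus", "US"));
        System.out.println(getsOwnDirectory(skewed, "aus")); // true
        System.out.println(getsOwnDirectory(skewed, "US"));  // false: a=US dir never created
    }
}
```

Dropping the toLowerCase() call in the analysis step makes the lookup for "US" succeed, which is exactly the fix in HIVE-13697.1.patch.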
[jira] [Updated] (HIVE-13697) ListBucketing feature does not support uppercase string.
[ https://issues.apache.org/jira/browse/HIVE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleksiy Sayankin updated HIVE-13697: Attachment: HIVE-13697.1.patch > ListBucketing feature does not support uppercase string. > > > Key: HIVE-13697 > URL: https://issues.apache.org/jira/browse/HIVE-13697 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Affects Versions: 1.2.1 > Environment: 1.2.1 >Reporter: Hao Zhu >Assignee: Oleksiy Sayankin >Priority: Critical > Attachments: HIVE-13697.1.patch > > > This is the feature: > https://cwiki.apache.org/confluence/display/Hive/ListBucketing > 1. Good example: > {code} > CREATE TABLE testskew (id INT, a STRING) > SKEWED BY (a) ON ('abc', 'xyz') STORED AS DIRECTORIES; > set hive.mapred.supports.subdirectories=true; > set mapred.input.dir.recursive=true; > INSERT OVERWRITE TABLE testskew > SELECT 123,'abc' FROM dual > union all > SELECT 123,'xyz' FROM dual > union all > SELECT 123,'others' FROM dual; > {code} > {code} > # hadoop fs -ls /user/hive/warehouse/testskew > Found 3 items > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/a=abc > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/a=xyz > {code} > This is good, because both "abc" and "xyz" directories got created. > 2. Bad example -- This is the issue > {code} > CREATE TABLE testskew2 (id INT, a STRING) > SKEWED BY (a) ON ('aus', 'US') STORED AS DIRECTORIES; > set hive.mapred.supports.subdirectories=true; > set mapred.input.dir.recursive=true; > INSERT OVERWRITE TABLE testskew2 > SELECT 123, 'aus' FROM dual > union all > SELECT 123, 'US' FROM dual > union all > SELECT 123, 'others' FROM dual; > {code} > You can see, only "aus" directory got created... 
> {code} > # hadoop fs -ls /user/hive/warehouse/testskew2 > Found 2 items > drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 > /user/hive/warehouse/testskew2/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 > /user/hive/warehouse/testskew2/a=aus > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-13697) ListBucketing feature does not support uppercase string.
[ https://issues.apache.org/jira/browse/HIVE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-13697 started by Oleksiy Sayankin. --- > ListBucketing feature does not support uppercase string. > > > Key: HIVE-13697 > URL: https://issues.apache.org/jira/browse/HIVE-13697 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Affects Versions: 1.2.1 > Environment: 1.2.1 >Reporter: Hao Zhu >Assignee: Oleksiy Sayankin >Priority: Critical > > This is the feature: > https://cwiki.apache.org/confluence/display/Hive/ListBucketing > 1. Good example: > {code} > CREATE TABLE testskew (id INT, a STRING) > SKEWED BY (a) ON ('abc', 'xyz') STORED AS DIRECTORIES; > set hive.mapred.supports.subdirectories=true; > set mapred.input.dir.recursive=true; > INSERT OVERWRITE TABLE testskew > SELECT 123,'abc' FROM dual > union all > SELECT 123,'xyz' FROM dual > union all > SELECT 123,'others' FROM dual; > {code} > {code} > # hadoop fs -ls /user/hive/warehouse/testskew > Found 3 items > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/a=abc > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/a=xyz > {code} > This is good, because both "abc" and "xyz" directories got created. > 2. Bad example -- This is the issue > {code} > CREATE TABLE testskew2 (id INT, a STRING) > SKEWED BY (a) ON ('aus', 'US') STORED AS DIRECTORIES; > set hive.mapred.supports.subdirectories=true; > set mapred.input.dir.recursive=true; > INSERT OVERWRITE TABLE testskew2 > SELECT 123, 'aus' FROM dual > union all > SELECT 123, 'US' FROM dual > union all > SELECT 123, 'others' FROM dual; > {code} > You can see, only "aus" directory got created... 
> {code} > # hadoop fs -ls /user/hive/warehouse/testskew2 > Found 2 items > drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 > /user/hive/warehouse/testskew2/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 > /user/hive/warehouse/testskew2/a=aus > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9615) Provide limit context for storage handlers
[ https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281674#comment-15281674 ] Hive QA commented on HIVE-9615: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12700308/HIVE-9615.2.patch.txt {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/245/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/245/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-245/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m 
-Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-245/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive 38797d2..64c96e1 master -> origin/master + git reset --hard HEAD HEAD is now at 38797d2 HIVE-13670 : Improve Beeline connect/reconnect semantics (Sushanth Sowmyan, reviewed by Thejas Nair) + git clean -f -d Removing ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregatePullUpConstantsRule.java Removing ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveProjectFilterPullUpConstantsRule.java + git checkout master Already on 'master' Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded. + git reset --hard origin/master HEAD is now at 64c96e1 HIVE-13726 : Improve dynamic partition loading VI (Ashutosh Chauhan via Rui Li) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12700308 - PreCommit-HIVE-MASTER-Build > Provide limit context for storage handlers > -- > > Key: HIVE-9615 > URL: https://issues.apache.org/jira/browse/HIVE-9615 > Project: Hive > Issue Type: Improvement > Components: StorageHandler >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-9615.1.patch.txt, HIVE-9615.2.patch.txt > > > Propagate limit context generated from GlobalLimitOptimizer to storage > handlers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
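Propagating a limit hint to a storage handler, as HIVE-9615 describes, generally means handing the row cap computed by GlobalLimitOptimizer to the handler so it can stop fetching early. A minimal sketch of that idea — the class and method names below are invented for illustration and are not Hive's actual API:

```java
// Hypothetical: how an optimizer-computed limit could reach a storage handler.
public class LimitAwareHandler {
  private long rowLimit = Long.MAX_VALUE;  // default: no limit known

  // Called by the planner when the optimizer has proven that at most
  // 'limit' rows are needed by the query.
  public void setRowLimit(long limit) {
    this.rowLimit = limit;
  }

  // The handler's scan loop can consult this to short-circuit reads.
  public boolean shouldStop(long rowsProduced) {
    return rowsProduced >= rowLimit;
  }
}
```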
[jira] [Commented] (HIVE-13068) Disable Hive ConstantPropagate optimizer when CBO has optimized the plan II
[ https://issues.apache.org/jira/browse/HIVE-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281666#comment-15281666 ] Hive QA commented on HIVE-13068: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12803199/HIVE-13068.03.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 187 failed/errored test(s), 9983 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_sortmerge_join_7.q-tez_union_group_by.q-orc_merge9.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-schema_evol_orc_nonvec_mapwork_part.q-mapjoin_decimal.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_explain org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join33 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_nulls org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_outer_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_subq_not_in org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udaf_percentile_approx_23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_subq_not_in org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_colstats_all_nulls org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropWhen 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog_semijoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_genericudf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join_merge org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join42 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_alt_syntax org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_reorder org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_masking_2 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonblock_op_deduplicate org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_expr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_repeated_alias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_recursive_dir org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semijoin4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
[jira] [Updated] (HIVE-13726) Improve dynamic partition loading VI
[ https://issues.apache.org/jira/browse/HIVE-13726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13726: Fix Version/s: 2.1.0 > Improve dynamic partition loading VI > > > Key: HIVE-13726 > URL: https://issues.apache.org/jira/browse/HIVE-13726 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 1.2.0, 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Fix For: 2.1.0 > > Attachments: HIVE-13726.2.patch, HIVE-13726.patch > > > Parallelize deletes and other refactoring. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13726) Improve dynamic partition loading VI
[ https://issues.apache.org/jira/browse/HIVE-13726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13726: Resolution: Fixed Status: Resolved (was: Patch Available) Did a QA run. No new failures because of the patch. Pushed to master. Thanks, Rui for the review. > Improve dynamic partition loading VI > > > Key: HIVE-13726 > URL: https://issues.apache.org/jira/browse/HIVE-13726 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Affects Versions: 1.2.0, 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-13726.2.patch, HIVE-13726.patch > > > Parallelize deletes and other refactoring. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
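The "parallelize deletes" refactoring in HIVE-13726 amounts to fanning file deletions out over a thread pool instead of looping serially. A minimal sketch of the general technique — this is illustrative, not the patch's actual code, and it uses java.nio paths rather than Hadoop's FileSystem API:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelDeleter {
  // Deletes every path on a fixed-size pool; returns how many were removed.
  public static int deleteAll(List<Path> paths, int threads) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Boolean>> results = new ArrayList<>();
      for (Path p : paths) {
        results.add(pool.submit(() -> Files.deleteIfExists(p)));
      }
      int deleted = 0;
      for (Future<Boolean> f : results) {
        try {
          if (f.get()) deleted++;
        } catch (ExecutionException e) {
          // One failed delete should not abort the others; real code would log it.
        }
      }
      return deleted;
    } finally {
      pool.shutdown();
    }
  }
}
```

Collecting the futures before draining them keeps all deletes in flight concurrently, which is where the speedup over a serial loop comes from.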
[jira] [Commented] (HIVE-13507) Improved logging for ptest
[ https://issues.apache.org/jira/browse/HIVE-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281584#comment-15281584 ] Sergio Peña commented on HIVE-13507: Those tests are not failing in the last completed build. I don't know what happened, but I'll continue watching the builds. > Improved logging for ptest > -- > > Key: HIVE-13507 > URL: https://issues.apache.org/jira/browse/HIVE-13507 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Fix For: 2.1.0 > > Attachments: HIVE-13507.01.patch, HIVE-13507.02.patch > > > NO PRECOMMIT TESTS > Include information about batch runtimes, outlier lists, host completion > times, etc. Try identifying tests which cause the build to take a long time > while holding onto resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13696) Monitor fair-scheduler.xml and automatically update/validate jobs submitted to fair-scheduler
[ https://issues.apache.org/jira/browse/HIVE-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reuben Kuhnert updated HIVE-13696: -- Status: Patch Available (was: Open) > Monitor fair-scheduler.xml and automatically update/validate jobs submitted > to fair-scheduler > - > > Key: HIVE-13696 > URL: https://issues.apache.org/jira/browse/HIVE-13696 > Project: Hive > Issue Type: Improvement >Reporter: Reuben Kuhnert >Assignee: Reuben Kuhnert > Attachments: HIVE-13696.01.patch, HIVE-13696.02.patch, > HIVE-13696.06.patch > > > Ensure that jobs are placed into the correct queue according to > {{fair-scheduler.xml}}. Jobs should be placed into the correct queue, and > users should not be able to submit jobs to queues they do not have access to. > This patch builds on the existing functionality in {{FairSchedulerShim}} to > route jobs to user-specific queue based on {{fair-scheduler.xml}} > configuration (leveraging the Yarn {{QueuePlacementPolicy}} class). In > addition to configuring job routing at session connect (current behavior), > the routing is validated per submission to yarn (when impersonation is off). > A {{FileSystemWatcher}} class is included to monitor changes in the > {{fair-scheduler.xml}} file (so updates are automatically reloaded when the > file pointed to by {{yarn.scheduler.fair.allocation.file}} is changed). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13696) Monitor fair-scheduler.xml and automatically update/validate jobs submitted to fair-scheduler
[ https://issues.apache.org/jira/browse/HIVE-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reuben Kuhnert updated HIVE-13696: -- Attachment: HIVE-13696.06.patch > Monitor fair-scheduler.xml and automatically update/validate jobs submitted > to fair-scheduler > - > > Key: HIVE-13696 > URL: https://issues.apache.org/jira/browse/HIVE-13696 > Project: Hive > Issue Type: Improvement >Reporter: Reuben Kuhnert >Assignee: Reuben Kuhnert > Attachments: HIVE-13696.01.patch, HIVE-13696.02.patch, > HIVE-13696.06.patch > > > Ensure that jobs are placed into the correct queue according to > {{fair-scheduler.xml}}. Jobs should be placed into the correct queue, and > users should not be able to submit jobs to queues they do not have access to. > This patch builds on the existing functionality in {{FairSchedulerShim}} to > route jobs to user-specific queue based on {{fair-scheduler.xml}} > configuration (leveraging the Yarn {{QueuePlacementPolicy}} class). In > addition to configuring job routing at session connect (current behavior), > the routing is validated per submission to yarn (when impersonation is off). > A {{FileSystemWatcher}} class is included to monitor changes in the > {{fair-scheduler.xml}} file (so updates are automatically reloaded when the > file pointed to by {{yarn.scheduler.fair.allocation.file}} is changed). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13696) Monitor fair-scheduler.xml and automatically update/validate jobs submitted to fair-scheduler
[ https://issues.apache.org/jira/browse/HIVE-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reuben Kuhnert updated HIVE-13696: -- Status: Open (was: Patch Available) > Monitor fair-scheduler.xml and automatically update/validate jobs submitted > to fair-scheduler > - > > Key: HIVE-13696 > URL: https://issues.apache.org/jira/browse/HIVE-13696 > Project: Hive > Issue Type: Improvement >Reporter: Reuben Kuhnert >Assignee: Reuben Kuhnert > Attachments: HIVE-13696.01.patch, HIVE-13696.02.patch, > HIVE-13696.06.patch > > > Ensure that jobs are placed into the correct queue according to > {{fair-scheduler.xml}}. Jobs should be placed into the correct queue, and > users should not be able to submit jobs to queues they do not have access to. > This patch builds on the existing functionality in {{FairSchedulerShim}} to > route jobs to user-specific queue based on {{fair-scheduler.xml}} > configuration (leveraging the Yarn {{QueuePlacementPolicy}} class). In > addition to configuring job routing at session connect (current behavior), > the routing is validated per submission to yarn (when impersonation is off). > A {{FileSystemWatcher}} class is included to monitor changes in the > {{fair-scheduler.xml}} file (so updates are automatically reloaded when the > file pointed to by {{yarn.scheduler.fair.allocation.file}} is changed). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
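The FileSystemWatcher described in HIVE-13696 follows a standard pattern: register the directory containing the config file with a watch service and reload when a create/modify event for that file arrives. A minimal sketch using java.nio.file.WatchService — illustrative only, not the patch's actual class:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

public class ConfigFileWatcher {
  // Blocks for up to timeoutMs and reports whether the named file inside
  // 'dir' was created or modified in that window.
  public static boolean fileChanged(Path dir, String fileName, long timeoutMs)
      throws IOException, InterruptedException {
    try (WatchService ws = dir.getFileSystem().newWatchService()) {
      dir.register(ws, StandardWatchEventKinds.ENTRY_CREATE,
                       StandardWatchEventKinds.ENTRY_MODIFY);
      long deadline = System.currentTimeMillis() + timeoutMs;
      while (true) {
        long wait = deadline - System.currentTimeMillis();
        if (wait <= 0) return false;
        WatchKey key = ws.poll(wait, TimeUnit.MILLISECONDS);
        if (key == null) return false;               // timed out, no change
        for (WatchEvent<?> ev : key.pollEvents()) {
          if (fileName.equals(ev.context().toString())) return true;
        }
        key.reset();                                 // keep watching other events
      }
    }
  }
}
```

A real watcher would run this loop on a daemon thread and invoke a reload callback instead of returning; watching the parent directory (rather than the file itself) is required because WatchService only registers directories.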
[jira] [Commented] (HIVE-13293) Query occurs performance degradation after enabling parallel order by for Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281488#comment-15281488 ] Hive QA commented on HIVE-13293: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12803166/HIVE-13293.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 68 failed/errored test(s), 9194 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file TestMiniTezCliDriver-bucket_map_join_tez1.q-auto_sortmerge_join_16.q-skewjoin.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-schema_evol_orc_nonvec_mapwork_part.q-mapjoin_decimal.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - did not produce a TEST-*.xml file TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_transform.q-union_remove_7.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_stats org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2_map_skew org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_noskew_multi_single_reducer org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_ppr org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join34 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join35 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join6 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part2 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt14 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt16 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_17 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_3 org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparatorWithinDagPriority org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote
[jira] [Assigned] (HIVE-13697) ListBucketing feature does not support uppercase string.
[ https://issues.apache.org/jira/browse/HIVE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleksiy Sayankin reassigned HIVE-13697: --- Assignee: Oleksiy Sayankin > ListBucketing feature does not support uppercase string. > > > Key: HIVE-13697 > URL: https://issues.apache.org/jira/browse/HIVE-13697 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Affects Versions: 1.2.1 > Environment: 1.2.1 >Reporter: Hao Zhu >Assignee: Oleksiy Sayankin >Priority: Critical > > This is the feature: > https://cwiki.apache.org/confluence/display/Hive/ListBucketing > 1. Good example: > {code} > CREATE TABLE testskew (id INT, a STRING) > SKEWED BY (a) ON ('abc', 'xyz') STORED AS DIRECTORIES; > set hive.mapred.supports.subdirectories=true; > set mapred.input.dir.recursive=true; > INSERT OVERWRITE TABLE testskew > SELECT 123,'abc' FROM dual > union all > SELECT 123,'xyz' FROM dual > union all > SELECT 123,'others' FROM dual; > {code} > {code} > # hadoop fs -ls /user/hive/warehouse/testskew > Found 3 items > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/a=abc > drwxrwxrwx - mapr mapr 1 2016-05-05 14:56 > /user/hive/warehouse/testskew/a=xyz > {code} > This is good, because both "abc" and "xyz" directories got created. > 2. Bad example -- This is the issue > {code} > CREATE TABLE testskew2 (id INT, a STRING) > SKEWED BY (a) ON ('aus', 'US') STORED AS DIRECTORIES; > set hive.mapred.supports.subdirectories=true; > set mapred.input.dir.recursive=true; > INSERT OVERWRITE TABLE testskew2 > SELECT 123, 'aus' FROM dual > union all > SELECT 123, 'US' FROM dual > union all > SELECT 123, 'others' FROM dual; > {code} > You can see, only "aus" directory got created... 
> {code} > # hadoop fs -ls /user/hive/warehouse/testskew2 > Found 2 items > drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 > /user/hive/warehouse/testskew2/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > drwxrwxrwx - mapr mapr 1 2016-05-05 15:11 > /user/hive/warehouse/testskew2/a=aus > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
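At its core, list bucketing maps each row's skew-column value to a per-value directory ("a=abc") when the value matches a declared skewed value, and to the shared HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME otherwise. The report above shows the uppercase value 'US' falling through to the default directory. A hypothetical sketch of the intended, case-sensitive mapping — for illustration only, not Hive's actual implementation:

```java
import java.util.Set;

public class SkewDirMapper {
  static final String DEFAULT_DIR = "HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME";

  // Case-sensitive match: 'US' and 'us' must be treated as distinct values,
  // so SKEWED BY (a) ON ('aus', 'US') yields directories a=aus and a=US.
  public static String dirFor(String column, String value, Set<String> skewedValues) {
    return skewedValues.contains(value) ? column + "=" + value : DEFAULT_DIR;
  }
}
```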
[jira] [Updated] (HIVE-13338) Differences in vectorized_casts.q output for vectorized and non-vectorized runs
[ https://issues.apache.org/jira/browse/HIVE-13338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13338: Status: Patch Available (was: In Progress) Jump on the merry-go-round again. > Differences in vectorized_casts.q output for vectorized and non-vectorized > runs > --- > > Key: HIVE-13338 > URL: https://issues.apache.org/jira/browse/HIVE-13338 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13338.01.patch, HIVE-13338.02.patch > > > Turn off vectorization and you get different results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13338) Differences in vectorized_casts.q output for vectorized and non-vectorized runs
[ https://issues.apache.org/jira/browse/HIVE-13338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13338: Attachment: HIVE-13338.02.patch > Differences in vectorized_casts.q output for vectorized and non-vectorized > runs > --- > > Key: HIVE-13338 > URL: https://issues.apache.org/jira/browse/HIVE-13338 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13338.01.patch, HIVE-13338.02.patch > > > Turn off vectorization and you get different results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13338) Differences in vectorized_casts.q output for vectorized and non-vectorized runs
[ https://issues.apache.org/jira/browse/HIVE-13338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13338: Status: In Progress (was: Patch Available) > Differences in vectorized_casts.q output for vectorized and non-vectorized > runs > --- > > Key: HIVE-13338 > URL: https://issues.apache.org/jira/browse/HIVE-13338 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13338.01.patch > > > Turn off vectorization and you get different results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10566) LLAP: Vector row extraction allocates new extractors per process method call instead of just once
[ https://issues.apache.org/jira/browse/HIVE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-10566. - Resolution: Fixed Fix Version/s: (was: 1.3.0) > LLAP: Vector row extraction allocates new extractors per process method call > instead of just once > - > > Key: HIVE-10566 > URL: https://issues.apache.org/jira/browse/HIVE-10566 > Project: Hive > Issue Type: Sub-task > Components: llap >Affects Versions: 1.2.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > Extractors for unused columns (common for tables with many columns) are > created for each batch instead of just once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10566) LLAP: Vector row extraction allocates new extractors per process method call instead of just once
[ https://issues.apache.org/jira/browse/HIVE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281414#comment-15281414 ] Matt McCline commented on HIVE-10566: - No longer an issue after VecText enhancement. > LLAP: Vector row extraction allocates new extractors per process method call > instead of just once > - > > Key: HIVE-10566 > URL: https://issues.apache.org/jira/browse/HIVE-10566 > Project: Hive > Issue Type: Sub-task > Components: llap >Affects Versions: 1.2.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > Extractors for unused columns (common for tables with many columns) are > created for each batch instead of just once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12562) Enabling native fast hash table can cause incorrect results
[ https://issues.apache.org/jira/browse/HIVE-12562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-12562. - Resolution: Duplicate Release Note: HIVE-13682 > Enabling native fast hash table can cause incorrect results > --- > > Key: HIVE-12562 > URL: https://issues.apache.org/jira/browse/HIVE-12562 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Matt McCline > > Enabling "hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled" > causes incorrect results when running with LLAP. > I believe this does not happen for simple container runs. However, it's > possible that caching of these tables, or using the same table more than once > causes issues - which may be seen with container reuse. > The results vary by a small percentage. > e.g. 82270, 82267 <- Two results for the same query run back to back. > cc [~mmccline] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13245) VectorDeserializeRow throws IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-13245. - Resolution: Duplicate Release Note: HIVE-13682 > VectorDeserializeRow throws IndexOutOfBoundsException > - > > Key: HIVE-13245 > URL: https://issues.apache.org/jira/browse/HIVE-13245 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran > > When running following query on TPCDS 1000 scale, VectorDeserializeRow threw > ArrayIndexOutOfBoundsException > {code:title=Query} > SELECT `customer_address`.`ca_zip` AS `ca_zip`, >`customer_demographics`.`cd_education_status` AS > `cd_education_status`, >Sum(`store_sales`.`ss_net_paid`) AS `SUM:SS_NET_PAID:ok` > FROM `store_sales` `store_sales` >INNER JOIN `customer` `customer` >ON ( `store_sales`.`ss_customer_sk` = > `customer`.`c_customer_sk` ) >INNER JOIN `customer_address` `customer_address` >ON ( `customer`.`c_current_addr_sk` = > `customer_address`.`ca_address_sk` ) >INNER JOIN `customer_demographics` `customer_demographics` >ON ( `customer`.`c_current_cdemo_sk` = > `customer_demographics`.`cd_demo_sk` ) > WHERE ( `customer`.`c_first_sales_date_sk` > 2452300 > AND `customer_demographics`.`cd_gender` = 'F' > AND `customer`.`c_current_addr_sk` IS NOT NULL > AND `store_sales`.`ss_sold_date_sk` IS NOT NULL > AND `customer`.`c_current_cdemo_sk` IS NOT NULL ) > GROUP BY `ca_zip`, > `cd_education_status`; > {code} > {code:title=Exception} > java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: > Hive Runtime Error while processing row > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:354) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:59) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:59) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:36) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:356) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) > ... 14 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:62) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86) > ... 
17 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.ArrayIndexOutOfBoundsException > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:392) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:143) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:121) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at >
[jira] [Resolved] (HIVE-12896) IndexArrayOutOfBoundsException during vectorized map join
[ https://issues.apache.org/jira/browse/HIVE-12896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-12896. - Resolution: Duplicate Release Note: HIVE-13682 > IndexArrayOutOfBoundsException during vectorized map join > - > > Key: HIVE-12896 > URL: https://issues.apache.org/jira/browse/HIVE-12896 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.0.0 >Reporter: Jason Dere >Assignee: Gopal V > Attachments: HIVE-12896.tar.gz, query.explain.txt > > > Trying a simple join on a couple of the TPCDS tables. Query works with > vectorization disabled. > {noformat} > select c_customer_sk, c_customer_id from > tpcds_bin_partitioned_orc_10.customer, > tpcds_bin_partitioned_orc_10.customer_demographics where c_current_cdemo_sk = > cd_demo_sk limit 20 > {noformat} > {noformat} > ], TaskAttempt 3 failed, info=[Error: Failure while running task: > attempt_1448429572030_8225_4_01_03_3:java.lang.RuntimeException: > java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: > Hive Runtime Error while processing row > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:351) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:59) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:59) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:36) > at 
org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:354) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) > ... 14 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86) > ... 17 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.ArrayIndexOutOfBoundsException > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:385) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852) > at > org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:115) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:114) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:168) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) > ... 
18 more > Caused by: java.lang.ArrayIndexOutOfBoundsException > at > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:152) > at > org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow$StringReaderByValue.apply(VectorDeserializeRow.java:345) > at > org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:684) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultSingleValue(VectorMapJoinGenerateResultOperator.java:183) > at >
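The ArrayIndexOutOfBoundsException in the trace above originates in BytesColumnVector.setVal while VectorDeserializeRow copies a matched hash-map value into the output batch. The sketch below is a hypothetical illustration only (the class, fields, and growth policy are invented, not Hive's actual code) of the general failure mode: copying a value larger than the column vector's shared byte buffer overruns it, while growing the buffer before the copy avoids the exception.

```java
// Hypothetical sketch, not Hive's BytesColumnVector: a value longer than the
// shared backing buffer would make System.arraycopy throw
// ArrayIndexOutOfBoundsException unless the buffer is grown first.
public class SetValSketch {
    private byte[] buffer = new byte[16]; // shared backing buffer (small on purpose)
    private int nextFree = 0;
    public int start;
    public int length;

    public void setVal(byte[] sourceBuf, int srcStart, int srcLength) {
        if (nextFree + srcLength > buffer.length) {
            // Grow instead of letting the copy below run past the end.
            int newSize = Math.max(buffer.length * 2, nextFree + srcLength);
            byte[] newBuffer = new byte[newSize];
            System.arraycopy(buffer, 0, newBuffer, 0, nextFree);
            buffer = newBuffer;
        }
        System.arraycopy(sourceBuf, srcStart, buffer, nextFree, srcLength);
        start = nextFree;
        length = srcLength;
        nextFree += srcLength;
    }

    public static void main(String[] args) {
        SetValSketch v = new SetValSketch();
        byte[] big = new byte[64]; // larger than the initial 16-byte buffer
        v.setVal(big, 0, big.length); // would overflow without the growth check
        System.out.println(v.length); // prints 64
    }
}
```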
[jira] [Commented] (HIVE-13682) EOFException with fast hashtable
[ https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281412#comment-15281412 ] Matt McCline commented on HIVE-13682: - More extensive Fast Hash Table Unit Tests uncovered problems in reading the lengths of following Fast Hash Map records. See VectorMapJoinFastValueStore. And, a few minor issues with maintaining keysAssigned counter. Lots of new Unit Tests for Fast SerializeWrite/DeserializeRead. > EOFException with fast hashtable > > > Key: HIVE-13682 > URL: https://issues.apache.org/jira/browse/HIVE-13682 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Matt McCline > Attachments: HIVE-13682.01.patch > > > While testing something else on recent master, w/Tez 0.8.3, this happened > (TPCDS q27) > {noformat} > Caused by: java.util.concurrent.ExecutionException: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399) > ... 20 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131) > ... 
4 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104) > ... 5 more > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54) > at > org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98) > ... 9 more > {noformat} > There's no error if fast hashtable is disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
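The comment above points at misread record lengths in VectorMapJoinFastValueStore. The following is a simplified, hypothetical model of a length-prefixed value store (one-byte lengths; all names invented, not Hive's encoding) that shows why a single wrong length cascades: each record's position is derived from the lengths read before it, so the first desynchronized read runs past the end of the buffer, the analogue of the EOFException in the stack trace.

```java
import java.util.Arrays;

// Hypothetical model of a length-prefixed value store, not Hive's
// VectorMapJoinFastValueStore: values are packed back-to-back, each preceded
// by a one-byte length.
public class ValueStoreSketch {
    private byte[] store = new byte[1024];
    private int writePos = 0;
    private int readPos = 0;

    // Append a value (< 256 bytes in this toy) with a one-byte length prefix.
    public void append(byte[] value) {
        store[writePos++] = (byte) value.length;
        System.arraycopy(value, 0, store, writePos, value.length);
        writePos += value.length;
    }

    // Sequentially read the next value. One misread length shifts every
    // following read; the bounds check is this sketch's EOFException analogue.
    public byte[] readNext() {
        int len = store[readPos] & 0xFF;
        if (readPos + 1 + len > writePos) {
            throw new IllegalStateException("read past end of store");
        }
        byte[] value = Arrays.copyOfRange(store, readPos + 1, readPos + 1 + len);
        readPos += 1 + len;
        return value;
    }

    public static void main(String[] args) {
        ValueStoreSketch s = new ValueStoreSketch();
        s.append("first".getBytes());
        s.append("second-value".getBytes());
        System.out.println(new String(s.readNext())); // prints first
        System.out.println(new String(s.readNext())); // prints second-value
    }
}
```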
[jira] [Updated] (HIVE-13682) EOFException with fast hashtable
[ https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13682: Attachment: HIVE-13682.01.patch > EOFException with fast hashtable > > > Key: HIVE-13682 > URL: https://issues.apache.org/jira/browse/HIVE-13682 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Matt McCline > Attachments: HIVE-13682.01.patch > > > While testing something else on recent master, w/Tez 0.8.3, this happened > (TPCDS q27) > {noformat} > Caused by: java.util.concurrent.ExecutionException: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399) > ... 20 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131) > ... 4 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104) > ... 
5 more > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54) > at > org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98) > ... 9 more > {noformat} > There's no error if fast hashtable is disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13682) EOFException with fast hashtable
[ https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13682: Status: Patch Available (was: Open) > EOFException with fast hashtable > > > Key: HIVE-13682 > URL: https://issues.apache.org/jira/browse/HIVE-13682 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Matt McCline > Attachments: HIVE-13682.01.patch > > > While testing something else on recent master, w/Tez 0.8.3, this happened > (TPCDS q27) > {noformat} > Caused by: java.util.concurrent.ExecutionException: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > at > org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399) > ... 20 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131) > ... 4 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104) > ... 
5 more > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54) > at > org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98) > ... 9 more > {noformat} > There's no error if fast hashtable is disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13621) compute stats in certain cases fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281366#comment-15281366 ] Hive QA commented on HIVE-13621: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12803154/HIVE-13621.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 66 failed/errored test(s), 9215 tests executed *Failed tests:* {noformat} TestCompactor - did not produce a TEST-*.xml file TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-smb_cache.q-transform_ppr2.q-vector_outer_join0.q-and-5-more - did not produce a TEST-*.xml file TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt15.q-join39.q-avro_joins_native.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18_multi_distinct org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_avro_decimal_native 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin12 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_column_access_stats org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby1_map_skew org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby5_map org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby6_map org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_position org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join0 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join13 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join18_multi_distinct org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join30 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_array org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_unqual4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_reorder3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ptf_matchpath org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt12 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats16 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_udf_percentile org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union14 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union24 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union31 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union34 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_groupby_3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_0 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_2 org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorWithinDagPriority org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf org.apache.hadoop.hive.metastore.TestHiveMetaStoreWithEnvironmentContext.testEnvironmentContext org.apache.hadoop.hive.metastore.TestMetaStoreInitListener.testMetaStoreInitListener org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.org.apache.hadoop.hive.metastore.TestMetaStoreMetrics
[jira] [Updated] (HIVE-13747) NullPointerException thrown by Executors causes job can't be finished
[ https://issues.apache.org/jira/browse/HIVE-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HIVE-13747: - Issue Type: Sub-task (was: Bug) Parent: HIVE-7292 > NullPointerException thrown by Executors causes job can't be finished > - > > Key: HIVE-13747 > URL: https://issues.apache.org/jira/browse/HIVE-13747 > Project: Hive > Issue Type: Sub-task >Reporter: Walter Su > > stderr log from one executor. > {noformat} > 16/05/12 15:56:51 INFO exec.MapJoinOperator: Initializing operator MAPJOIN[10] > 16/05/12 15:56:51 INFO exec.CommonJoinOperator: JOIN > struct<_col0:int,_col1:string,_col2:int,_col3:string> totalsz = 4 > 16/05/12 15:56:51 INFO spark.HashTableLoader: *** Load from HashTable for > input file: hdfs://test-cluster/user/hive/warehouse-store2/pokes/kv1.txt > 16/05/12 15:56:51 INFO spark.HashTableLoader: Load back all hashtable > files from tmp folder > uri:hdfs://test-cluster/tmp/hive/hadoop/4062fcea-6759-4340-b4be-5e83181e68bf/hive_2016-05-12_15-56-50_196_4198620026582283764-1/-mr-10004/HashTable-Stage-1/MapJoin-mapfile11--.hashtable > 16/05/12 15:56:51 INFO exec.MapJoinOperator: Exception loading hash tables. > Clearing partially loaded hash table containers. 
> 16/05/12 15:56:51 ERROR executor.Executor: Exception in task 0.0 in stage 3.0 > (TID 3) > java.lang.RuntimeException: Map operator initialization failed: > org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:121) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:55) > at > org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) > at > org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:57) > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieveAsync(ObjectCache.java:63) > at > org.apache.hadoop.hive.ql.exec.ObjectCacheWrapper.retrieveAsync(ObjectCacheWrapper.java:46) > at > 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:173) > at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355) > at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504) > at > org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:457) > at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:365) > at > org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:112) > ... 15 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:151) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:299) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:180) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:176) > at > org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:55) > ... 23 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.isDedicatedCluster(SparkUtilities.java:118) > at > org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:158) > at >
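The NullPointerException above is raised inside SparkUtilities.isDedicatedCluster while HashTableLoader loads the small-table hash tables. As a hedged illustration only (the map-based configuration and the exact master-string checks here are assumptions, not Hive's implementation), the underlying pattern is dereferencing a configuration value that may legitimately be unset; a defensive lookup falls back to a default instead of crashing the executor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not the actual SparkUtilities code.
public class DedicatedClusterCheck {
    // Stand-in for a configuration lookup that may return null.
    static String getMaster(Map<String, String> conf) {
        return conf.get("spark.master");
    }

    static boolean isDedicatedCluster(Map<String, String> conf) {
        String master = getMaster(conf);
        if (master == null) {
            // Treat "unset" as non-dedicated instead of throwing NPE
            // on master.startsWith(...) below.
            return false;
        }
        return master.startsWith("yarn") || master.startsWith("spark");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(isDedicatedCluster(conf)); // prints false (no NPE)
        conf.put("spark.master", "yarn-cluster");
        System.out.println(isDedicatedCluster(conf)); // prints true
    }
}
```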
[jira] [Updated] (HIVE-13746) Data duplication when insert overwrite
[ https://issues.apache.org/jira/browse/HIVE-13746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Wailliam updated HIVE-13746: - Description: Data duplication when insert overwrite. The old data cannot be deleted. > Data duplication when insert overwrite > --- > > Key: HIVE-13746 > URL: https://issues.apache.org/jira/browse/HIVE-13746 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Bill Wailliam >Priority: Critical > > Data duplication when insert overwrite. The old data cannot be deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13370) Add test for HIVE-11470
[ https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-13370: Status: Open (was: Patch Available) > Add test for HIVE-11470 > --- > > Key: HIVE-13370 > URL: https://issues.apache.org/jira/browse/HIVE-13370 > Project: Hive > Issue Type: Bug >Reporter: Sushanth Sowmyan >Assignee: Sushanth Sowmyan > Attachments: HIVE-13370.patch > > > HIVE-11470 added capability to handle NULL dynamic partitioning keys > properly. However, it did not add a test for the case; we should have one so > we don't have future regressions of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)