[jira] [Commented] (HIVE-13730) hybridgrace_hashjoin_1.q test gets stuck

2016-05-12 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282431#comment-15282431
 ] 

Wei Zheng commented on HIVE-13730:
--

Here's a to-do item for after HIVE-13755 is fixed.
Right now the memory manager doesn't guarantee that enough memory is allocated for each 
table in the n-way join case. Once that issue is fixed, the assert below can be put 
into HybridHashTableContainer's constructor, after the variables have been determined.
{code}
assert writeBufferSize * (numPartitions - numPartitionsSpilledOnCreation) <= memoryThreshold :
    "hive.auto.convert.join.noconditionaltask.size is set too low. It's not enough to " +
    "allocate " + (numPartitions - numPartitionsSpilledOnCreation) + " partitions (each" +
    " of size " + writeBufferSize + ")";
{code}
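
For illustration only, the condition in that assert can be tried out in isolation. This is a minimal sketch: the class name and the byte values are made up, and only the variable names come from the comment above.

```java
// Sketch only: the class and the numbers are hypothetical; the variable names follow the comment.
final class WriteBufferBudget {
    /** True when each in-memory partition can get one write buffer within the threshold. */
    static boolean fits(long writeBufferSize, int numPartitions,
                        int numPartitionsSpilledOnCreation, long memoryThreshold) {
        long inMemoryPartitions = numPartitions - numPartitionsSpilledOnCreation;
        return writeBufferSize * inMemoryPartitions <= memoryThreshold;
    }

    public static void main(String[] args) {
        long writeBufferSize = 512L * 1024;      // hypothetical 512 KB per partition
        long memoryThreshold = 4L * 1024 * 1024; // hypothetical 4 MB budget
        // 16 partitions, 8 spilled on creation: 8 buffers need exactly 4 MB.
        System.out.println(fits(writeBufferSize, 16, 8, memoryThreshold));
        // 16 partitions, 4 spilled on creation: 12 buffers need 6 MB.
        System.out.println(fits(writeBufferSize, 16, 4, memoryThreshold));
    }
}
```

With these made-up numbers the first case fits and the second does not, which is the situation the assert message is meant to report.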

> hybridgrace_hashjoin_1.q test gets stuck
> 
>
> Key: HIVE-13730
> URL: https://issues.apache.org/jira/browse/HIVE-13730
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.1.0
>Reporter: Vikram Dixit K
>Assignee: Wei Zheng
>Priority: Blocker
> Attachments: HIVE-13730.1.patch
>
>
> I am seeing hybridgrace_hashjoin_1.q getting stuck on master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size

2016-05-12 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13751:
-
Attachment: HIVE-13751.2.patch

[~jdere] Addressed your review comments in this patch. 

> LlapOutputFormatService should have a configurable send buffer size
> ---
>
> Key: HIVE-13751
> URL: https://issues.apache.org/jira/browse/HIVE-13751
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13751.1.patch, HIVE-13751.2.patch
>
>
> The Netty channel buffer size is currently hard-coded to 128 KB. It should be made 
> configurable.
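
A minimal sketch of what "configurable" could look like, assuming the size is read from a property with the current 128 KB as the default. The property name and class are hypothetical and are not taken from the attached patches. The resulting value would then be handed to the channel, for example through Netty's SO_SNDBUF channel option.

```java
import java.util.Properties;

// Sketch only: KEY is a made-up property name, not a real Hive setting.
final class SendBufferConfig {
    static final String KEY = "example.llap.output.send.buffer.size";
    static final int DEFAULT_BYTES = 128 * 1024; // the value the description says is hard-coded

    /** Returns the configured size in bytes, or the default when the property is absent. */
    static int sendBufferSize(Properties conf) {
        String raw = conf.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_BYTES;
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException(KEY + " must be positive: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(sendBufferSize(conf)); // falls back to the default
        conf.setProperty(KEY, "262144");
        System.out.println(sendBufferSize(conf)); // uses the override
    }
}
```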





[jira] [Commented] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)

2016-05-12 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282397#comment-15282397
 ] 

Ashutosh Chauhan commented on HIVE-13743:
-

Thanks [~rajesh.balamohan] for verification. [~spena] can you take a quick look 
at the patch?

> Data move codepath is broken with hive (2.1.0-SNAPSHOT)
> ---
>
> Key: HIVE-13743
> URL: https://issues.apache.org/jira/browse/HIVE-13743
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13743.patch
>
>
> Data move codepath is broken with hive 2.1.0-SNAPSHOT with hadoop 
> 2.8.0-snapshot.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path 
> not found: /apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getEZForPath(FSDirEncryptionZoneOp.java:178)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:7336)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1973)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1376)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:645)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2339)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2335)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2333)
> at org.apache.hadoop.ipc.Client.call(Client.java:1448)
> at org.apache.hadoop.ipc.Client.call(Client.java:1385)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy30.getEZForPath(Unknown 
> Source)/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/
> ...
> ...
> ...
> 2016-05-11T09:40:43,760 ERROR [main]: ql.Driver (:()) - FAILED: Execution 
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to 
> move source 
> hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/.hive-staging_hive_2016-05-11_09-40-42_489_5056654133706433454-1/-ext-10002
>  to destination 
> hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1
> {noformat}
> https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2836
> hdfsEncryptionShim.isPathEncrypted(destf) in Hive could end up throwing 
> FileNotFoundException as the destf is not present yet.  This causes moveFile 
> to fail.
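
One possible guard for the failure described above, shown as a sketch rather than as what HIVE-13743.patch does: ask about the nearest existing ancestor of the destination instead of the destination itself. The EncryptionCheck interface is a hypothetical stand-in for the shim call, and the example uses the local file system instead of HDFS.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: EncryptionCheck is a made-up stand-in for a check that throws on missing paths.
final class EncryptionCheckGuard {
    interface EncryptionCheck {
        boolean isPathEncrypted(Path path) throws IOException;
    }

    /** Walks up to the nearest existing ancestor before asking, so a missing destination does not throw. */
    static boolean isDestinationEncrypted(Path dest, EncryptionCheck check) throws IOException {
        Path probe = dest.toAbsolutePath();
        while (probe != null && !Files.exists(probe)) {
            probe = probe.getParent();
        }
        if (probe == null) {
            throw new FileNotFoundException("No existing ancestor for " + dest);
        }
        return check.isPathEncrypted(probe);
    }

    /** Returns true if the guarded check completes for a destination that does not exist yet. */
    static boolean demo() {
        try {
            Path warehouse = Files.createTempDirectory("warehouse");
            Path dest = warehouse.resolve("date_dim1"); // deliberately not created
            EncryptionCheck strict = p -> {
                if (!Files.exists(p)) {
                    throw new FileNotFoundException("Path not found: " + p);
                }
                return false; // nothing is encrypted in this sketch
            };
            boolean encrypted = isDestinationEncrypted(dest, strict);
            Files.delete(warehouse);
            return !encrypted;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```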





[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore

2016-05-12 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282359#comment-15282359
 ] 

Naveen Gangam commented on HIVE-13749:
--

I am still analyzing the heap, but it appears that they are all stashed away in a 
HashMap, perhaps in a thread-local. I do not have the allocation stack for these 
objects, so I cannot tell what part of the code creates these instances.

I can observe the leak just by running a simple query repeatedly via beeline, 
connecting and disconnecting on every iteration. I am not sure whether this is the same 
workload as in the environment where the heap dump was generated.

I am currently running some tests with a change that removes it from the 
thread-local at
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L811
and
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L483
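
To illustrate the kind of change being tested, here is a minimal sketch of clearing a thread-local when a session ends. The names are hypothetical and the map is only a stand-in for a Configuration-like object; it is not the HiveMetaStore code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: THREAD_CONF stands in for a per-thread Configuration-like object.
final class ThreadLocalCleanup {
    static final ThreadLocal<Map<String, String>> THREAD_CONF = new ThreadLocal<>();

    static void openSession() {
        Map<String, String> conf = new HashMap<>();
        conf.put("session", Thread.currentThread().getName());
        THREAD_CONF.set(conf);
    }

    static void closeSession() {
        // Without remove(), a pooled worker thread keeps its value reachable after the session ends.
        THREAD_CONF.remove();
    }

    static boolean hasConf() {
        return THREAD_CONF.get() != null;
    }

    public static void main(String[] args) {
        openSession();
        System.out.println(hasConf());
        closeSession();
        System.out.println(hasConf());
    }
}
```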



> Memory leak in Hive Metastore
> -
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: Top_Consumers7.html
>
>
> Looking at a 10 GB heap dump, a large number of Configuration objects (> 66k 
> instances) are being retained. These objects, along with their retained set, 
> occupy about 95% of the heap space. This leads to HMS crashes every few 
> days.
> I will attach an exported snapshot from Eclipse MAT.





[jira] [Commented] (HIVE-13728) TestHBaseSchemaTool fails on master

2016-05-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282352#comment-15282352
 ] 

Hive QA commented on HIVE-13728:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12803287/HIVE-13728.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 49 failed/errored test(s), 9916 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_interval_2.q-schema_evol_text_nonvec_mapwork_part_all_primitive.q-tez_fsstat.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby2_noskew_multi_distinct.q-vectorization_10.q-list_bucket_dml_2.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_grouping_id2.q-vectorization_13.q-auto_sortmerge_join_13.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_transform.q-union_remove_7.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_gby_empty
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby3_map
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby4_noskew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_insert_into1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join37
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_unqual2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_test_outer
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_transform_ppr1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_udf_max
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_nested_udf
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_short_regress
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorWithinDagPriority
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks
org.apache.hadoop.hive.metastore.TestHiveMetaStoreStatsMerge.testStatsMerge
org.apache.hadoop.hive.metastore.TestMetaStoreEventListenerOnlyOnCommit.testEventStatus
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAddPartitionWithValidPartVal
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithCommas
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithUnicode
org.apache.hadoop.hive.metastore.TestPartitionNameWhitelistValidation.testAppendPartitionWithValidCharacters
org.apache.hadoop.hive.metastore.TestRemoteUGIHiveMetaStoreIpAddress.testIpAddress
org.apache.hadoop.hive.ql.exec.tez.TestHostAffinitySplitLocationProvider.testOrcSplitsLocationAffinity
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testShowLocksFilterOptions
org.apache.hadoop.hive.ql.security.TestExtendedAcls.org.apache.hadoop.hive.ql.security.TestExtendedAcls
org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener.org.apache.hadoop.hive.ql.security.TestMultiAuthorizationPreEventListener
org.apache.hive.hcatalog.api.TestHCatClient.org.apache.hive.hcatalog.api.TestHCatClient
org.apache.hive.hcatalog.api.repl.commands.TestCommands.org.apache.hive.hcatalog.api.repl.commands.TestCommands
org.apache.hive.service.cli.session.TestHiveSessionImpl.testLeakOperationHandle
{noformat}

[jira] [Updated] (HIVE-13754) Fix resource leak in HiveClientCache

2016-05-12 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-13754:
---
Attachment: HIVE-13754.patch
HIVE-13754-branch-1.patch

Attached patches for branch-1 and master.

> Fix resource leak in HiveClientCache
> 
>
> Key: HIVE-13754
> URL: https://issues.apache.org/jira/browse/HIVE-13754
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: HIVE-13754-branch-1.patch, HIVE-13754.patch
>
>
> Found that the {{users}} reference count can go negative, which 
> prevents {{tearDownIfUnused}} from closing the client connection when called.
> This leads to a buildup of clients which have been evicted from the cache, 
> are no longer in use, but have not been shut down.
> GC will eventually call {{finalize}}, which forcibly closes the connection 
> and cleans up the client, but I have seen as many as several hundred open 
> client connections as a result.
> The main source of this is RetryingMetaStoreClient, which 
> calls {{reconnect}} on acquire, which in turn calls {{close}}. This decrements 
> {{users}} to -1 on the reconnect; acquire then increases it to 0 while the client is 
> in use, and release brings it back to -1.
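
The sequence described above can be replayed with a small counter model. This is a sketch under assumptions: it assumes {{tearDownIfUnused}} only acts on an exact zero, and the lenient variant is one possible guard, not necessarily what the attached patches do. The class is hypothetical and is not HiveClientCache.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: a toy model of the reference count, not the real cached client.
final class RefCountSketch {
    private final AtomicInteger users = new AtomicInteger(0);
    private boolean closed = false;

    void acquire() { users.incrementAndGet(); }
    void release() { users.decrementAndGet(); }

    /** Models the reported path: the close() triggered by reconnect also decrements. */
    void reconnect() { users.decrementAndGet(); }

    /** Assumed strict check: only an exact zero counts as unused. */
    void tearDownIfUnused() { if (users.get() == 0) closed = true; }

    /** One possible guard: treat any non-positive count as unused. */
    void tearDownIfUnusedLenient() { if (users.get() <= 0) closed = true; }

    int users() { return users.get(); }
    boolean isClosed() { return closed; }

    public static void main(String[] args) {
        RefCountSketch client = new RefCountSketch();
        client.reconnect();        // count drops to -1
        client.acquire();          // count is 0 while the client is in use
        client.release();          // count is back to -1
        client.tearDownIfUnused(); // strict check does nothing at -1
        System.out.println(client.users() + " closed=" + client.isClosed());
        client.tearDownIfUnusedLenient();
        System.out.println(client.users() + " closed=" + client.isClosed());
    }
}
```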





[jira] [Comment Edited] (HIVE-13708) Create table should verify datatypes supported by the serde

2016-05-12 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282303#comment-15282303
 ] 

Thejas M Nair edited comment on HIVE-13708 at 5/13/16 12:59 AM:


This change to CSVSerde in the current patch breaks backward compatibility for 
anyone who had a scripted create table command with a non-string column. Those 
statements would now fail.
If we consider CSVSerde in isolation, the best thing to do about it is to 
address HIVE-13709, i.e. support other types as LazySimpleSerde does. 
That would lead to correct results and also be backward compatible.

Regarding the generic change applicable to any such serde: it is a difficult 
choice between allowing logically incorrect results and backward compatibility. 
I think if we also make the changes in HIVE-13709, only users who use a custom 
serde with the same limitations (but without error checks) and who also use types 
unsupported by that serde would be affected. That set is likely to be very small. I 
would vote for making this incompatible change and fixing the logical correctness 
issue.




was (Author: thejas):
This change to CSVSerde breaks backward compatibility for anyone who had a 
scripted create table command with a non-string column. Those statements would 
now fail.
If we consider CSVSerde in isolation, the best thing to do about it is to 
address HIVE-13709, i.e. support other types as LazySimpleSerde does. 
That would lead to correct results and also be backward compatible.

Regarding the generic change applicable to any such serde: it is a difficult 
choice between allowing logically incorrect results and backward compatibility. 
I think if we also make the changes in HIVE-13709, only users who use a custom 
serde with the same limitations (but without error checks) and who also use types 
unsupported by that serde would be affected. That set is likely to be very small. I 
would vote for making this incompatible change and fixing the logical correctness 
issue.



> Create table should verify datatypes supported by the serde
> ---
>
> Key: HIVE-13708
> URL: https://issues.apache.org/jira/browse/HIVE-13708
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Thejas M Nair
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Critical
> Attachments: HIVE-13708.1.patch
>
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows the creation of a table 
> with columns of arbitrary types. But 'describe table' would still return 
> string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting the intended results.
> Ideally, create table should disallow the creation of such tables with 
> unsupported types.
> Example posted by [~Goldshuv] in HIVE- -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) 
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with 
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") 
> STORED AS TEXTFILE 
> LOCATION '' 
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this SQL:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the 
> actual result was 11.57 (as it comes first according to the byte ordering of 
> a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice  string  from deserializer
> ...
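
The effect described above, where a minimum is taken over strings instead of decimals, can be reproduced outside Hive. This is a minimal sketch with made-up values; it does not use the reporter's data or any Hive classes.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch only: the price values are hypothetical.
final class StringVsDecimalMin {
    /** Minimum by String ordering, which compares character by character. */
    static String minAsString(List<String> values) {
        return Collections.min(values);
    }

    /** Minimum by numeric value. */
    static BigDecimal minAsDecimal(List<String> values) {
        return values.stream().map(BigDecimal::new).min(BigDecimal::compareTo).get();
    }

    public static void main(String[] args) {
        List<String> prices = Arrays.asList("874.89", "1157.20", "9000.00");
        System.out.println(minAsString(prices));  // string ordering picks "1157.20"
        System.out.println(minAsDecimal(prices)); // numeric ordering picks 874.89
    }
}
```

Because '1' sorts before '8', the string comparison picks a value that is numerically larger than the true minimum.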





[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde

2016-05-12 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282303#comment-15282303
 ] 

Thejas M Nair commented on HIVE-13708:
--

This change to CSVSerde breaks backward compatibility for anyone who had a 
scripted create table command with a non-string column. Those statements would 
now fail.
If we consider CSVSerde in isolation, the best thing to do about it is to 
address HIVE-13709, i.e. support other types as LazySimpleSerde does. 
That would lead to correct results and also be backward compatible.

Regarding the generic change applicable to any such serde: it is a difficult 
choice between allowing logically incorrect results and backward compatibility. 
I think if we also make the changes in HIVE-13709, only users who use a custom 
serde with the same limitations (but without error checks) and who also use types 
unsupported by that serde would be affected. That set is likely to be very small. I 
would vote for making this incompatible change and fixing the logical correctness 
issue.








[jira] [Commented] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)

2016-05-12 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282299#comment-15282299
 ] 

Rajesh Balamohan commented on HIVE-13743:
-

[~ashutoshc] - Checked the patch on a Hadoop 2.8 cluster and it works as 
expected. I am no longer seeing this issue.






[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde

2016-05-12 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282291#comment-15282291
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-13708:
--

[~thejas] I checked whether we could do this in a generic way. As you 
mentioned, we can perform a deep check of the object inspector after 
initialize() and see whether its types match the column types in the table 
definition. My concern is whether this is backward compatible or whether it will 
break things that used to work. If we haven't enforced this rule 
previously, how can we expect custom serde developers to know from now on 
that it is an enforced rule in Hive? Also, it looked cleaner to implement 
this check in the serde itself (for example, RegexSerDe does a 
similar check in initialize()), since it seems to be the responsibility of 
the serde, not the query processor, to interpret the data correctly. Let me 
know your feedback.

Thanks
Hari
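
As an illustration of the serde-side option mentioned above, a check of this kind could look roughly like the following. This is a sketch with hypothetical names that uses plain type-name strings instead of Hive's TypeInfo classes; it is not the code from HIVE-13708.1.patch or from RegexSerDe.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: a stand-in for a check that a string-only serde could run during initialization.
final class StringOnlySerdeCheck {
    /** Throws if any declared column type is something other than string. */
    static void validate(List<String> columnNames, List<String> columnTypes) {
        if (columnNames.size() != columnTypes.size()) {
            throw new IllegalArgumentException("Column name and type counts differ");
        }
        for (int i = 0; i < columnTypes.size(); i++) {
            if (!"string".equalsIgnoreCase(columnTypes.get(i))) {
                throw new IllegalArgumentException(
                    "This serde only supports string columns, but column "
                        + columnNames.get(i) + " is declared as " + columnTypes.get(i));
            }
        }
    }

    public static void main(String[] args) {
        validate(Arrays.asList("name"), Arrays.asList("string")); // passes silently
        try {
            validate(Arrays.asList("totalprice"), Arrays.asList("decimal(38,10)"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```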






[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager

2016-05-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13753:
-
Attachment: HIVE-13753.3.patch

Makes sense. Patch 3 makes the client field final. Thanks [~vgumashta] for the 
review!

> Make metastore client thread safe in DbTxnManager
> -
>
> Key: HIVE-13753
> URL: https://issues.apache.org/jira/browse/HIVE-13753
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13753.1.patch, HIVE-13753.2.patch, 
> HIVE-13753.3.patch
>
>
> Multiple threads share the same metastore client, which is used for RPC to 
> Thrift and is not thread safe.
> The race condition shows up as an "out of sequence response" error 
> message from the Thrift server. That means the response from the Thrift server is 
> for a different request (by a different thread).
> The solution is to synchronize the methods on the client side.
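
A minimal sketch of the synchronization idea, assuming a wrapper that serializes every call to a shared client that is not thread safe. The Client interface and both classes are hypothetical stand-ins, not the metastore client API or the attached patches.

```java
// Sketch only: Client is a made-up stand-in for a client that is not thread safe.
final class SynchronizedClientSketch {
    interface Client {
        long nextId();
    }

    /** Unsynchronized read-modify-write; concurrent callers can lose updates. */
    static final class UnsafeClient implements Client {
        private long counter = 0;
        @Override public long nextId() { return ++counter; }
    }

    /** Wrapper that serializes every call on one monitor. */
    static final class SynchronizedClient implements Client {
        private final Client delegate; // final: the wrapped client cannot be swapped later
        SynchronizedClient(Client delegate) { this.delegate = delegate; }
        @Override public synchronized long nextId() { return delegate.nextId(); }
    }

    /** Calls the client from several threads and returns how many ids the workers were handed. */
    static long hammer(Client client, int threads, int callsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int c = 0; c < callsPerThread; c++) {
                    client.nextId();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return client.nextId() - 1;
    }

    public static void main(String[] args) {
        Client safe = new SynchronizedClient(new UnsafeClient());
        System.out.println(hammer(safe, 4, 10_000)); // every call is counted: 40000
    }
}
```

Passing the UnsafeClient directly to hammer may report fewer ids than calls were made, which is the toy equivalent of one thread's response being mixed up with another's.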





[jira] [Commented] (HIVE-13753) Make metastore client thread safe in DbTxnManager

2016-05-12 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282277#comment-15282277
 ] 

Vaibhav Gumashta commented on HIVE-13753:
-

+1 pending tests. I would probably make the IMetaStoreClient member within 
SynchronizedMetaStoreClient final too.






[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde

2016-05-12 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282269#comment-15282269
 ] 

Thejas M Nair commented on HIVE-13708:
--

[~hsubramaniyan] Can we do this in a generic way so that it is applicable to 
any serde that doesn't support the types being specified in create table?
There are possibly other user-created serdes that could also have this issue.
I haven't looked deeper into the object inspectors. How about checking the 
object inspector after initialize? It seems like the types it will return can 
be determined from that. cc [~ashutoshc]

I didn't mean this to be specifically about CSVSerde, but more generally about 
the Hive-serde interaction. The specific change to that serde that I would like 
to see is in HIVE-13709.
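
To make the generic idea concrete, here is a rough sketch of comparing the declared column types with the types a serde reports after initialization. Plain maps of type-name strings stand in for the table definition and the object inspector; all names are hypothetical and no Hive classes are used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: maps of column name to type name stand in for Hive's metadata objects.
final class DeclaredVsReportedTypes {
    /** Returns one message per column whose declared type differs from what the serde reports. */
    static List<String> mismatches(Map<String, String> declared, Map<String, String> reported) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> column : declared.entrySet()) {
            String actual = reported.get(column.getKey());
            if (!column.getValue().equalsIgnoreCase(actual)) {
                problems.add(column.getKey() + ": declared " + column.getValue()
                    + ", serde reports " + actual);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> declared = new LinkedHashMap<>();
        declared.put("totalprice", "decimal(38,10)");
        Map<String, String> reported = new LinkedHashMap<>();
        reported.put("totalprice", "string");
        System.out.println(mismatches(declared, reported));
    }
}
```

A create-table path could reject the statement whenever this list is not empty, which is also where the backward-compatibility question raised in this thread would come in.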







[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde

2016-05-12 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282266#comment-15282266
 ] 

Ashutosh Chauhan commented on HIVE-13708:
-

The patch addresses a different issue than the one in the description. Would you like to 
update the description of the JIRA to reflect your patch?

> Create table should verify datatypes supported by the serde
> ---
>
> Key: HIVE-13708
> URL: https://issues.apache.org/jira/browse/HIVE-13708
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Thejas M Nair
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Critical
> Attachments: HIVE-13708.1.patch
>
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows for creation of a table 
> with columns of arbitrary types. But 'describe table' would still return 
> string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting the intended results.
> The create table statement should ideally disallow the creation of such tables 
> with unsupported types.
> Example posted by [~Goldshuv] in HIVE- -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) 
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with 
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") 
> STORED AS TEXTFILE 
> LOCATION '' 
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this sql:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the 
> actual result was 11.57 (as it comes first in the byte ordering of 
> a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice  string  from deserializer
> ...
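
The wrong min() result described in the issue comes from values being compared as strings rather than as decimals. A minimal Java sketch of that difference follows; the sample values are made up for illustration and are not the reporter's data.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only: the sample values are hypothetical, not taken from the JIRA.
class StringVsDecimalMin {

    // Minimum under String.compareTo ordering (lexicographic, character by character),
    // which is how the column behaves once it is treated as a string.
    static String stringMin(List<String> values) {
        return Collections.min(values);
    }

    // Minimum after parsing each value as a decimal, which is what a DECIMAL column implies.
    static BigDecimal decimalMin(List<String> values) {
        return values.stream().map(BigDecimal::new).min(BigDecimal::compareTo).get();
    }

    public static void main(String[] args) {
        List<String> prices = Arrays.asList("874.89", "1157.20", "90000.00");
        System.out.println("string min:  " + stringMin(prices));
        System.out.println("decimal min: " + decimalMin(prices));
    }
}
```

With these sample values the two minimums differ because '1' sorts before '8' as a character, even though 1157.20 is numerically larger than 874.89.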





[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager

2016-05-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13753:
-
Attachment: HIVE-13753.2.patch

Patch 2.

> Make metastore client thread safe in DbTxnManager
> -
>
> Key: HIVE-13753
> URL: https://issues.apache.org/jira/browse/HIVE-13753
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13753.1.patch, HIVE-13753.2.patch
>
>
> Sharing one metastore client, which is used for RPC to Thrift, across 
> multiple threads is not thread safe.
> The race condition shows up as an "out of sequence response" error 
> message from the Thrift server, which means the response from the Thrift 
> server is for a different request (by a different thread).
> The solution will be to synchronize methods on the client side.
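
The proposed fix, synchronizing the client's methods, can be sketched as a wrapper in which every method takes the same lock, so only one request/response exchange is in flight on the shared client at a time. The MetaClient interface and its two methods below are hypothetical stand-ins, not Hive's IMetaStoreClient API, and this is not the code from the attached patch.

```java
// Hypothetical stand-in for a client that is not thread safe (not Hive's IMetaStoreClient).
interface MetaClient {
    long openTxn(String user);
    void commitTxn(long txnId);
}

// Wrapper that serializes all calls on one monitor and never hands out the delegate.
final class SynchronizedMetaClient implements MetaClient {
    private final MetaClient delegate;

    SynchronizedMetaClient(MetaClient delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized long openTxn(String user) {
        return delegate.openTxn(user);
    }

    @Override
    public synchronized void commitTxn(long txnId) {
        delegate.commitTxn(txnId);
    }
}
```

Keeping the delegate private matters in this sketch: any caller that obtained the raw client could bypass the lock and reintroduce the interleaving.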





[jira] [Commented] (HIVE-13753) Make metastore client thread safe in DbTxnManager

2016-05-12 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282263#comment-15282263
 ] 

Wei Zheng commented on HIVE-13753:
--

Right, I should have removed that getter. Thanks for catching that.

> Make metastore client thread safe in DbTxnManager
> -
>
> Key: HIVE-13753
> URL: https://issues.apache.org/jira/browse/HIVE-13753
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13753.1.patch
>
>
> Sharing one metastore client, which is used for RPC to Thrift, across 
> multiple threads is not thread safe.
> The race condition shows up as an "out of sequence response" error 
> message from the Thrift server, which means the response from the Thrift 
> server is for a different request (by a different thread).
> The solution will be to synchronize methods on the client side.





[jira] [Commented] (HIVE-13753) Make metastore client thread safe in DbTxnManager

2016-05-12 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282262#comment-15282262
 ] 

Vaibhav Gumashta commented on HIVE-13753:
-

[~wzheng] Is there a need to expose the underlying IMetaStoreClient object via 
SynchronizedMetaStoreClient? 

> Make metastore client thread safe in DbTxnManager
> -
>
> Key: HIVE-13753
> URL: https://issues.apache.org/jira/browse/HIVE-13753
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13753.1.patch
>
>
> Sharing one metastore client, which is used for RPC to Thrift, across 
> multiple threads is not thread safe.
> The race condition shows up as an "out of sequence response" error 
> message from the Thrift server, which means the response from the Thrift 
> server is for a different request (by a different thread).
> The solution will be to synchronize methods on the client side.





[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager

2016-05-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13753:
-
Attachment: HIVE-13753.1.patch

The previous patch had the wrong name. Corrected it.

> Make metastore client thread safe in DbTxnManager
> -
>
> Key: HIVE-13753
> URL: https://issues.apache.org/jira/browse/HIVE-13753
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13753.1.patch
>
>
> Sharing one metastore client, which is used for RPC to Thrift, across 
> multiple threads is not thread safe.
> The race condition shows up as an "out of sequence response" error 
> message from the Thrift server, which means the response from the Thrift 
> server is for a different request (by a different thread).
> The solution will be to synchronize methods on the client side.





[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager

2016-05-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13753:
-
Attachment: (was: HIVE-13725.1.patch)

> Make metastore client thread safe in DbTxnManager
> -
>
> Key: HIVE-13753
> URL: https://issues.apache.org/jira/browse/HIVE-13753
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>
> Sharing one metastore client, which is used for RPC to Thrift, across 
> multiple threads is not thread safe.
> The race condition shows up as an "out of sequence response" error 
> message from the Thrift server, which means the response from the Thrift 
> server is for a different request (by a different thread).
> The solution will be to synchronize methods on the client side.





[jira] [Updated] (HIVE-13708) Create table should verify datatypes supported by the serde

2016-05-12 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13708:
-
Status: Patch Available  (was: Open)

> Create table should verify datatypes supported by the serde
> ---
>
> Key: HIVE-13708
> URL: https://issues.apache.org/jira/browse/HIVE-13708
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Thejas M Nair
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Critical
> Attachments: HIVE-13708.1.patch
>
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows for creation of a table 
> with columns of arbitrary types. But 'describe table' would still return 
> string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting the intended results.
> The create table statement should ideally disallow the creation of such tables 
> with unsupported types.
> Example posted by [~Goldshuv] in HIVE- -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) 
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with 
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") 
> STORED AS TEXTFILE 
> LOCATION '' 
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this sql:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the 
> actual result was 11.57 (as it comes first in the byte ordering of 
> a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice  string  from deserializer
> ...





[jira] [Commented] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size

2016-05-12 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282246#comment-15282246
 ] 

Prasanth Jayachandran commented on HIVE-13751:
--

[~jdere] I have tested this patch locally and this conf seems to work fine. 
However, I am seeing a different set of errors, possibly caused by other issues. 
I will create bugs for them later. Can you please review this patch?

> LlapOutputFormatService should have a configurable send buffer size
> ---
>
> Key: HIVE-13751
> URL: https://issues.apache.org/jira/browse/HIVE-13751
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13751.1.patch
>
>
> The Netty channel buffer size is currently hard-coded to 128KB. It should be 
> made configurable.
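
Making the size configurable usually means reading it from configuration and falling back to the current hard-coded value. A minimal sketch follows, assuming a hypothetical property key (example.llap.send.buffer.size is not a real Hive setting) and using java.util.Properties in place of HiveConf; it is not the code in the attached patch.

```java
import java.util.Properties;

// Sketch only: the property key is hypothetical, and Properties stands in for HiveConf.
class SendBufferConfig {
    static final String KEY = "example.llap.send.buffer.size"; // hypothetical name
    static final int DEFAULT_BYTES = 128 * 1024;               // the 128KB value from the description

    // Returns the configured size in bytes, or the default when the key is unset.
    static int sendBufferSize(Properties conf) {
        String raw = conf.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_BYTES;
        }
        int size = Integer.parseInt(raw.trim());
        if (size <= 0) {
            throw new IllegalArgumentException(KEY + " must be positive, got " + size);
        }
        return size;
    }
}
```

Keeping 128KB as the default means existing deployments would behave the same unless the key is set.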





[jira] [Updated] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size

2016-05-12 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13751:
-
Status: Patch Available  (was: Open)

> LlapOutputFormatService should have a configurable send buffer size
> ---
>
> Key: HIVE-13751
> URL: https://issues.apache.org/jira/browse/HIVE-13751
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13751.1.patch
>
>
> The Netty channel buffer size is currently hard-coded to 128KB. It should be 
> made configurable.





[jira] [Updated] (HIVE-13708) Create table should verify datatypes supported by the serde

2016-05-12 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-13708:
-
Attachment: HIVE-13708.1.patch

cc [~ashutoshc] for review.

> Create table should verify datatypes supported by the serde
> ---
>
> Key: HIVE-13708
> URL: https://issues.apache.org/jira/browse/HIVE-13708
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Thejas M Nair
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Critical
> Attachments: HIVE-13708.1.patch
>
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows for creation of a table 
> with columns of arbitrary types. But 'describe table' would still return 
> string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting the intended results.
> The create table statement should ideally disallow the creation of such tables 
> with unsupported types.
> Example posted by [~Goldshuv] in HIVE- -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) 
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with 
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") 
> STORED AS TEXTFILE 
> LOCATION '' 
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this sql:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the 
> actual result was 11.57 (as it comes first in the byte ordering of 
> a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice  string  from deserializer
> ...





[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager

2016-05-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13753:
-
Attachment: HIVE-13725.1.patch

Uploaded patch 1. [~ekoifman], can you please review?

> Make metastore client thread safe in DbTxnManager
> -
>
> Key: HIVE-13753
> URL: https://issues.apache.org/jira/browse/HIVE-13753
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13725.1.patch
>
>
> Sharing one metastore client, which is used for RPC to Thrift, across 
> multiple threads is not thread safe.
> The race condition shows up as an "out of sequence response" error 
> message from the Thrift server, which means the response from the Thrift 
> server is for a different request (by a different thread).
> The solution will be to synchronize methods on the client side.





[jira] [Updated] (HIVE-13753) Make metastore client thread safe in DbTxnManager

2016-05-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13753:
-
Status: Patch Available  (was: Open)

> Make metastore client thread safe in DbTxnManager
> -
>
> Key: HIVE-13753
> URL: https://issues.apache.org/jira/browse/HIVE-13753
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13725.1.patch
>
>
> Sharing one metastore client, which is used for RPC to Thrift, across 
> multiple threads is not thread safe.
> The race condition shows up as an "out of sequence response" error 
> message from the Thrift server, which means the response from the Thrift 
> server is for a different request (by a different thread).
> The solution will be to synchronize methods on the client side.





[jira] [Commented] (HIVE-11550) ACID queries pollute HiveConf

2016-05-12 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282233#comment-15282233
 ] 

Alan Gates commented on HIVE-11550:
---

+1

> ACID queries pollute HiveConf
> -
>
> Key: HIVE-11550
> URL: https://issues.apache.org/jira/browse/HIVE-11550
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11550.1.patch, HIVE-11550.patch
>
>
> HiveConf is a SessionState-level object.  Some ACID-related logic makes 
> changes to it that are meant to be per query but end up applying per SessionState.
> See SemanticAnalyzer.checkAcidConstraints()
> Also note   HiveConf.setVar(conf, 
> HiveConf.ConfVars.DYNAMICPARTITIONINGMODE, "nonstrict");
> in UpdateDeleteSemanticAnalyzer
> [~alangates], do you know of other cases or ideas on how to deal with this 
> differently?
> _SortedDynPartitionOptimizer.process()_ is the place to have the logic to do 
> _conf.setBoolVar(ConfVars.HIVEOPTSORTDYNAMICPARTITION, false);_ on a per-query 
> basis
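
The underlying problem is a per-query setting being written into a session-level object. One possible way to avoid it, not necessarily what the attached patches do, is to copy the session configuration for each query and mutate only the copy. In this sketch java.util.Properties stands in for HiveConf, and the key name is a placeholder rather than a real Hive property.

```java
import java.util.Properties;

// Sketch only: Properties stands in for HiveConf; "example.dynamic.partition.mode" is a placeholder key.
class PerQueryConf {

    // Returns an independent copy so that query-level overrides never reach the session.
    static Properties forQuery(Properties sessionConf) {
        Properties queryConf = new Properties();
        queryConf.putAll(sessionConf);
        return queryConf;
    }

    public static void main(String[] args) {
        Properties session = new Properties();
        session.setProperty("example.dynamic.partition.mode", "strict");

        Properties query = forQuery(session);
        query.setProperty("example.dynamic.partition.mode", "nonstrict"); // per-query override

        // The session-level value is not affected by the override above.
        System.out.println(session.getProperty("example.dynamic.partition.mode"));
    }
}
```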





[jira] [Commented] (HIVE-13708) Create table should verify datatypes supported by the serde

2016-05-12 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282227#comment-15282227
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-13708:
--

A couple of points:
1.  The original document says:
CREATE TABLE my_table(a string, b string, ...)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'

I believe this strictly supports string columns and nothing else (not even 
variants like varchar).
Please correct me if this is wrong.

2.  The description of this JIRA says:
CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) 
ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with

There is not much we can do in Hive for 'com.bizo.hive.serde.csv.CSVSerde'. I 
will upload a patch that fixes this for 
'org.apache.hadoop.hive.serde2.OpenCSVSerde'.

Thanks
Hari
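
The check proposed above, rejecting non-string columns for a serde that only handles strings, might look roughly like the following. The class and method names are hypothetical, type names are compared as plain strings for illustration, and this is not the code in the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch only: hypothetical helper, not part of Hive.
class StringOnlyColumnCheck {

    // Returns the names of columns whose declared type is not exactly "string" (case-insensitive).
    static List<String> unsupportedColumns(Map<String, String> columnTypes) {
        List<String> bad = new ArrayList<>();
        for (Map.Entry<String, String> e : columnTypes.entrySet()) {
            if (!"string".equalsIgnoreCase(e.getValue().trim())) {
                bad.add(e.getKey());
            }
        }
        return bad;
    }

    // Throws if any column uses a type other than string.
    static void validate(Map<String, String> columnTypes) {
        List<String> bad = unsupportedColumns(columnTypes);
        if (!bad.isEmpty()) {
            throw new IllegalArgumentException(
                "Only string columns are supported; offending columns: " + bad);
        }
    }
}
```

Under this sketch, a table declared with totalprice DECIMAL(38,10) would be rejected at creation time instead of silently reading the column back as a string.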



> Create table should verify datatypes supported by the serde
> ---
>
> Key: HIVE-13708
> URL: https://issues.apache.org/jira/browse/HIVE-13708
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Thejas M Nair
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Critical
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows for creation of a table 
> with columns of arbitrary types. But 'describe table' would still return 
> string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting the intended results.
> The create table statement should ideally disallow the creation of such tables 
> with unsupported types.
> Example posted by [~Goldshuv] in HIVE- -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) 
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with 
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") 
> STORED AS TEXTFILE 
> LOCATION '' 
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this sql:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the 
> actual result was 11.57 (as it comes first in the byte ordering of 
> a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice  string  from deserializer
> ...





[jira] [Commented] (HIVE-13742) Hive ptest has many failures due to metastore connection refused

2016-05-12 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282220#comment-15282220
 ] 

Szehon Ho commented on HIVE-13742:
--

There are potentially several tests running concurrently on the same Ptest 
slave. I haven't taken a close look and am not sure whether that can cause 
corruption, but it's just a thought.

> Hive ptest has many failures due to metastore connection refused
> 
>
> Key: HIVE-13742
> URL: https://issues.apache.org/jira/browse/HIVE-13742
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergio Peña
> Attachments: hive.log
>
>
> The following exception is thrown on the Hive ptest with many tests, and it 
> is due to some Derby database issues:
> {noformat}
> 2016-05-11T15:46:25,123 INFO  [Thread-2[]]: metastore.HiveMetaStore 
> (HiveMetaStore.java:newRawStore(563)) - 0: Opening raw store with 
> implementation class:org.apache.hadoop.hive.metastore.ObjectStore
> 2016-05-11T15:46:25,175 INFO  [Thread-2[]]: metastore.ObjectStore 
> (ObjectStore.java:initialize(324)) - ObjectStore, initialize called
> 2016-05-11T15:46:25,966 DEBUG [Thread-2[]]: bonecp.BoneCPDataSource 
> (BoneCPDataSource.java:getConnection(119)) - JDBC URL = 
> jdbc:derby:;databaseName=/home/hiveptest/54.177.132.113-hiveptest-1/apache-github-source-source/itests/hive-unit/target/tmpTestFilterHooksmetastore_db;create=true,
>  Username = APP, partitions = 1, max (per partition) = 10, min (per 
> partition) = 0, idle max age = 60 min, idle test period = 240 min, strategy = 
> DEFAULT
> 2016-05-11T15:46:26,003 ERROR [Thread-2[]]: Datastore.Schema 
> (Log4JLogger.java:error(125)) - Failed initialising database.
> org.datanucleus.exceptions.NucleusDataStoreException: Unable to open a test 
> connection to the given database. JDBC url = 
> jdbc:derby:;databaseName=/home/hiveptest/54.177.132.113-hiveptest-1/apache-github-source-source/itests/hive-unit/target/tmpTestFilterHooksmetastore_db;create=true,
>  username = APP. Terminating connection pool (set lazyInit to true if you 
> expect to start your database after your app). Original Exception: --
> java.sql.SQLException: Failed to create database 
> '/home/hiveptest/54.177.132.113-hiveptest-1/apache-github-source-source/itests/hive-unit/target/tmpTestFilterHooksmetastore_db',
>  see the next exception for details.
>   at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
>   at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection40.<init>(Unknown Source)
>   at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
>   at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
>   at org.apache.derby.jdbc.Driver20.connect(Unknown Source)
>   at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:208)
>   at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
>   at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:416)
>   at 
> com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:120)
>   at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:483)
>   at 
> org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:296)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
>   at 
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
>   at 
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
>   at 
> org.datanucleus.NucleusContextHelper.createStoreManagerForProperties(NucleusContextHelper.java:133)
>   at 
> org.datanucleus.PersistenceNucleusContextImpl.initialise(PersistenceNucleusContextImpl.java:420)
>   at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:821)
>   at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:338)
>   at 
> 

[jira] [Assigned] (HIVE-13708) Create table should verify datatypes supported by the serde

2016-05-12 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan reassigned HIVE-13708:


Assignee: Hari Sankar Sivarama Subramaniyan

> Create table should verify datatypes supported by the serde
> ---
>
> Key: HIVE-13708
> URL: https://issues.apache.org/jira/browse/HIVE-13708
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Thejas M Nair
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Critical
>
> As [~Goldshuv] mentioned in HIVE-.
> Create table with a serde such as OpenCSVSerde allows for creation of a table 
> with columns of arbitrary types. But 'describe table' would still return 
> string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting the intended results.
> The create table statement should ideally disallow the creation of such tables 
> with unsupported types.
> Example posted by [~Goldshuv] in HIVE- -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) 
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with 
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") 
> STORED AS TEXTFILE 
> LOCATION '' 
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this sql:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the 
> actual result was 11.57 (as it comes first in the byte ordering of 
> a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice  string  from deserializer
> ...





[jira] [Assigned] (HIVE-13562) Enable vector bridge for all non-vectorized udfs

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-13562:
---

Assignee: Matt McCline

> Enable vector bridge for all non-vectorized udfs
> 
>
> Key: HIVE-13562
> URL: https://issues.apache.org/jira/browse/HIVE-13562
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Ashutosh Chauhan
>Assignee: Matt McCline
>
> A mechanism already exists for this via {{VectorUDFAdaptor}}, but we have 
> arbitrarily hand-picked a few UDFs to go through it. I think we should enable 
> this by default for all UDFs.





[jira] [Resolved] (HIVE-13752) Mini HDFS Cluster fails to start on trunk

2016-05-12 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou resolved HIVE-13752.
--
Resolution: Invalid

Resolved this because it was filed in the wrong place due to JIRA maintenance.

> Mini HDFS Cluster fails to start on trunk
> -
>
> Key: HIVE-13752
> URL: https://issues.apache.org/jira/browse/HIVE-13752
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaobing Zhou
>
> It's been noticed that Mini HDFS Cluster fails to start on trunk, blocking 
> unit tests and Jenkins.





[jira] [Updated] (HIVE-13682) EOFException with fast hashtable

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13682:

Attachment: HIVE-13682.01.patch

> EOFException with fast hashtable
> 
>
> Key: HIVE-13682
> URL: https://issues.apache.org/jira/browse/HIVE-13682
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Matt McCline
> Attachments: HIVE-13682.01.patch
>
>
> While testing something else on recent master, w/Tez 0.8.3, this happened 
> (TPCDS q27)
> {noformat}
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399)
>   ... 20 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131)
>   ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104)
>   ... 5 more
> Caused by: java.io.EOFException
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98)
>   ... 9 more
> {noformat}
> There's no error if fast hashtable is disabled.





[jira] [Updated] (HIVE-13682) EOFException with fast hashtable

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13682:

Attachment: (was: HIVE-13682.01.patch)

> EOFException with fast hashtable
> 
>
> Key: HIVE-13682
> URL: https://issues.apache.org/jira/browse/HIVE-13682
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Matt McCline
>
> While testing something else on recent master, w/Tez 0.8.3, this happened 
> (TPCDS q27)
> {noformat}
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399)
>   ... 20 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131)
>   ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104)
>   ... 5 more
> Caused by: java.io.EOFException
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98)
>   ... 9 more
> {noformat}
> There's no error if fast hashtable is disabled.





[jira] [Commented] (HIVE-13752) Mini HDFS Cluster fails to start on trunk

2016-05-12 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282113#comment-15282113
 ] 

Xiaobing Zhou commented on HIVE-13752:
--

Here's the exception:
{noformat}
Running org.apache.hadoop.hdfs.TestAsyncDFSRename
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 15.756 sec <<< 
FAILURE! - in org.apache.hadoop.hdfs.TestAsyncDFSRename
testAsyncRenameWithOverwrite(org.apache.hadoop.hdfs.TestAsyncDFSRename)  Time 
elapsed: 15.58 sec  <<< ERROR!
java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
at 
org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1345)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:848)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:482)
at 
org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:441)
at 
org.apache.hadoop.hdfs.TestAsyncDFSRename.testAsyncRenameWithOverwrite(TestAsyncDFSRename.java:69)
{noformat}

> Mini HDFS Cluster fails to start on trunk
> -
>
> Key: HIVE-13752
> URL: https://issues.apache.org/jira/browse/HIVE-13752
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaobing Zhou
>
> It's been noticed that Mini HDFS Cluster fails to start on trunk, blocking 
> unit tests and Jenkins.





[jira] [Commented] (HIVE-13727) Getting error Failed rule: 'orderByClause clusterByClause distributeByClause sortByClause limitClause can only be applied to the whole union.' in subquery

2016-05-12 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282087#comment-15282087
 ] 

Ashutosh Chauhan commented on HIVE-13727:
-

[~prongs] Did this query work prior to the HIVE-9039 commit?

> Getting error Failed rule: 'orderByClause clusterByClause distributeByClause 
> sortByClause limitClause can only be applied to the whole union.' in subquery 
> ---
>
> Key: HIVE-13727
> URL: https://issues.apache.org/jira/browse/HIVE-13727
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajat Khandelwal
>
> The error comes in the following query:
> {noformat}
> SELECT *
> FROM
>   (SELECT *
>FROM srcpart a
>WHERE a.ds = '2008-04-08'
>  AND a.hr = '11'
>ORDER BY a.key LIMIT 5
>UNION ALL
>SELECT *
>FROM srcpart b
>WHERE b.ds = '2008-04-08'
>  AND b.hr = '14'
>ORDER BY b.key LIMIT 5) subq
> ORDER BY KEY LIMIT 5
> {noformat}
> But the following query works:
> {noformat}
> SELECT *
> FROM
>   (SELECT *
>FROM
>  (SELECT *
>   FROM srcpart a
>   WHERE a.ds = '2008-04-08'
> AND a.hr = '11'
>   ORDER BY a.key LIMIT 5) pa
>UNION ALL SELECT *
>FROM
>  (SELECT *
>   FROM srcpart b
>   WHERE b.ds = '2008-04-08'
> AND b.hr = '14'
>   ORDER BY b.key LIMIT 5) pb) subq
> ORDER BY KEY LIMIT 5
> {noformat}
> The queries are logically identical; the query that is accepted just has dummy 
> select * clauses around the sub-queries. 





[jira] [Commented] (HIVE-11947) mssql upgrade scripts contains invalid character

2016-05-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282060#comment-15282060
 ] 

Pengcheng Xiong commented on HIVE-11947:


+1

> mssql upgrade scripts contains invalid character
> 
>
> Key: HIVE-11947
> URL: https://issues.apache.org/jira/browse/HIVE-11947
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.2.0, 1.1.0
>Reporter: Huan Huang
>Assignee: Huan Huang
> Attachments: HIVE-11947.patch
>
>
> Upgrade scripts don't execute as a result.





[jira] [Commented] (HIVE-13608) We should provide better error message while constraints with duplicate names are created

2016-05-12 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282045#comment-15282045
 ] 

Ashutosh Chauhan commented on HIVE-13608:
-

+1 pending tests.

> We should provide better error message while constraints with duplicate names 
> are created
> -
>
> Key: HIVE-13608
> URL: https://issues.apache.org/jira/browse/HIVE-13608
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability, Metastore
>Affects Versions: 2.0.0
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13608.1.patch, HIVE-13608.2.patch, 
> HIVE-13608.3.patch
>
>
> {code}
> PREHOOK: query: create table t1(x int, constraint pk1 primary key (x) disable 
> novalidate)
> PREHOOK: type: CREATETABLE
> PREHOOK: Output: database:default
> PREHOOK: Output: default@t1
> POSTHOOK: query: create table t1(x int, constraint pk1 primary key (x) 
> disable novalidate)
> POSTHOOK: type: CREATETABLE
> POSTHOOK: Output: database:default
> POSTHOOK: Output: default@t1
> PREHOOK: query: create table t2(x int, constraint pk1 primary key (x) disable 
> novalidate)
> PREHOOK: type: CREATETABLE
> PREHOOK: Output: database:default
> PREHOOK: Output: default@t2
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct 
> MetaStore DB connections, we don't support retries at the client level.)
> {code}
> In the above case, it seems that the useful error message is lost. It looks like 
> a generic problem with metastore server/client exception handling and 
> message propagation. The exception-parsing logic of 
> RetryingMetaStoreClient::invoke() probably needs to be updated.
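
To illustrate the kind of message propagation described above, here is a minimal sketch that walks an exception's cause chain and keeps the deepest non-empty message, so that a generic wrapper does not hide the original error. This is not the actual RetryingMetaStoreClient logic; the class name, the method name and the sample messages are made up for the example.

```java
import java.lang.reflect.InvocationTargetException;

public class RootCauseMessage {
    /** Walks the cause chain and returns the deepest non-empty message, or null if there is none. */
    public static String deepestMessage(Throwable t) {
        String message = null;
        int depth = 0;
        // Bound the walk so that a cyclic cause chain cannot loop forever.
        for (Throwable cur = t; cur != null && depth < 64; cur = cur.getCause(), depth++) {
            String m = cur.getMessage();
            if (m != null && !m.isEmpty()) {
                message = m;
            }
        }
        return message;
    }

    public static void main(String[] args) {
        // Hypothetical messages, chosen only to show the unwrapping.
        Exception root = new IllegalStateException("constraint name already exists");
        Exception wrapped = new RuntimeException("generic client-side failure",
            new InvocationTargetException(root));
        System.out.println(deepestMessage(wrapped));
    }
}
```

Whether the deepest message is always the most useful one is a judgment call; a real fix might instead prefer the first exception of a known type.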





[jira] [Updated] (HIVE-13608) We should provide better error message while constraints with duplicate names are created

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13608:

Affects Version/s: 2.0.0

> We should provide better error message while constraints with duplicate names 
> are created
> -
>
> Key: HIVE-13608
> URL: https://issues.apache.org/jira/browse/HIVE-13608
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability, Metastore
>Affects Versions: 2.0.0
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13608.1.patch, HIVE-13608.2.patch, 
> HIVE-13608.3.patch
>
>
> {code}
> PREHOOK: query: create table t1(x int, constraint pk1 primary key (x) disable 
> novalidate)
> PREHOOK: type: CREATETABLE
> PREHOOK: Output: database:default
> PREHOOK: Output: default@t1
> POSTHOOK: query: create table t1(x int, constraint pk1 primary key (x) 
> disable novalidate)
> POSTHOOK: type: CREATETABLE
> POSTHOOK: Output: database:default
> POSTHOOK: Output: default@t1
> PREHOOK: query: create table t2(x int, constraint pk1 primary key (x) disable 
> novalidate)
> PREHOOK: type: CREATETABLE
> PREHOOK: Output: database:default
> PREHOOK: Output: default@t2
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct 
> MetaStore DB connections, we don't support retries at the client level.)
> {code}
> In the above case, it seems that the useful error message is lost. It looks like 
> a generic problem with metastore server/client exception handling and 
> message propagation. The exception-parsing logic of 
> RetryingMetaStoreClient::invoke() probably needs to be updated.





[jira] [Updated] (HIVE-13608) We should provide better error message while constraints with duplicate names are created

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13608:

Target Version/s: 2.1.0

> We should provide better error message while constraints with duplicate names 
> are created
> -
>
> Key: HIVE-13608
> URL: https://issues.apache.org/jira/browse/HIVE-13608
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability, Metastore
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13608.1.patch, HIVE-13608.2.patch, 
> HIVE-13608.3.patch
>
>
> {code}
> PREHOOK: query: create table t1(x int, constraint pk1 primary key (x) disable 
> novalidate)
> PREHOOK: type: CREATETABLE
> PREHOOK: Output: database:default
> PREHOOK: Output: default@t1
> POSTHOOK: query: create table t1(x int, constraint pk1 primary key (x) 
> disable novalidate)
> POSTHOOK: type: CREATETABLE
> POSTHOOK: Output: database:default
> POSTHOOK: Output: default@t1
> PREHOOK: query: create table t2(x int, constraint pk1 primary key (x) disable 
> novalidate)
> PREHOOK: type: CREATETABLE
> PREHOOK: Output: database:default
> PREHOOK: Output: default@t2
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct 
> MetaStore DB connections, we don't support retries at the client level.)
> {code}
> In the above case, it seems that the useful error message is lost. It looks like 
> a generic problem with metastore server/client exception handling and 
> message propagation. The exception-parsing logic of 
> RetryingMetaStoreClient::invoke() probably needs to be updated.





[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13269:

Target Version/s: 2.1.0

> Simplify comparison expressions using column stats
> --
>
> Key: HIVE-13269
> URL: https://issues.apache.org/jira/browse/HIVE-13269
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, 
> HIVE-13269.patch, HIVE-13269.patch
>
>






[jira] [Updated] (HIVE-13068) Disable Hive ConstantPropagate optimizer when CBO has optimized the plan II

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13068:

Target Version/s: 2.1.0

> Disable Hive ConstantPropagate optimizer when CBO has optimized the plan II
> ---
>
> Key: HIVE-13068
> URL: https://issues.apache.org/jira/browse/HIVE-13068
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13068.01.patch, HIVE-13068.01.patch, 
> HIVE-13068.02.patch, HIVE-13068.03.patch, HIVE-13068.patch
>
>
> After HIVE-12543 went in, we need follow-up work to disable the last call to 
> ConstantPropagate in Hive. This probably implies work on extending the 
> constant folding logic in Calcite.





[jira] [Updated] (HIVE-13608) We should provide better error message while constraints with duplicate names are created

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13608:

Component/s: Metastore
 Diagnosability

> We should provide better error message while constraints with duplicate names 
> are created
> -
>
> Key: HIVE-13608
> URL: https://issues.apache.org/jira/browse/HIVE-13608
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability, Metastore
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-13608.1.patch, HIVE-13608.2.patch, 
> HIVE-13608.3.patch
>
>
> {code}
> PREHOOK: query: create table t1(x int, constraint pk1 primary key (x) disable 
> novalidate)
> PREHOOK: type: CREATETABLE
> PREHOOK: Output: database:default
> PREHOOK: Output: default@t1
> POSTHOOK: query: create table t1(x int, constraint pk1 primary key (x) 
> disable novalidate)
> POSTHOOK: type: CREATETABLE
> POSTHOOK: Output: database:default
> POSTHOOK: Output: default@t1
> PREHOOK: query: create table t2(x int, constraint pk1 primary key (x) disable 
> novalidate)
> PREHOOK: type: CREATETABLE
> PREHOOK: Output: database:default
> PREHOOK: Output: default@t2
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct 
> MetaStore DB connections, we don't support retries at the client level.)
> {code}
> In the above case, it seems that the useful error message is lost. It looks like 
> a generic problem with metastore server/client exception handling and 
> message propagation. The exception-parsing logic of 
> RetryingMetaStoreClient::invoke() probably needs to be updated.





[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13750:

Target Version/s: 2.1.0

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if the config is on, the Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If the sort columns of the previous 
> shuffle and the partitioning columns of the table match, the reduce sink 
> deduplication optimizer removes the extra shuffle stage, bringing the overhead 
> down to zero. However, if they don’t match, we end up doing an extra shuffle. 
> This can be improved, since we can add the table partition columns as sort 
> columns on the earlier shuffle and avoid this extra shuffle. This ensures that 
> when the query already has a shuffle stage, we are not shuffling the data again. 
> {quote}





[jira] [Updated] (HIVE-13602) TPCH q16 return wrong result when CBO is on

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13602:

Affects Version/s: (was: 1.3.0)
 Target Version/s: 2.1.0

> TPCH q16 return wrong result when CBO is on
> ---
>
> Key: HIVE-13602
> URL: https://issues.apache.org/jira/browse/HIVE-13602
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.0.0, 1.2.2
>Reporter: Nemon Lou
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13602.01.patch, HIVE-13602.03.patch, 
> HIVE-13602.04.patch, HIVE-13602.05.patch, calcite_cbo_bad.out, 
> calcite_cbo_good.out, explain_cbo_bad_part1.out, explain_cbo_bad_part2.out, 
> explain_cbo_bad_part3.out, explain_cbo_good(rewrite)_part1.out, 
> explain_cbo_good(rewrite)_part2.out, explain_cbo_good(rewrite)_part3.out
>
>
> Running TPC-H with scale factor 2, 
> q16 returns 1,160 rows when CBO is on,
> but 24,581 rows when CBO is off.
> See the attachments for details.





[jira] [Updated] (HIVE-11160) Auto-gather column stats

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11160:

Target Version/s: 2.1.0

> Auto-gather column stats
> 
>
> Key: HIVE-11160
> URL: https://issues.apache.org/jira/browse/HIVE-11160
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11160.01.patch, HIVE-11160.02.patch, 
> HIVE-11160.03.patch, HIVE-11160.04.patch, HIVE-11160.05.patch, 
> HIVE-11160.06.patch, HIVE-11160.07.patch, HIVE-11160.08.patch, 
> HIVE-11160.09.patch
>
>
> Hive collects table stats during the INSERT OVERWRITE command when 
> hive.stats.autogather=true is set. Users then need to collect the column stats 
> themselves using the "Analyze" command. With this patch, the column stats will 
> also be collected automatically. More specifically, INSERT OVERWRITE will 
> automatically create new column stats, and INSERT INTO will automatically merge 
> new column stats with existing ones.
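
As a rough illustration of what merging new column stats with existing ones can mean for a numeric column, here is a minimal sketch. The class and its fields are hypothetical and are not Hive's actual stats objects. The minimum, maximum and null count merge exactly; the number of distinct values does not, so the sketch only keeps a lower bound.

```java
public class LongColumnStats {
    public final long min;
    public final long max;
    public final long numNulls;
    /** Lower bound on distinct values; an exact merge would need a sketch such as a bit vector. */
    public final long ndvLowerBound;

    public LongColumnStats(long min, long max, long numNulls, long ndvLowerBound) {
        this.min = min;
        this.max = max;
        this.numNulls = numNulls;
        this.ndvLowerBound = ndvLowerBound;
    }

    /** Combines stats of existing rows with stats of newly inserted rows. */
    public LongColumnStats merge(LongColumnStats other) {
        return new LongColumnStats(
            Math.min(min, other.min),
            Math.max(max, other.max),
            numNulls + other.numNulls,
            // The union has at least as many distinct values as either side.
            Math.max(ndvLowerBound, other.ndvLowerBound));
    }
}
```

For INSERT OVERWRITE no merge is needed, since the old rows are gone and the new stats simply replace the old ones.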





[jira] [Updated] (HIVE-13567) create ColumnStatsAutoGatherContext

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13567:

Target Version/s: 2.1.0

> create ColumnStatsAutoGatherContext
> ---
>
> Key: HIVE-13567
> URL: https://issues.apache.org/jira/browse/HIVE-13567
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>






[jira] [Updated] (HIVE-13566) enable merging of bit vectors for insert into

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13566:

Target Version/s: 2.1.0

> enable merging of bit vectors for insert into
> -
>
> Key: HIVE-13566
> URL: https://issues.apache.org/jira/browse/HIVE-13566
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>






[jira] [Updated] (HIVE-13249) Hard upper bound on number of open transactions

2016-05-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13249:
-
Attachment: HIVE-13249.10.patch

patch 10 for review

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch, HIVE-13249.10.patch, 
> HIVE-13249.2.patch, HIVE-13249.3.patch, HIVE-13249.4.patch, 
> HIVE-13249.5.patch, HIVE-13249.6.patch, HIVE-13249.7.patch, 
> HIVE-13249.8.patch, HIVE-13249.9.patch
>
>
> We need a safeguard, an upper bound on open transactions, to 
> avoid a huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.
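
A minimal sketch of such a hard upper bound, assuming a simple in-memory counter. The class and method names are hypothetical, and the patch itself may track the count differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OpenTxnLimiter {
    private final int maxOpenTxns;
    private final AtomicInteger openTxns = new AtomicInteger();

    public OpenTxnLimiter(int maxOpenTxns) {
        this.maxOpenTxns = maxOpenTxns;
    }

    /** Tries to reserve a slot; returns false once the limit is reached. */
    public boolean tryOpen() {
        while (true) {
            int current = openTxns.get();
            if (current >= maxOpenTxns) {
                return false; // caller should fail the open-transaction request
            }
            if (openTxns.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Releases a slot when a transaction commits or aborts. */
    public void close() {
        openTxns.updateAndGet(n -> n > 0 ? n - 1 : 0);
    }

    public int openCount() {
        return openTxns.get();
    }
}
```

The compare-and-set loop keeps the check and the increment atomic, so concurrent callers cannot push the count past the limit.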





[jira] [Assigned] (HIVE-13662) Set file permission and ACL in file sink operator

2016-05-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-13662:
--

Assignee: Pengcheng Xiong

> Set file permission and ACL in file sink operator
> -
>
> Key: HIVE-13662
> URL: https://issues.apache.org/jira/browse/HIVE-13662
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Pengcheng Xiong
>
> As suggested 
> [here|https://issues.apache.org/jira/browse/HIVE-13572?focusedCommentId=15254438=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15254438].





[jira] [Commented] (HIVE-13662) Set file permission and ACL in file sink operator

2016-05-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281997#comment-15281997
 ] 

Pengcheng Xiong commented on HIVE-13662:


assigned to myself as per [~ashutoshc]'s request. :)

> Set file permission and ACL in file sink operator
> -
>
> Key: HIVE-13662
> URL: https://issues.apache.org/jira/browse/HIVE-13662
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Pengcheng Xiong
>
> As suggested 
> [here|https://issues.apache.org/jira/browse/HIVE-13572?focusedCommentId=15254438=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15254438].





[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions

2016-05-12 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281995#comment-15281995
 ] 

Wei Zheng commented on HIVE-13249:
--

The Exception being caught from startHouseKeeperService is thrown by the class 
instantiation:
{code}
openTxnsCounter = (HouseKeeperService)c.newInstance();
{code}
But I think I can move this line into the try block and make this method not 
throw an exception, so that starting the counter is best effort. I don't think 
we want to fail all txns just because we cannot start the counter service.
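
The best-effort idea can be sketched roughly as follows. The CounterService interface and both implementations are hypothetical stand-ins, not Hive's HouseKeeperService. The point is only that instantiation and start sit inside the same try block, so a failure is logged instead of propagated.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class BestEffortStart {
    /** Hypothetical stand-in for the housekeeping service interface. */
    public interface CounterService {
        void start() throws Exception;
    }

    /** A service that starts normally. */
    public static class WorkingService implements CounterService {
        static final AtomicBoolean started = new AtomicBoolean(false);
        @Override public void start() { started.set(true); }
    }

    /** A service whose constructor fails, mimicking an instantiation error. */
    public static class BrokenService implements CounterService {
        public BrokenService() { throw new IllegalStateException("cannot construct"); }
        @Override public void start() { }
    }

    /** Instantiates and starts the service; returns null instead of throwing on failure. */
    public static CounterService startBestEffort(Class<? extends CounterService> c) {
        try {
            CounterService service = c.getDeclaredConstructor().newInstance();
            service.start();
            return service;
        } catch (Exception e) {
            // Log and continue: a missing counter should not fail every transaction.
            System.err.println("Counter service not started: " + e);
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(startBestEffort(WorkingService.class) != null);
        System.out.println(startBestEffort(BrokenService.class) != null);
    }
}
```

Callers then need to tolerate a null (or otherwise absent) counter, which is the price of not failing startup.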

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, 
> HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, 
> HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch, HIVE-13249.9.patch
>
>
> We need a safeguard, an upper bound on open transactions, to 
> avoid a huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.





[jira] [Commented] (HIVE-13453) Support ORDER BY and windowing clause in partitioning clause with distinct function

2016-05-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281948#comment-15281948
 ] 

Hive QA commented on HIVE-13453:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12803278/HIVE-13453.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 104 failed/errored test(s), 9195 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-bucket_map_join_tez1.q-auto_sortmerge_join_16.q-skewjoin.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-schema_evol_orc_nonvec_mapwork_part.q-mapjoin_decimal.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more
 - did not produce a TEST-*.xml file
TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-load_dyn_part5.q-load_dyn_part2.q-skewjoinopt16.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udf_udaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udf_udaf_stats_opt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_windowing
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_udaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_windowing
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_leadlag
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_char
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_decimal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_varchar
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_special_character_in_tabnames_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union36
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_type_chk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_timestamp_funcs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_gby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_order_null
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_storage_queries
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_limit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_udf_udaf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_windowing
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_decimal
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union_type_chk
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_7

[jira] [Updated] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size

2016-05-12 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13751:
-
Attachment: HIVE-13751.1.patch

> LlapOutputFormatService should have a configurable send buffer size
> ---
>
> Key: HIVE-13751
> URL: https://issues.apache.org/jira/browse/HIVE-13751
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13751.1.patch
>
>
> The Netty channel buffer size is currently hard-coded to 128KB. It should be 
> made configurable.
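
A minimal sketch of reading such a setting with the current 128KB value as the fallback. The property key is invented for the example and is not the name used by the patch, and the Netty wiring itself is left out.

```java
import java.util.Properties;

public class SendBufferConfig {
    /** Current hard-coded value, kept as the default. */
    static final int DEFAULT_SEND_BUFFER_BYTES = 128 * 1024;
    /** Hypothetical key; the real configuration name may differ. */
    static final String KEY = "example.llap.output.send.buffer.size";

    /** Returns the configured size in bytes, or the default if unset or invalid. */
    public static int sendBufferSize(Properties conf) {
        String raw = conf.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_SEND_BUFFER_BYTES;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : DEFAULT_SEND_BUFFER_BYTES;
        } catch (NumberFormatException e) {
            return DEFAULT_SEND_BUFFER_BYTES;
        }
    }
}
```

Falling back to the default on a bad value keeps the service usable when the setting is mistyped.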





[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-12 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Attachment: HIVE-13750.patch

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13750.patch
>
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if the config is on, the Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If the sort columns of the previous 
> shuffle and the partitioning columns of the table match, the reduce sink 
> deduplication optimizer removes the extra shuffle stage, bringing the overhead 
> down to zero. However, if they don’t match, we end up doing an extra shuffle. 
> This can be improved, since we can add the table partition columns as sort 
> columns on the earlier shuffle and avoid this extra shuffle. This ensures that 
> when the query already has a shuffle stage, we are not shuffling the data again. 
> {quote}





[jira] [Comment Edited] (HIVE-13749) Memory leak in Hive Metastore

2016-05-12 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281918#comment-15281918
 ] 

Thejas M Nair edited comment on HIVE-13749 at 5/12/16 6:59 PM:
---

What is retaining them, as per MAT?



was (Author: thejas):
Any ideas to what is causing them to be retained ?


> Memory leak in Hive Metastore
> -
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: Top_Consumers7.html
>
>
> Looking at a 10GB heap dump, a large number of Configuration objects (> 66k 
> instances) are being retained. These objects, along with their retained set, 
> occupy about 95% of the heap space. This leads to HMS crashes every few 
> days.
> I will attach an exported snapshot from Eclipse MAT.





[jira] [Updated] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-12 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13750:
---
Status: Patch Available  (was: In Progress)

> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if the config is on, the Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If the sort columns of the previous 
> shuffle and the partitioning columns of the table match, the reduce sink deduplication 
> optimizer removes the extra shuffle stage, bringing the overhead down to zero. 
> However, if they don’t match, we end up doing an extra shuffle. This can be 
> improved, since we can add the table partition columns as sort columns on the 
> earlier shuffle and avoid this extra shuffle. This ensures that when a 
> query already has a shuffle stage, we do not shuffle the data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-13750) Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer when possible

2016-05-12 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13750 started by Jesus Camacho Rodriguez.
--
> Avoid additional shuffle stage created by Sorted Dynamic Partition Optimizer 
> when possible
> --
>
> Key: HIVE-13750
> URL: https://issues.apache.org/jira/browse/HIVE-13750
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Extend ReduceDedup to remove additional shuffle stage created by sorted 
> dynamic partition optimizer when possible, thus avoiding unnecessary work.
> By [~ashutoshc]:
> {quote}
> Currently, if the config is on, the Sorted Dynamic Partition Optimizer (SDPO) 
> unconditionally adds an extra shuffle stage. If the sort columns of the previous 
> shuffle and the partitioning columns of the table match, the reduce sink deduplication 
> optimizer removes the extra shuffle stage, bringing the overhead down to zero. 
> However, if they don’t match, we end up doing an extra shuffle. This can be 
> improved, since we can add the table partition columns as sort columns on the 
> earlier shuffle and avoid this extra shuffle. This ensures that when a 
> query already has a shuffle stage, we do not shuffle the data again. 
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore

2016-05-12 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281918#comment-15281918
 ] 

Thejas M Nair commented on HIVE-13749:
--

Any idea what is causing them to be retained?


> Memory leak in Hive Metastore
> -
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: Top_Consumers7.html
>
>
> Looking at a 10GB heap dump, a large number of Configuration objects (> 66k 
> instances) are being retained. These objects, along with their retained set, 
> occupy about 95% of the heap space. This leads to HMS crashes every few 
> days.
> I will attach an exported snapshot from Eclipse MAT.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13743:

Attachment: HIVE-13743.patch

This is because of a change in HDFS behavior from 2.6 to 2.8: the API 
hdfsAdmin.getEncryptionZoneForPath(path) used to return null for a non-existent 
path in 2.6, and now throws FNFE.
[~rajesh.balamohan] Can you test this out on a 2.8 cluster? I can't write a unit 
test for this since Hive currently uses Hadoop 2.6.
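
To illustrate the behavior change described above, here is a minimal, self-contained sketch. It is not the content of the attached patch and does not call the real Hadoop API: ZoneLookup, EncryptionCheckSketch, and the paths are hypothetical stand-ins. It only shows one way a caller could tolerate both behaviors, assuming it is acceptable to treat a not-yet-existing path as "not encrypted".

```java
import java.io.FileNotFoundException;
import java.util.Collections;
import java.util.Set;

// Hypothetical stand-in for the encryption zone lookup; NOT the real HDFS API.
interface ZoneLookup {
    // Returns a zone name, or null when the path is not inside any zone.
    // An implementation may throw FileNotFoundException for a missing path.
    String zoneForPath(String path) throws FileNotFoundException;
}

class EncryptionCheckSketch {
    // Handles both observed behaviors: a null return (as described for 2.6)
    // and a FileNotFoundException (as described for 2.8) for a missing path.
    static boolean isPathEncrypted(ZoneLookup lookup, String path) {
        try {
            return lookup.zoneForPath(path) != null;
        } catch (FileNotFoundException e) {
            // Assumption for this sketch: a missing path counts as unencrypted.
            return false;
        }
    }

    public static void main(String[] args) {
        Set<String> existing = Collections.singleton("/warehouse/enc");

        // Emulates the older behavior: null for a non-existent path.
        ZoneLookup nullOnMissing = p -> existing.contains(p) ? "zone1" : null;

        // Emulates the newer behavior: exception for a non-existent path.
        ZoneLookup throwOnMissing = p -> {
            if (!existing.contains(p)) {
                throw new FileNotFoundException("Path not found: " + p);
            }
            return "zone1";
        };

        System.out.println(isPathEncrypted(nullOnMissing, "/warehouse/missing"));
        System.out.println(isPathEncrypted(throwOnMissing, "/warehouse/missing"));
        System.out.println(isPathEncrypted(throwOnMissing, "/warehouse/enc"));
    }
}
```

Whether a missing destination should really be treated as unencrypted, or whether the parent directory should be checked instead, is a design decision for the actual fix.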

> Data move codepath is broken with hive (2.1.0-SNAPSHOT)
> ---
>
> Key: HIVE-13743
> URL: https://issues.apache.org/jira/browse/HIVE-13743
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
> Attachments: HIVE-13743.patch
>
>
> Data move codepath is broken with hive 2.1.0-SNAPSHOT with hadoop 
> 2.8.0-snapshot.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path 
> not found: /apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getEZForPath(FSDirEncryptionZoneOp.java:178)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:7336)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1973)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1376)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:645)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2339)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2335)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2333)
> at org.apache.hadoop.ipc.Client.call(Client.java:1448)
> at org.apache.hadoop.ipc.Client.call(Client.java:1385)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy30.getEZForPath(Unknown 
> Source)/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/
> ...
> ...
> ...
> 2016-05-11T09:40:43,760 ERROR [main]: ql.Driver (:()) - FAILED: Execution 
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to 
> move source 
> hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/.hive-staging_hive_2016-05-11_09-40-42_489_5056654133706433454-1/-ext-10002
>  to destination 
> hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1
> {noformat}
> https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2836
> hdfsEncryptionShim.isPathEncrypted(destf) in Hive could end up throwing 
> FileNotFoundException as the destf is not present yet.  This causes moveFile 
> to fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13743:

Assignee: Ashutosh Chauhan
  Status: Patch Available  (was: Open)

> Data move codepath is broken with hive (2.1.0-SNAPSHOT)
> ---
>
> Key: HIVE-13743
> URL: https://issues.apache.org/jira/browse/HIVE-13743
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13743.patch
>
>
> Data move codepath is broken with hive 2.1.0-SNAPSHOT with hadoop 
> 2.8.0-snapshot.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path 
> not found: /apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getEZForPath(FSDirEncryptionZoneOp.java:178)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:7336)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1973)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1376)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:645)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2339)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2335)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2333)
> at org.apache.hadoop.ipc.Client.call(Client.java:1448)
> at org.apache.hadoop.ipc.Client.call(Client.java:1385)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy30.getEZForPath(Unknown 
> Source)/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/
> ...
> ...
> ...
> 2016-05-11T09:40:43,760 ERROR [main]: ql.Driver (:()) - FAILED: Execution 
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to 
> move source 
> hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/.hive-staging_hive_2016-05-11_09-40-42_489_5056654133706433454-1/-ext-10002
>  to destination 
> hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1
> {noformat}
> https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2836
> hdfsEncryptionShim.isPathEncrypted(destf) in Hive could end up throwing 
> FileNotFoundException as the destf is not present yet.  This causes moveFile 
> to fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13743) Data move codepath is broken with hive (2.1.0-SNAPSHOT)

2016-05-12 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281908#comment-15281908
 ] 

Ashutosh Chauhan edited comment on HIVE-13743 at 5/12/16 6:50 PM:
--

This is because of a change in HDFS behavior from 2.6 to 2.8: the API 
hdfsAdmin.getEncryptionZoneForPath(path) used to return null for a non-existent 
path in 2.6, and now throws FNFE.
[~rajesh.balamohan] Can you test this out on a 2.8 cluster? I can't write a unit 
test for this since Hive currently uses Hadoop 2.6.


was (Author: ashutoshc):
This is because of change in behavior of HDFS from 2.6 to 2.8 wherein api 
hdfsAdmin.getEncryptionZoneForPath(path) used to return null for non-existent 
path in 2.6, now throws FNFE.
[~rajesh.balamohan] Can you test this out in 2.8 cluster? Can't return unit 
test for this since Hive currently uses 2.6 hadoop/

> Data move codepath is broken with hive (2.1.0-SNAPSHOT)
> ---
>
> Key: HIVE-13743
> URL: https://issues.apache.org/jira/browse/HIVE-13743
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13743.patch
>
>
> Data move codepath is broken with hive 2.1.0-SNAPSHOT with hadoop 
> 2.8.0-snapshot.
> {noformat}
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path 
> not found: /apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getEZForPath(FSDirEncryptionZoneOp.java:178)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:7336)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1973)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1376)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:645)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2339)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2335)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2333)
> at org.apache.hadoop.ipc.Client.call(Client.java:1448)
> at org.apache.hadoop.ipc.Client.call(Client.java:1385)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy30.getEZForPath(Unknown 
> Source)/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/
> ...
> ...
> ...
> 2016-05-11T09:40:43,760 ERROR [main]: ql.Driver (:()) - FAILED: Execution 
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to 
> move source 
> hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/.hive-staging_hive_2016-05-11_09-40-42_489_5056654133706433454-1/-ext-10002
>  to destination 
> hdfs://xyz:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_1.db/date_dim1
> {noformat}
> https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L2836
> hdfsEncryptionShim.isPathEncrypted(destf) in Hive could end up throwing 
> FileNotFoundException as the destf is not present yet.  This causes moveFile 
> to fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions

2016-05-12 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281880#comment-15281880
 ] 

Eugene Koifman commented on HIVE-13249:
---

What I meant is:
{noformat}
public OpenTxnsResponse openTxns(OpenTxnRequest rqst) throws MetaException {
  if (openTxnsCounter == null) {
    synchronized (TxnHandler.class) {
      try {
        if (openTxnsCounter == null) {
          startHouseKeeperService(conf,
              Class.forName("org.apache.hadoop.hive.ql.txn.AcidOpenTxnsCounterService"));
        }
      } catch (Exception e) {
        throw new MetaException(e.getMessage());
      }
    }
  }
{noformat}
http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html


One more thing:
startHouseKeeperService() catches Exception and logs but when openTxns() calls 
startHouseKeeperService() it catches and rethrows.

Seems contradictory.  Did you want to fail all txns if this service is not 
available or make best effort?
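
For reference, the double-checked locking idiom discussed above can be sketched in isolation as follows. This is a hedged, self-contained illustration, not the TxnHandler code: LazyServiceSketch, counterService, and starts are hypothetical names, and a plain Object stands in for the housekeeper service. The sketch assumes the guarded field is declared volatile, which the idiom needs in order to be safe under the Java memory model (Java 5 and later).

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyServiceSketch {
    // volatile is what makes the unsynchronized first check safe.
    private static volatile Object counterService;

    // Counts how many times the "service" was actually started.
    static final AtomicInteger starts = new AtomicInteger();

    static Object getOrStart() {
        Object local = counterService;           // first check, no lock
        if (local == null) {
            synchronized (LazyServiceSketch.class) {
                local = counterService;          // second check, under the lock
                if (local == null) {
                    starts.incrementAndGet();    // stands in for starting the service
                    local = new Object();
                    counterService = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(LazyServiceSketch::getOrStart);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("service started " + starts.get() + " time(s)");
    }
}
```

Without the second check inside the synchronized block, two threads that both pass the first check would each start the service, one after the other.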


> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, 
> HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, 
> HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch, HIVE-13249.9.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13249) Hard upper bound on number of open transactions

2016-05-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13249:
-
Attachment: HIVE-13249.9.patch

Thanks for catching that. Patch 9 fixed it.

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, 
> HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, 
> HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch, HIVE-13249.9.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions

2016-05-12 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281868#comment-15281868
 ] 

Eugene Koifman commented on HIVE-13249:
---

{noformat}
public OpenTxnsResponse openTxns(OpenTxnRequest rqst) throws MetaException {
  if (openTxnsCounter == null) {
    try {
      startHouseKeeperService(conf,
          Class.forName("org.apache.hadoop.hive.ql.txn.AcidOpenTxnsCounterService"));
    } catch (Exception e) {
      throw new MetaException(e.getMessage());
    }
  }
{noformat}

This is not thread safe. Concurrent openTxns() calls can create multiple instances 
of AcidOpenTxnsCounterService.
{noformat}
public OpenTxnsResponse openTxns(OpenTxnRequest rqst) throws MetaException {
  if (openTxnsCounter == null) {
    synchronized (TxnHandler.class) {
      try {
        startHouseKeeperService(conf,
            Class.forName("org.apache.hadoop.hive.ql.txn.AcidOpenTxnsCounterService"));
      } catch (Exception e) {
        throw new MetaException(e.getMessage());
      }
    }
  }
{noformat}
would work.

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, 
> HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, 
> HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11417) Create shims for the row by row read path that is backed by VectorizedRowBatch

2016-05-12 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281850#comment-15281850
 ] 

Prasanth Jayachandran commented on HIVE-11417:
--

Changes lgtm, +1

> Create shims for the row by row read path that is backed by VectorizedRowBatch
> --
>
> Key: HIVE-11417
> URL: https://issues.apache.org/jira/browse/HIVE-11417
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.1.0
>
> Attachments: HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch, 
> HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch
>
>
> I'd like to make the default path for reading and writing ORC files to be 
> vectorized. To ensure that Hive can still read row by row, we'll need shims 
> to support the old API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13249) Hard upper bound on number of open transactions

2016-05-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13249:
-
Attachment: HIVE-13249.8.patch

Upload patch 8.

Made maxOpenTxns, numOpenTxns, tooManyOpenTxns volatile;
Changed LOG.warn to LOG.error;
Removed OpenTxnsCounter from MUTEX_KEY;
Moved OpenTxnsCounter housekeeper service startup logic from HiveMetaStore to 
TxnHandler.openTxns.

[~ekoifman] Could you please review?

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch, HIVE-13249.2.patch, 
> HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, 
> HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13621) compute stats in certain cases fails with NPE

2016-05-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13621:
---
Fix Version/s: 2.1.0

> compute stats in certain cases fails with NPE
> -
>
> Key: HIVE-13621
> URL: https://issues.apache.org/jira/browse/HIVE-13621
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Metastore, Metastore
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Vikram Dixit K
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-13621.1.patch, HIVE-13621.2.patch
>
>
> {code}
> FAILED: NullPointerException null
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:693)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:739)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:728)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:183)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13621) compute stats in certain cases fails with NPE

2016-05-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13621:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> compute stats in certain cases fails with NPE
> -
>
> Key: HIVE-13621
> URL: https://issues.apache.org/jira/browse/HIVE-13621
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Metastore, Metastore
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Vikram Dixit K
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-13621.1.patch, HIVE-13621.2.patch
>
>
> {code}
> FAILED: NullPointerException null
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:693)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:739)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:728)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:183)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13621) compute stats in certain cases fails with NPE

2016-05-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281785#comment-15281785
 ] 

Pengcheng Xiong commented on HIVE-13621:


Reran all those Spark tests; none of them fail. Also checked the other failures, 
and they are unrelated. Pushed to master. Thanks [~vikram.dixit] and 
[~hagleitn] for the review and comments!

> compute stats in certain cases fails with NPE
> -
>
> Key: HIVE-13621
> URL: https://issues.apache.org/jira/browse/HIVE-13621
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Metastore, Metastore
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Vikram Dixit K
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13621.1.patch, HIVE-13621.2.patch
>
>
> {code}
> FAILED: NullPointerException null
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:693)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:739)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:728)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:183)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136)
>   at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11417) Create shims for the row by row read path that is backed by VectorizedRowBatch

2016-05-12 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-11417:
-
Attachment: HIVE-11417.patch

This patch:
* addresses the review comments from Prasanth
* fixes a test failure where the schema evolution code from HIVE-13178 didn't 
work properly for vectorized binary -> string conversion.

Note that jenkins doesn't seem to be handling the binary file 
orc-file-11-format.orc even though git included it in the patch as a binary 
diff, which explains the test failures that mention version 11 orc file.

> Create shims for the row by row read path that is backed by VectorizedRowBatch
> --
>
> Key: HIVE-11417
> URL: https://issues.apache.org/jira/browse/HIVE-11417
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.1.0
>
> Attachments: HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch, 
> HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch, HIVE-11417.patch
>
>
> I'd like to make the default path for reading and writing ORC files to be 
> vectorized. To ensure that Hive can still read row by row, we'll need shims 
> to support the old API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13563) Hive Streaming does not honor orc.compress.size and orc.stripe.size table properties

2016-05-12 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281758#comment-15281758
 ] 

Owen O'Malley commented on HIVE-13563:
--

I think the ratio is better at 8 since we have the minor compaction set to run 
when there are 10 deltas, so 8 is a better match to the increase in work load.

How about:

ratio: 8
base compression 128k stripe 128mb
delta compression 16k stripe 16mb
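
The arithmetic behind the proposed numbers can be checked with a small sketch. The class and method names below are hypothetical and are not ORC or Hive configuration keys. The sketch only assumes that the delta sizes are the base sizes divided by the ratio.

```java
class OrcDeltaSizingSketch {
    // Proposed base-to-delta ratio from the comment above.
    static final int RATIO = 8;

    // Delta size derived from the corresponding base size.
    static long deltaSize(long baseSize) {
        return baseSize / RATIO;
    }

    public static void main(String[] args) {
        long baseCompression = 128L * 1024;        // 128k
        long baseStripe = 128L * 1024 * 1024;      // 128mb

        System.out.println("delta compression bytes: " + deltaSize(baseCompression));
        System.out.println("delta stripe bytes: " + deltaSize(baseStripe));
    }
}
```

With a ratio of 8, 128k becomes 16k and 128mb becomes 16mb, matching the values listed above.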



> Hive Streaming does not honor orc.compress.size and orc.stripe.size table 
> properties
> 
>
> Key: HIVE-13563
> URL: https://issues.apache.org/jira/browse/HIVE-13563
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>  Labels: TODOC2.1
> Attachments: HIVE-13563.1.patch
>
>
> According to the doc:
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-HiveQLSyntax
> One should be able to specify tblproperties for many ORC options.
> But the settings for orc.compress.size and orc.stripe.size don't take effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13749) Memory leak in Hive Metastore

2016-05-12 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-13749:
-
Attachment: Top_Consumers7.html

> Memory leak in Hive Metastore
> -
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: Top_Consumers7.html
>
>
> Looking at a 10GB heap dump, a large number of Configuration objects (> 66k 
> instances) are being retained. These objects, along with their retained set, 
> occupy about 95% of the heap space. This leads to HMS crashes every few 
> days.
> I will attach an exported snapshot from Eclipse MAT.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13697) ListBucketing feature does not support uppercase string.

2016-05-12 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-13697:

Status: Patch Available  (was: In Progress)

ROOT-CAUSE:

BaseSemanticAnalyzer applies toLowerCase() when reading the skewed values from the 
AST node. Hence skewed values are stored in lower case only.

{code}
hive> desc formatted testskew2;
OK
# col_name              data_type               comment

id                      int
a                       string

# Detailed Table Information
Database:               default
Owner:                  hdfs
CreateTime:             Thu May 12 18:37:20 EEST 2016
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs:/user/hive/warehouse/testskew2
Table Type:             MANAGED_TABLE
Table Parameters:
        transient_lastDdlTime   1463067440

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Stored As SubDirectories:       Yes
Skewed Columns:         [a]
Skewed Values:          [[aus], [us]] < !!! ERROR !!!
Storage Desc Params:
        serialization.format    1
{code}

SOLUTION:

Remove unnecessary toLowerCase() operator.
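
The effect of the root cause can be reproduced with a small stand-alone sketch. This is not the BaseSemanticAnalyzer code. SkewedValueSketch and its methods are hypothetical, and the real directory selection in Hive is more involved. The sketch only assumes that row values are compared case-sensitively against the stored skewed values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkewedValueSketch {
    static final String DEFAULT_DIR = "HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME";

    // Stores the declared skewed values, optionally lowercasing them
    // (lowerCase = true mimics the behavior described in the root cause).
    static List<String> storeSkewedValues(List<String> declared, boolean lowerCase) {
        List<String> stored = new ArrayList<>();
        for (String value : declared) {
            stored.add(lowerCase ? value.toLowerCase() : value);
        }
        return stored;
    }

    // Picks a dedicated directory only when the row value matches a stored value exactly.
    static String directoryFor(String rowValue, List<String> stored) {
        return stored.contains(rowValue) ? "a=" + rowValue : DEFAULT_DIR;
    }

    public static void main(String[] args) {
        List<String> declared = Arrays.asList("aus", "US");

        List<String> lowered = storeSkewedValues(declared, true);
        List<String> preserved = storeSkewedValues(declared, false);

        System.out.println("with toLowerCase:    " + directoryFor("US", lowered));
        System.out.println("without toLowerCase: " + directoryFor("US", preserved));
    }
}
```

With lowercasing, the stored value "us" never matches the row value "US", so those rows fall into the default directory, which is consistent with only the "aus" directory being created in the report.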

> ListBucketing feature does not support uppercase string.
> 
>
> Key: HIVE-13697
> URL: https://issues.apache.org/jira/browse/HIVE-13697
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 1.2.1
> Environment: 1.2.1
>Reporter: Hao Zhu
>Assignee: Oleksiy Sayankin
>Priority: Critical
> Attachments: HIVE-13697.1.patch
>
>
> This is the feature:
> https://cwiki.apache.org/confluence/display/Hive/ListBucketing
> 1. Good example:
> {code}
> CREATE TABLE testskew (id INT, a STRING)
> SKEWED BY (a) ON ('abc', 'xyz') STORED AS DIRECTORIES;
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
>  INSERT OVERWRITE TABLE testskew 
>  SELECT 123,'abc' FROM dual
>  union all
>  SELECT 123,'xyz' FROM dual
>  union all
>  SELECT 123,'others' FROM dual;
> {code}
> {code}
> # hadoop fs -ls /user/hive/warehouse/testskew
> Found 3 items
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/a=abc
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/a=xyz
> {code}
> This is good, because both "abc" and "xyz" directories got created.
> 2. Bad example -- This is the issue
> {code}
> CREATE TABLE testskew2 (id INT, a STRING)
> SKEWED BY (a) ON ('aus', 'US') STORED AS DIRECTORIES;
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
>  INSERT OVERWRITE TABLE testskew2 
>  SELECT 123, 'aus' FROM dual
>  union all
>  SELECT 123, 'US' FROM dual
>  union all
>  SELECT 123, 'others' FROM dual;
> {code}
> As you can see, only the "aus" directory got created...
> {code}
> # hadoop fs -ls /user/hive/warehouse/testskew2
> Found 2 items
> drwxrwxrwx   - mapr mapr  1 2016-05-05 15:11
> /user/hive/warehouse/testskew2/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
> drwxrwxrwx   - mapr mapr  1 2016-05-05 15:11
> /user/hive/warehouse/testskew2/a=aus
> {code}





[jira] [Updated] (HIVE-13697) ListBucketing feature does not support uppercase string.

2016-05-12 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-13697:

Attachment: HIVE-13697.1.patch

> ListBucketing feature does not support uppercase string.
> 
>
> Key: HIVE-13697
> URL: https://issues.apache.org/jira/browse/HIVE-13697
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 1.2.1
> Environment: 1.2.1
>Reporter: Hao Zhu
>Assignee: Oleksiy Sayankin
>Priority: Critical
> Attachments: HIVE-13697.1.patch
>
>
> This is the feature:
> https://cwiki.apache.org/confluence/display/Hive/ListBucketing
> 1. Good example:
> {code}
> CREATE TABLE testskew (id INT, a STRING)
> SKEWED BY (a) ON ('abc', 'xyz') STORED AS DIRECTORIES;
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
>  INSERT OVERWRITE TABLE testskew 
>  SELECT 123,'abc' FROM dual
>  union all
>  SELECT 123,'xyz' FROM dual
>  union all
>  SELECT 123,'others' FROM dual;
> {code}
> {code}
> # hadoop fs -ls /user/hive/warehouse/testskew
> Found 3 items
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/a=abc
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/a=xyz
> {code}
> This is good, because both "abc" and "xyz" directories got created.
> 2. Bad example -- This is the issue
> {code}
> CREATE TABLE testskew2 (id INT, a STRING)
> SKEWED BY (a) ON ('aus', 'US') STORED AS DIRECTORIES;
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
>  INSERT OVERWRITE TABLE testskew2 
>  SELECT 123, 'aus' FROM dual
>  union all
>  SELECT 123, 'US' FROM dual
>  union all
>  SELECT 123, 'others' FROM dual;
> {code}
> As you can see, only the "aus" directory got created...
> {code}
> # hadoop fs -ls /user/hive/warehouse/testskew2
> Found 2 items
> drwxrwxrwx   - mapr mapr  1 2016-05-05 15:11
> /user/hive/warehouse/testskew2/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
> drwxrwxrwx   - mapr mapr  1 2016-05-05 15:11
> /user/hive/warehouse/testskew2/a=aus
> {code}





[jira] [Work started] (HIVE-13697) ListBucketing feature does not support uppercase string.

2016-05-12 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13697 started by Oleksiy Sayankin.
---
> ListBucketing feature does not support uppercase string.
> 
>
> Key: HIVE-13697
> URL: https://issues.apache.org/jira/browse/HIVE-13697
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 1.2.1
> Environment: 1.2.1
>Reporter: Hao Zhu
>Assignee: Oleksiy Sayankin
>Priority: Critical
>
> This is the feature:
> https://cwiki.apache.org/confluence/display/Hive/ListBucketing
> 1. Good example:
> {code}
> CREATE TABLE testskew (id INT, a STRING)
> SKEWED BY (a) ON ('abc', 'xyz') STORED AS DIRECTORIES;
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
>  INSERT OVERWRITE TABLE testskew 
>  SELECT 123,'abc' FROM dual
>  union all
>  SELECT 123,'xyz' FROM dual
>  union all
>  SELECT 123,'others' FROM dual;
> {code}
> {code}
> # hadoop fs -ls /user/hive/warehouse/testskew
> Found 3 items
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/a=abc
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/a=xyz
> {code}
> This is good, because both "abc" and "xyz" directories got created.
> 2. Bad example -- This is the issue
> {code}
> CREATE TABLE testskew2 (id INT, a STRING)
> SKEWED BY (a) ON ('aus', 'US') STORED AS DIRECTORIES;
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
>  INSERT OVERWRITE TABLE testskew2 
>  SELECT 123, 'aus' FROM dual
>  union all
>  SELECT 123, 'US' FROM dual
>  union all
>  SELECT 123, 'others' FROM dual;
> {code}
> As you can see, only the "aus" directory got created...
> {code}
> # hadoop fs -ls /user/hive/warehouse/testskew2
> Found 2 items
> drwxrwxrwx   - mapr mapr  1 2016-05-05 15:11
> /user/hive/warehouse/testskew2/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
> drwxrwxrwx   - mapr mapr  1 2016-05-05 15:11
> /user/hive/warehouse/testskew2/a=aus
> {code}





[jira] [Commented] (HIVE-9615) Provide limit context for storage handlers

2016-05-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281674#comment-15281674
 ] 

Hive QA commented on HIVE-9615:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12700308/HIVE-9615.2.patch.txt

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/245/testReport
Console output: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/245/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-245/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-245/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   38797d2..64c96e1  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 38797d2 HIVE-13670 : Improve Beeline connect/reconnect semantics 
(Sushanth Sowmyan, reviewed by Thejas Nair)
+ git clean -f -d
Removing 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregatePullUpConstantsRule.java
Removing 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveProjectFilterPullUpConstantsRule.java
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at 64c96e1 HIVE-13726 : Improve dynamic partition loading VI 
(Ashutosh Chauhan via Rui Li)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12700308 - PreCommit-HIVE-MASTER-Build

> Provide limit context for storage handlers
> --
>
> Key: HIVE-9615
> URL: https://issues.apache.org/jira/browse/HIVE-9615
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9615.1.patch.txt, HIVE-9615.2.patch.txt
>
>
> Propagate limit context generated from GlobalLimitOptimizer to storage 
> handlers.





[jira] [Commented] (HIVE-13068) Disable Hive ConstantPropagate optimizer when CBO has optimized the plan II

2016-05-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281666#comment-15281666
 ] 

Hive QA commented on HIVE-13068:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12803199/HIVE-13068.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 187 failed/errored test(s), 9983 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_sortmerge_join_7.q-tez_union_group_by.q-orc_merge9.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-schema_evol_orc_nonvec_mapwork_part.q-mapjoin_decimal.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_explain
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_subq_not_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udaf_percentile_approx_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_subq_not_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_colstats_all_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropWhen
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_genericudf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join_merge
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_alt_syntax
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_reorder
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_masking_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonblock_op_deduplicate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_expr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_repeated_alias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_recursive_dir
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semijoin4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs

[jira] [Updated] (HIVE-13726) Improve dynamic partition loading VI

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13726:

Fix Version/s: 2.1.0

> Improve dynamic partition loading VI
> 
>
> Key: HIVE-13726
> URL: https://issues.apache.org/jira/browse/HIVE-13726
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 2.1.0
>
> Attachments: HIVE-13726.2.patch, HIVE-13726.patch
>
>
> Parallelize deletes and other refactoring.





[jira] [Updated] (HIVE-13726) Improve dynamic partition loading VI

2016-05-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13726:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Did a QA run; the patch introduced no new failures. Pushed to master. Thanks, 
Rui, for the review.

> Improve dynamic partition loading VI
> 
>
> Key: HIVE-13726
> URL: https://issues.apache.org/jira/browse/HIVE-13726
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13726.2.patch, HIVE-13726.patch
>
>
> Parallelize deletes and other refactoring.





[jira] [Commented] (HIVE-13507) Improved logging for ptest

2016-05-12 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281584#comment-15281584
 ] 

Sergio Peña commented on HIVE-13507:


Those tests are not failing in the last completed build. I don't know what 
happened, but I'll continue watching the builds.

> Improved logging for ptest
> --
>
> Key: HIVE-13507
> URL: https://issues.apache.org/jira/browse/HIVE-13507
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.1.0
>
> Attachments: HIVE-13507.01.patch, HIVE-13507.02.patch
>
>
> NO PRECOMMIT TESTS
> Include information about batch runtimes, outlier lists, host completion 
> times, etc. Try identifying tests which cause the build to take a long time 
> while holding onto resources.





[jira] [Updated] (HIVE-13696) Monitor fair-scheduler.xml and automatically update/validate jobs submitted to fair-scheduler

2016-05-12 Thread Reuben Kuhnert (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuben Kuhnert updated HIVE-13696:
--
Status: Patch Available  (was: Open)

> Monitor fair-scheduler.xml and automatically update/validate jobs submitted 
> to fair-scheduler
> -
>
> Key: HIVE-13696
> URL: https://issues.apache.org/jira/browse/HIVE-13696
> Project: Hive
>  Issue Type: Improvement
>Reporter: Reuben Kuhnert
>Assignee: Reuben Kuhnert
> Attachments: HIVE-13696.01.patch, HIVE-13696.02.patch, 
> HIVE-13696.06.patch
>
>
> Ensure that jobs are placed into the correct queue according to 
> {{fair-scheduler.xml}}. Jobs should be placed into the correct queue, and 
> users should not be able to submit jobs to queues they do not have access to.
> This patch builds on the existing functionality in {{FairSchedulerShim}} to 
> route jobs to user-specific queue based on {{fair-scheduler.xml}} 
> configuration (leveraging the Yarn {{QueuePlacementPolicy}} class). In 
> addition to configuring job routing at session connect (current behavior), 
> the routing is validated per submission to yarn (when impersonation is off). 
> A {{FileSystemWatcher}} class is included to monitor changes in the 
> {{fair-scheduler.xml}} file (so updates are automatically reloaded when the 
> file pointed to by {{yarn.scheduler.fair.allocation.file}} is changed).
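
The patch's {{FileSystemWatcher}} is not shown in this message, so the following is only a sketch of the general idea using the JDK's {{java.nio.file.WatchService}}. The class name {{AllocationFileWatcher}}, its constructor, and the {{onChange}} callback are assumptions made for illustration and are not taken from the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

// Hypothetical watcher: runs a callback whenever the given file changes.
class AllocationFileWatcher implements Runnable {
    private final Path file;
    private final Runnable onChange;

    AllocationFileWatcher(Path file, Runnable onChange) {
        this.file = file.toAbsolutePath();
        this.onChange = onChange;
    }

    // True when a directory event refers to the watched file.
    boolean concerns(Path eventContext) {
        return file.getFileName().equals(eventContext.getFileName());
    }

    @Override
    public void run() {
        // WatchService observes directories, so register the parent directory
        // and filter the events down to the one file of interest.
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            file.getParent().register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_MODIFY);
            while (!Thread.currentThread().isInterrupted()) {
                WatchKey key = watcher.take(); // blocks until events arrive
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        continue; // events were lost; nothing to inspect
                    }
                    if (concerns((Path) event.context())) {
                        onChange.run(); // e.g. reload the queue placement policy
                    }
                }
                if (!key.reset()) {
                    break; // directory is no longer accessible
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would typically run this on a daemon thread, passing the path configured in {{yarn.scheduler.fair.allocation.file}} and a callback that re-reads the file.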





[jira] [Updated] (HIVE-13696) Monitor fair-scheduler.xml and automatically update/validate jobs submitted to fair-scheduler

2016-05-12 Thread Reuben Kuhnert (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuben Kuhnert updated HIVE-13696:
--
Attachment: HIVE-13696.06.patch

> Monitor fair-scheduler.xml and automatically update/validate jobs submitted 
> to fair-scheduler
> -
>
> Key: HIVE-13696
> URL: https://issues.apache.org/jira/browse/HIVE-13696
> Project: Hive
>  Issue Type: Improvement
>Reporter: Reuben Kuhnert
>Assignee: Reuben Kuhnert
> Attachments: HIVE-13696.01.patch, HIVE-13696.02.patch, 
> HIVE-13696.06.patch
>
>
> Ensure that jobs are placed into the correct queue according to 
> {{fair-scheduler.xml}}. Jobs should be placed into the correct queue, and 
> users should not be able to submit jobs to queues they do not have access to.
> This patch builds on the existing functionality in {{FairSchedulerShim}} to 
> route jobs to user-specific queue based on {{fair-scheduler.xml}} 
> configuration (leveraging the Yarn {{QueuePlacementPolicy}} class). In 
> addition to configuring job routing at session connect (current behavior), 
> the routing is validated per submission to yarn (when impersonation is off). 
> A {{FileSystemWatcher}} class is included to monitor changes in the 
> {{fair-scheduler.xml}} file (so updates are automatically reloaded when the 
> file pointed to by {{yarn.scheduler.fair.allocation.file}} is changed).





[jira] [Updated] (HIVE-13696) Monitor fair-scheduler.xml and automatically update/validate jobs submitted to fair-scheduler

2016-05-12 Thread Reuben Kuhnert (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuben Kuhnert updated HIVE-13696:
--
Status: Open  (was: Patch Available)

> Monitor fair-scheduler.xml and automatically update/validate jobs submitted 
> to fair-scheduler
> -
>
> Key: HIVE-13696
> URL: https://issues.apache.org/jira/browse/HIVE-13696
> Project: Hive
>  Issue Type: Improvement
>Reporter: Reuben Kuhnert
>Assignee: Reuben Kuhnert
> Attachments: HIVE-13696.01.patch, HIVE-13696.02.patch, 
> HIVE-13696.06.patch
>
>
> Ensure that jobs are placed into the correct queue according to 
> {{fair-scheduler.xml}}. Jobs should be placed into the correct queue, and 
> users should not be able to submit jobs to queues they do not have access to.
> This patch builds on the existing functionality in {{FairSchedulerShim}} to 
> route jobs to user-specific queue based on {{fair-scheduler.xml}} 
> configuration (leveraging the Yarn {{QueuePlacementPolicy}} class). In 
> addition to configuring job routing at session connect (current behavior), 
> the routing is validated per submission to yarn (when impersonation is off). 
> A {{FileSystemWatcher}} class is included to monitor changes in the 
> {{fair-scheduler.xml}} file (so updates are automatically reloaded when the 
> file pointed to by {{yarn.scheduler.fair.allocation.file}} is changed).





[jira] [Commented] (HIVE-13293) Query occurs performance degradation after enabling parallel order by for Hive on Spark

2016-05-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281488#comment-15281488
 ] 

Hive QA commented on HIVE-13293:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12803166/HIVE-13293.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 68 failed/errored test(s), 9194 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-bucket_map_join_tez1.q-auto_sortmerge_join_16.q-skewjoin.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-schema_evol_orc_nonvec_mapwork_part.q-mapjoin_decimal.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - 
did not produce a TEST-*.xml file
TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_transform.q-union_remove_7.q-date_udf.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_stats
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_noskew_multi_single_reducer
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_ppr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join34
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join35
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_17
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_3
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparatorWithinDagPriority
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote

[jira] [Assigned] (HIVE-13697) ListBucketing feature does not support uppercase string.

2016-05-12 Thread Oleksiy Sayankin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin reassigned HIVE-13697:
---

Assignee: Oleksiy Sayankin

> ListBucketing feature does not support uppercase string.
> 
>
> Key: HIVE-13697
> URL: https://issues.apache.org/jira/browse/HIVE-13697
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 1.2.1
> Environment: 1.2.1
>Reporter: Hao Zhu
>Assignee: Oleksiy Sayankin
>Priority: Critical
>
> This is the feature:
> https://cwiki.apache.org/confluence/display/Hive/ListBucketing
> 1. Good example:
> {code}
> CREATE TABLE testskew (id INT, a STRING)
> SKEWED BY (a) ON ('abc', 'xyz') STORED AS DIRECTORIES;
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
>  INSERT OVERWRITE TABLE testskew 
>  SELECT 123,'abc' FROM dual
>  union all
>  SELECT 123,'xyz' FROM dual
>  union all
>  SELECT 123,'others' FROM dual;
> {code}
> {code}
> # hadoop fs -ls /user/hive/warehouse/testskew
> Found 3 items
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/a=abc
> drwxrwxrwx   - mapr mapr  1 2016-05-05 14:56
> /user/hive/warehouse/testskew/a=xyz
> {code}
> This is good, because both "abc" and "xyz" directories got created.
> 2. Bad example -- This is the issue
> {code}
> CREATE TABLE testskew2 (id INT, a STRING)
> SKEWED BY (a) ON ('aus', 'US') STORED AS DIRECTORIES;
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
>  INSERT OVERWRITE TABLE testskew2 
>  SELECT 123, 'aus' FROM dual
>  union all
>  SELECT 123, 'US' FROM dual
>  union all
>  SELECT 123, 'others' FROM dual;
> {code}
> As you can see, only the "aus" directory got created...
> {code}
> # hadoop fs -ls /user/hive/warehouse/testskew2
> Found 2 items
> drwxrwxrwx   - mapr mapr  1 2016-05-05 15:11
> /user/hive/warehouse/testskew2/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME
> drwxrwxrwx   - mapr mapr  1 2016-05-05 15:11
> /user/hive/warehouse/testskew2/a=aus
> {code}





[jira] [Updated] (HIVE-13338) Differences in vectorized_casts.q output for vectorized and non-vectorized runs

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13338:

Status: Patch Available  (was: In Progress)

Jump on the merry-go-round again.

> Differences in vectorized_casts.q output for vectorized and non-vectorized 
> runs
> ---
>
> Key: HIVE-13338
> URL: https://issues.apache.org/jira/browse/HIVE-13338
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13338.01.patch, HIVE-13338.02.patch
>
>
> Turn off vectorization and you get different results.





[jira] [Updated] (HIVE-13338) Differences in vectorized_casts.q output for vectorized and non-vectorized runs

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13338:

Attachment: HIVE-13338.02.patch

> Differences in vectorized_casts.q output for vectorized and non-vectorized 
> runs
> ---
>
> Key: HIVE-13338
> URL: https://issues.apache.org/jira/browse/HIVE-13338
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13338.01.patch, HIVE-13338.02.patch
>
>
> Turn off vectorization and you get different results.





[jira] [Updated] (HIVE-13338) Differences in vectorized_casts.q output for vectorized and non-vectorized runs

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13338:

Status: In Progress  (was: Patch Available)

> Differences in vectorized_casts.q output for vectorized and non-vectorized 
> runs
> ---
>
> Key: HIVE-13338
> URL: https://issues.apache.org/jira/browse/HIVE-13338
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13338.01.patch
>
>
> Turn off vectorization and you get different results.





[jira] [Resolved] (HIVE-10566) LLAP: Vector row extraction allocates new extractors per process method call instead of just once

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-10566.
-
   Resolution: Fixed
Fix Version/s: (was: 1.3.0)

> LLAP: Vector row extraction allocates new extractors per process method call 
> instead of just once
> -
>
> Key: HIVE-10566
> URL: https://issues.apache.org/jira/browse/HIVE-10566
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Extractors for unused columns (common for tables with many columns) are 
> created for each batch instead of just once.
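
The description above concerns extractors being rebuilt for every batch. A minimal sketch of the alternative, allocating them once at setup and reusing them per batch, might look like the following. Everything here is hypothetical (`ExtractorReuseSketch`, `ColumnExtractor`, `sumBatch`, and the `long[][]` batch model are assumptions for illustration); it does not use Hive's vector row extraction classes and is not the change that resolved this issue.

```java
// Illustrative sketch only: hypothetical names, not Hive's actual classes.
class ExtractorReuseSketch {

    /** Stand-in for a per-column extractor that is costly to construct. */
    static final class ColumnExtractor {
        static int created = 0; // counts constructions, for demonstration only
        private final int columnIndex;

        ColumnExtractor(int columnIndex) {
            this.columnIndex = columnIndex;
            created++;
        }

        long extract(long[][] batch, int row) {
            return batch[row][columnIndex];
        }
    }

    private final ColumnExtractor[] extractors;

    /** Builds one extractor per projected column, once, at setup time. */
    ExtractorReuseSketch(int[] projectedColumns) {
        extractors = new ColumnExtractor[projectedColumns.length];
        for (int i = 0; i < projectedColumns.length; i++) {
            extractors[i] = new ColumnExtractor(projectedColumns[i]);
        }
    }

    /** Called per batch; reuses the extractors instead of allocating new ones. */
    long sumBatch(long[][] batch) {
        long sum = 0;
        for (int row = 0; row < batch.length; row++) {
            for (ColumnExtractor extractor : extractors) {
                sum += extractor.extract(batch, row);
            }
        }
        return sum;
    }
}
```

In this sketch the construction count stays equal to the number of projected columns no matter how many batches are processed, which is the property the issue title asks for.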





[jira] [Commented] (HIVE-10566) LLAP: Vector row extraction allocates new extractors per process method call instead of just once

2016-05-12 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281414#comment-15281414
 ] 

Matt McCline commented on HIVE-10566:
-

No longer an issue after VecText enhancement.

> LLAP: Vector row extraction allocates new extractors per process method call 
> instead of just once
> -
>
> Key: HIVE-10566
> URL: https://issues.apache.org/jira/browse/HIVE-10566
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Affects Versions: 1.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> Extractors for unused columns (common for tables with many columns) are 
> created for each batch instead of just once.





[jira] [Resolved] (HIVE-12562) Enabling native fast hash table can cause incorrect results

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-12562.
-
  Resolution: Duplicate
Release Note: HIVE-13682

> Enabling native fast hash table can cause incorrect results
> ---
>
> Key: HIVE-12562
> URL: https://issues.apache.org/jira/browse/HIVE-12562
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Matt McCline
>
> Enabling "hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled" 
> causes incorrect results when running with LLAP.
> I believe this does not happen for simple container runs. However, it's 
> possible that caching of these tables, or using the same table more than once 
> causes issues - which may be seen with container reuse.
> The results vary by a small percentage.
> e.g. 82270, 82267 <- Two results for the same query run back to back.
> cc [~mmccline]





[jira] [Resolved] (HIVE-13245) VectorDeserializeRow throws IndexOutOfBoundsException

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-13245.
-
  Resolution: Duplicate
Release Note: HIVE-13682

> VectorDeserializeRow throws IndexOutOfBoundsException
> -
>
> Key: HIVE-13245
> URL: https://issues.apache.org/jira/browse/HIVE-13245
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>
> When running the following query on TPCDS 1000 scale, VectorDeserializeRow threw 
> ArrayIndexOutOfBoundsException:
> {code:title=Query}
> SELECT `customer_address`.`ca_zip`   AS `ca_zip`, 
>`customer_demographics`.`cd_education_status` AS 
> `cd_education_status`, 
>Sum(`store_sales`.`ss_net_paid`)  AS `SUM:SS_NET_PAID:ok` 
> FROM   `store_sales` `store_sales` 
>INNER JOIN `customer` `customer` 
>ON ( `store_sales`.`ss_customer_sk` = 
>   `customer`.`c_customer_sk` ) 
>INNER JOIN `customer_address` `customer_address` 
>ON ( `customer`.`c_current_addr_sk` = 
>   `customer_address`.`ca_address_sk` ) 
>INNER JOIN `customer_demographics` `customer_demographics` 
>ON ( `customer`.`c_current_cdemo_sk` = 
> `customer_demographics`.`cd_demo_sk` ) 
> WHERE  ( `customer`.`c_first_sales_date_sk` > 2452300 
>  AND `customer_demographics`.`cd_gender` = 'F' 
>  AND `customer`.`c_current_addr_sk` IS NOT NULL 
>  AND `store_sales`.`ss_sold_date_sk` IS NOT NULL 
>  AND `customer`.`c_current_cdemo_sk` IS NOT NULL ) 
> GROUP  BY `ca_zip`, 
>   `cd_education_status`;
> {code}
> {code:title=Exception}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:354)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:59)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:59)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:36)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:356)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:62)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
>   ... 17 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:392)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:143)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:121)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> 

[jira] [Resolved] (HIVE-12896) IndexArrayOutOfBoundsException during vectorized map join

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-12896.
-
  Resolution: Duplicate
Release Note: HIVE-13682

> IndexArrayOutOfBoundsException during vectorized map join
> -
>
> Key: HIVE-12896
> URL: https://issues.apache.org/jira/browse/HIVE-12896
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Jason Dere
>Assignee: Gopal V
> Attachments: HIVE-12896.tar.gz, query.explain.txt
>
>
> Trying a simple join on a couple of the TPCDS tables. Query works with 
> vectorization disabled.
> {noformat}
>  select c_customer_sk, c_customer_id from 
> tpcds_bin_partitioned_orc_10.customer, 
> tpcds_bin_partitioned_orc_10.customer_demographics where c_current_cdemo_sk = 
> cd_demo_sk limit 20
> {noformat}
> {noformat}
> ], TaskAttempt 3 failed, info=[Error: Failure while running task: 
> attempt_1448429572030_8225_4_01_03_3:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row 
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:351)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:59)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:59)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:36)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:354)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row 
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
>   ... 17 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:385)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:115)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:852)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:114)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:168)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
>   ... 18 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:152)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow$StringReaderByValue.apply(VectorDeserializeRow.java:345)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:684)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultSingleValue(VectorMapJoinGenerateResultOperator.java:183)
>   at 
> 

[jira] [Commented] (HIVE-13682) EOFException with fast hashtable

2016-05-12 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281412#comment-15281412
 ] 

Matt McCline commented on HIVE-13682:
-

More extensive Fast Hash Table unit tests uncovered problems in reading the 
lengths of following Fast Hash Map records.
See VectorMapJoinFastValueStore.

There were also a few minor issues with maintaining the keysAssigned counter.

Lots of new Unit Tests for Fast SerializeWrite/DeserializeRead.

> EOFException with fast hashtable
> 
>
> Key: HIVE-13682
> URL: https://issues.apache.org/jira/browse/HIVE-13682
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Matt McCline
> Attachments: HIVE-13682.01.patch
>
>
> While testing something else on recent master, w/Tez 0.8.3, this happened 
> (TPCDS q27)
> {noformat}
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399)
>   ... 20 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131)
>   ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104)
>   ... 5 more
> Caused by: java.io.EOFException
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98)
>   ... 9 more
> {noformat}
> There's no error if fast hashtable is disabled.





[jira] [Updated] (HIVE-13682) EOFException with fast hashtable

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13682:

Attachment: HIVE-13682.01.patch

> EOFException with fast hashtable
> 
>
> Key: HIVE-13682
> URL: https://issues.apache.org/jira/browse/HIVE-13682
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Matt McCline
> Attachments: HIVE-13682.01.patch
>
>
> While testing something else on recent master, w/Tez 0.8.3, this happened 
> (TPCDS q27)
> {noformat}
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399)
>   ... 20 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131)
>   ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104)
>   ... 5 more
> Caused by: java.io.EOFException
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98)
>   ... 9 more
> {noformat}
> There's no error if fast hashtable is disabled.





[jira] [Updated] (HIVE-13682) EOFException with fast hashtable

2016-05-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13682:

Status: Patch Available  (was: Open)

> EOFException with fast hashtable
> 
>
> Key: HIVE-13682
> URL: https://issues.apache.org/jira/browse/HIVE-13682
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Matt McCline
> Attachments: HIVE-13682.01.patch
>
>
> While testing something else on recent master, w/Tez 0.8.3, this happened 
> (TPCDS q27)
> {noformat}
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:399)
>   ... 20 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache$1.call(LlapObjectCache.java:131)
>   ... 4 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.EOFException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:106)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:304)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104)
>   ... 5 more
> Caused by: java.io.EOFException
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
>   at 
> org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableDeserializeRead.readCheckNull(BinarySortableDeserializeRead.java:182)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:83)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:181)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:98)
>   ... 9 more
> {noformat}
> There's no error if fast hashtable is disabled.





[jira] [Commented] (HIVE-13621) compute stats in certain cases fails with NPE

2016-05-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281366#comment-15281366
 ] 

Hive QA commented on HIVE-13621:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12803154/HIVE-13621.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 66 failed/errored test(s), 9215 tests 
executed
*Failed tests:*
{noformat}
TestCompactor - did not produce a TEST-*.xml file
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not 
produce a TEST-*.xml file
TestMiniTezCliDriver-smb_cache.q-transform_ppr2.q-vector_outer_join0.q-and-5-more
 - did not produce a TEST-*.xml file
TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-skewjoinopt15.q-join39.q-avro_joins_native.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_avro_decimal_native
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_column_access_stats
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby1_map_skew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby5_map
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby6_map
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_position
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join18_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join30
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_array
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_unqual4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_reorder3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ptf_matchpath
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_udf_percentile
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union24
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union31
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union34
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_groupby_3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_2
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorWithinDagPriority
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks
org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf
org.apache.hadoop.hive.metastore.TestHiveMetaStoreWithEnvironmentContext.testEnvironmentContext
org.apache.hadoop.hive.metastore.TestMetaStoreInitListener.testMetaStoreInitListener
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.org.apache.hadoop.hive.metastore.TestMetaStoreMetrics

[jira] [Updated] (HIVE-13747) NullPointerException thrown by Executors causes job can't be finished

2016-05-12 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HIVE-13747:
-
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-7292

> NullPointerException thrown by Executors causes job can't be finished
> -
>
> Key: HIVE-13747
> URL: https://issues.apache.org/jira/browse/HIVE-13747
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Walter Su
>
> stderr log from one executor.
> {noformat}
> 16/05/12 15:56:51 INFO exec.MapJoinOperator: Initializing operator MAPJOIN[10]
> 16/05/12 15:56:51 INFO exec.CommonJoinOperator: JOIN 
> struct<_col0:int,_col1:string,_col2:int,_col3:string> totalsz = 4
> 16/05/12 15:56:51 INFO spark.HashTableLoader: *** Load from HashTable for 
> input file: hdfs://test-cluster/user/hive/warehouse-store2/pokes/kv1.txt
> 16/05/12 15:56:51 INFO spark.HashTableLoader: Load back all hashtable 
> files from tmp folder 
> uri:hdfs://test-cluster/tmp/hive/hadoop/4062fcea-6759-4340-b4be-5e83181e68bf/hive_2016-05-12_15-56-50_196_4198620026582283764-1/-mr-10004/HashTable-Stage-1/MapJoin-mapfile11--.hashtable
> 16/05/12 15:56:51 INFO exec.MapJoinOperator: Exception loading hash tables. 
> Clearing partially loaded hash table containers.
> 16/05/12 15:56:51 ERROR executor.Executor: Exception in task 0.0 in stage 3.0 
> (TID 3)
> java.lang.RuntimeException: Map operator initialization failed: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:121)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:55)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:57)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieveAsync(ObjectCache.java:63)
>   at 
> org.apache.hadoop.hive.ql.exec.ObjectCacheWrapper.retrieveAsync(ObjectCacheWrapper.java:46)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:173)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:457)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:365)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:112)
>   ... 15 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:151)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:299)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:180)
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:176)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ObjectCache.retrieve(ObjectCache.java:55)
>   ... 23 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.isDedicatedCluster(SparkUtilities.java:118)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:158)
>   at 
> 

[jira] [Updated] (HIVE-13746) Data duplication when insert overwrite

2016-05-12 Thread Bill Wailliam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Wailliam updated HIVE-13746:
-
Description: Data duplication when insert overwrite. The old data cannot be 
deleted

> Data duplication when insert overwrite 
> ---
>
> Key: HIVE-13746
> URL: https://issues.apache.org/jira/browse/HIVE-13746
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Bill Wailliam
>Priority: Critical
>
> Data duplication when insert overwrite. The old data cannot be deleted





[jira] [Updated] (HIVE-13370) Add test for HIVE-11470

2016-05-12 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-13370:

Status: Open  (was: Patch Available)

> Add test for HIVE-11470
> ---
>
> Key: HIVE-13370
> URL: https://issues.apache.org/jira/browse/HIVE-13370
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-13370.patch
>
>
> HIVE-11470 added the capability to handle NULL dynamic partitioning keys 
> properly. However, it did not add a test for that case. We should have one so 
> we don't have future regressions of the same.
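
A regression test of the kind requested above could assert that a NULL dynamic partition key is routed to a default partition instead of causing a failure. The sketch below is a stand-alone illustration under stated assumptions: `NullPartitionKeySketch` and `partitionPath` are hypothetical names, no Hive or HCatalog classes are used, and the default partition name is assumed to be `__HIVE_DEFAULT_PARTITION__` (the default value of `hive.exec.default.partition.name`).

```java
import java.util.Map;

// Illustrative sketch only: hypothetical helper, not Hive/HCatalog code.
class NullPartitionKeySketch {
    // Assumed default; in Hive this is configurable via hive.exec.default.partition.name.
    static final String DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__";

    /** Builds a partition path such as "ds=2016-05-12/region=us" from ordered key/value pairs. */
    static String partitionPath(Map<String, String> partitionValues) {
        StringBuilder path = new StringBuilder();
        for (Map.Entry<String, String> entry : partitionValues.entrySet()) {
            if (path.length() > 0) {
                path.append('/');
            }
            String value = entry.getValue();
            // A NULL dynamic partition value maps to the default partition name.
            path.append(entry.getKey())
                .append('=')
                .append(value == null ? DEFAULT_PARTITION_NAME : value);
        }
        return path.toString();
    }
}
```

A real test for this issue would presumably exercise the actual write path with a NULL partition column value; the helper only shows the expected outcome such a test would check.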




