[jira] [Commented] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures

2015-06-04 Thread Richard Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573694#comment-14573694
 ] 

Richard Williams commented on HIVE-10410:
-

[~ctang.ma] Specifically, we have found that this issue occurs whenever 
numerous JDBC clients (or rather, numerous clients that set the runAsync flag 
in TExecuteStatementReq to true, as JDBC does) are executing queries against 
HiveServer2 concurrently, as that is what causes multiple threads in the async 
execution thread pool to use their shared MetaStoreClient at once. 

I'll go ahead and regenerate the patch we've been running, based on Hive 
trunk. It's very simple: it just removes the code that sets the Hive 
objects in the pooled threads to the Hive object from the calling thread. As 
for the shared SessionState and HiveConf, those are suspicious as well and 
might be causing other problems; however, since we began patching HiveServer2 
to prevent sharing of the Hive object, this particular issue has 
disappeared for us.
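The thread-confinement idea behind that patch can be sketched as follows. This is an illustrative sketch only; `MetaClient` is a hypothetical stand-in, not Hive's actual metastore client class:

```java
// Sketch of the idea behind the patch: each pooled worker thread lazily
// builds its own thread-confined client instead of inheriting the caller's
// shared instance. MetaClient is an illustrative stand-in, not a Hive class.
public class PerThreadClient {
    static class MetaClient {
        final long ownerThreadId = Thread.currentThread().getId();
    }

    // One client per async-execution thread; no cross-thread sharing.
    private static final ThreadLocal<MetaClient> CLIENT =
            ThreadLocal.withInitial(MetaClient::new);

    public static MetaClient get() {
        return CLIENT.get();
    }
}
```

With this shape, two worker threads never read from the same underlying connection concurrently, which is the failure mode a Hive object shared across the async execution pool makes possible.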

 Apparent race condition in HiveServer2 causing intermittent query failures
 --

 Key: HIVE-10410
 URL: https://issues.apache.org/jira/browse/HIVE-10410
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.1
 Environment: CDH 5.3.3
 CentOS 6.4
Reporter: Richard Williams

 On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC 
 occasionally trigger odd Thrift exceptions with messages such as "Read a 
 negative frame size (-2147418110)!" or "out of sequence response" in 
 HiveServer2's connections to the metastore. For certain metastore calls (for 
 example, showDatabases), these Thrift exceptions are converted to 
 MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient 
 from retrying these calls and thus causes the failure to bubble out to the 
 JDBC client.
 Note that as far as we can tell, this issue appears to only affect queries 
 that are submitted with the runAsync flag on TExecuteStatementReq set to true 
 (which, in practice, seems to mean all JDBC queries), and it appears to only 
 manifest when HiveServer2 is using the new HTTP transport mechanism. When 
 both these conditions hold, we are able to fairly reliably reproduce the 
 issue by spawning about 100 simple, concurrent Hive queries (we have been 
 using "show databases"), two or three of which typically fail. However, when 
 either of these conditions does not hold, we are no longer able to reproduce 
 the issue.
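As a side note (this decoding is our interpretation, not something stated in the report): the odd constant in the first message is itself suggestive. Viewed as unsigned 32-bit hex, -2147418110 is 0x80010002, which matches a Thrift TBinaryProtocol message header (VERSION_1 = 0x80010000 plus a message-type byte) rather than any plausible frame length. In other words, a response header appears to have been read where a frame size was expected, which is consistent with two threads interleaving reads on one shared connection:

```java
// Decode the reported "frame size" as unsigned hex; %X prints the
// two's-complement bit pattern of the int, i.e. the unsigned value.
public class FrameSizeDecode {
    public static void main(String[] args) {
        int reportedFrameSize = -2147418110;
        System.out.printf("0x%08X%n", reportedFrameSize); // prints 0x80010002
    }
}
```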
 Some example stack traces from the HiveServer2 logs:
 {noformat}
 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: 
 org.apache.thrift.transport.TTransportException Read a negative frame size 
 (-2147418110)!
 org.apache.thrift.transport.TTransportException: Read a negative frame size 
 (-2147418110)!
 at 
 org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435)
 at 
 org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414)
 at 
 org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at 
 org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
 at 
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837)
 at 
 org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
 at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
 at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139)
 at 
 org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445)
 at 

[jira] [Updated] (HIVE-10551) OOM when running query_89 with vectorization on hybridgrace=false

2015-06-04 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10551:
--
Assignee: (was: Matt McCline)

 OOM when running query_89 with vectorization on hybridgrace=false
 ---

 Key: HIVE-10551
 URL: https://issues.apache.org/jira/browse/HIVE-10551
 Project: Hive
  Issue Type: Bug
Reporter: Rajesh Balamohan
 Attachments: HIVE-10551-explain-plan.log, hive-10551.png, 
 hive_10551.png


 - TPC-DS Query_89 @ 10 TB scale
 - Trunk version of Hive + Tez 0.7.0-SNAPSHOT
 - Additional settings ( hive.vectorized.groupby.maxentries=1024, 
 tez.runtime.io.sort.factor=200, tez.runtime.io.sort.mb=1800, 
 hive.tez.container.size=4096, hive.mapjoin.hybridgrace.hashtable=false )
 Will attach the profiler snapshot asap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10551) OOM when running query_89 with vectorization on hybridgrace=false

2015-06-04 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10551:
--
Assignee: Matt McCline  (was: Vikram Dixit K)

 OOM when running query_89 with vectorization on hybridgrace=false
 ---

 Key: HIVE-10551
 URL: https://issues.apache.org/jira/browse/HIVE-10551
 Project: Hive
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Matt McCline
 Attachments: HIVE-10551-explain-plan.log, hive-10551.png, 
 hive_10551.png


 - TPC-DS Query_89 @ 10 TB scale
 - Trunk version of Hive + Tez 0.7.0-SNAPSHOT
 - Additional settings ( hive.vectorized.groupby.maxentries=1024, 
 tez.runtime.io.sort.factor=200, tez.runtime.io.sort.mb=1800, 
 hive.tez.container.size=4096, hive.mapjoin.hybridgrace.hashtable=false )
 Will attach the profiler snapshot asap.





[jira] [Commented] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS

2015-06-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573932#comment-14573932
 ] 

Hive QA commented on HIVE-10910:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737742/HIVE-10910.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9002 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4182/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4182/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4182/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737742 - PreCommit-HIVE-TRUNK-Build

 Alter table drop partition queries in encrypted zone failing to remove data 
 from HDFS
 -

 Key: HIVE-10910
 URL: https://issues.apache.org/jira/browse/HIVE-10910
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 1.2.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Eugene Koifman
 Attachments: HIVE-10910.patch


 Alter table query trying to drop a partition removes the partition's metadata 
 but fails to remove the data from HDFS:
 {noformat}
 hive> create table table_1(name string, age int, gpa double) partitioned by 
 (b string) stored as textfile;
 OK
 Time taken: 0.732 seconds
 hive> alter table table_1 add partition (b='2010-10-10');
 OK
 Time taken: 0.496 seconds
 hive> show partitions table_1;
 OK
 b=2010-10-10
 Time taken: 0.781 seconds, Fetched: 1 row(s)
 hive> alter table table_1 drop partition (b='2010-10-10');
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException 
 Failed to move to trash: 
 hdfs://ip-address:8020/warehouse-dir/table_1/b=2010-10-10
 hive> show partitions table_1;
 OK
 Time taken: 0.622 seconds
 {noformat}





[jira] [Commented] (HIVE-10943) Beeline-cli: Enable precommit for beeline-cli branch

2015-06-04 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573962#comment-14573962
 ] 

Ferdinand Xu commented on HIVE-10943:
-

Hi [~xuefuz], is there anything else that needs to be done to enable the precommit?

 Beeline-cli: Enable precommit for beeline-cli branch 
 

 Key: HIVE-10943
 URL: https://issues.apache.org/jira/browse/HIVE-10943
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor

 NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-10872) LLAP: make sure tests pass

2015-06-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573866#comment-14573866
 ] 

Hive QA commented on HIVE-10872:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737685/HIVE-10872.03.patch

{color:red}ERROR:{color} -1 due to 424 failed/errored test(s), 8732 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketizedhiveinputformat_auto
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_left_outer_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_left_outer_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_leftsemi_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_nullsafe_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_varchar_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_nested_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestCompareCliDriver.testCompareCliDriver_llap_0
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.initializationError
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_windowing
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization

[jira] [Updated] (HIVE-10939) Make TestFileDump robust

2015-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10939:

Attachment: (was: HIVE-10939.patch)

 Make TestFileDump robust
 

 Key: HIVE-10939
 URL: https://issues.apache.org/jira/browse/HIVE-10939
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10939.patch


 It fails on Windows OS currently.





[jira] [Updated] (HIVE-10939) Make TestFileDump robust

2015-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10939:

Attachment: HIVE-10939.patch

 Make TestFileDump robust
 

 Key: HIVE-10939
 URL: https://issues.apache.org/jira/browse/HIVE-10939
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10939.patch


 It fails on Windows OS currently.





[jira] [Updated] (HIVE-10941) Provide option to disable spark tests outside itests

2015-06-04 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10941:
-
Attachment: HIVE-10941.1.patch

[~sushanth] Can you please take a look at this patch?

Thanks,
Hari

 Provide option to disable spark tests outside itests
 

 Key: HIVE-10941
 URL: https://issues.apache.org/jira/browse/HIVE-10941
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10941.1.patch


 HIVE-10477 provided an option to disable the Spark module; however, we missed 
 some files that live outside the itests directory, so the option needs to 
 disable the following tests as well:
 {code}
 org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
 org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
 org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
 org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
 org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
 {code}





[jira] [Updated] (HIVE-10941) Provide option to disable spark tests outside itests

2015-06-04 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10941:
-
Description: 
HIVE-10477 provided an option to disable the Spark module; however, we missed 
some files that live outside the itests directory, so the option needs to 
disable the following tests as well:
{code}
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{code}

  was:
HIVE-10477 provided an option to disable spark module, however we missed the 
following files that are outside itests directory. i.e we need to club the 
option with disabling the following tests as well :
{code}
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
The above tests need to be disabled.
{code}


 Provide option to disable spark tests outside itests
 

 Key: HIVE-10941
 URL: https://issues.apache.org/jira/browse/HIVE-10941
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

 HIVE-10477 provided an option to disable the Spark module; however, we missed 
 some files that live outside the itests directory, so the option needs to 
 disable the following tests as well:
 {code}
 org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
 org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
 org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
 org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
 org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
 {code}





[jira] [Commented] (HIVE-10933) Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()

2015-06-04 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573956#comment-14573956
 ] 

Chaoyu Tang commented on HIVE-10933:


I could not reproduce your issue in trunk and believe it has been resolved by 
HIVE-5847 since Hive 0.14. Could you try again in Hive 1.2?

 Hive 0.13 returns precision 0 for varchar(32) from 
 DatabaseMetadata.getColumns()
 

 Key: HIVE-10933
 URL: https://issues.apache.org/jira/browse/HIVE-10933
 Project: Hive
  Issue Type: Bug
  Components: API
Affects Versions: 0.13.0
Reporter: Son Nguyen
Assignee: Chaoyu Tang

 DatabaseMetadata.getColumns() returns COLUMN_SIZE as 0 for a column defined 
 as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns 
 the correct value, 32.
 Here is a code segment that reproduces the issue:
 {code}
 try {
   statement = connection.createStatement();
   statement.execute("drop table if exists son_table");
   statement.execute("create table son_table( col1 varchar(32) )");
   statement.close();
 } catch (Exception e) {
   return;
 }

 // get column info using metadata
 try {
   DatabaseMetaData dmd = null;
   ResultSet resultSet = null;

   dmd = connection.getMetaData();
   resultSet = dmd.getColumns(null, null, "son_table", "col1");

   if (resultSet.next()) {
     String tabName = resultSet.getString("TABLE_NAME");
     String colName = resultSet.getString("COLUMN_NAME");
     String dataType = resultSet.getString("DATA_TYPE");
     String typeName = resultSet.getString("TYPE_NAME");
     int precision = resultSet.getInt("COLUMN_SIZE");

     // output is: colName = col1, dataType = 12, typeName = VARCHAR, precision = 0.
     System.out.format("colName = %s, dataType = %s, typeName = %s, precision = %d.",
         colName, dataType, typeName, precision);
   }
 } catch (Exception e) {
   return;
 }
 {code}





[jira] [Updated] (HIVE-10939) Make TestFileDump robust

2015-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10939:

Attachment: HIVE-10939.patch

[~hsubramaniyan] Can you please review this ?


 Make TestFileDump robust
 

 Key: HIVE-10939
 URL: https://issues.apache.org/jira/browse/HIVE-10939
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10939.patch


 It fails on Windows OS currently.





[jira] [Commented] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS

2015-06-04 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573934#comment-14573934
 ] 

Eugene Koifman commented on HIVE-10910:
---

The failures are not related.

 Alter table drop partition queries in encrypted zone failing to remove data 
 from HDFS
 -

 Key: HIVE-10910
 URL: https://issues.apache.org/jira/browse/HIVE-10910
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 1.2.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Eugene Koifman
 Attachments: HIVE-10910.patch


 Alter table query trying to drop a partition removes the partition's metadata 
 but fails to remove the data from HDFS:
 {noformat}
 hive> create table table_1(name string, age int, gpa double) partitioned by 
 (b string) stored as textfile;
 OK
 Time taken: 0.732 seconds
 hive> alter table table_1 add partition (b='2010-10-10');
 OK
 Time taken: 0.496 seconds
 hive> show partitions table_1;
 OK
 b=2010-10-10
 Time taken: 0.781 seconds, Fetched: 1 row(s)
 hive> alter table table_1 drop partition (b='2010-10-10');
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException 
 Failed to move to trash: 
 hdfs://ip-address:8020/warehouse-dir/table_1/b=2010-10-10
 hive> show partitions table_1;
 OK
 Time taken: 0.622 seconds
 {noformat}





[jira] [Updated] (HIVE-10929) In Tez mode, dynamic partitioning query with union all fails at moveTask, Invalid partition key values

2015-06-04 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10929:
--
Attachment: HIVE-10929.2.patch

 In Tez mode, dynamic partitioning query with union all fails at 
 moveTask, Invalid partition key values
 --

 Key: HIVE-10929
 URL: https://issues.apache.org/jira/browse/HIVE-10929
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10929.1.patch, HIVE-10929.2.patch


 {code}
 create table dummy(i int);
 insert into table dummy values (1);
 select * from dummy;
 create table partunion1(id1 int) partitioned by (part1 string);
 set hive.exec.dynamic.partition.mode=nonstrict;
 set hive.execution.engine=tez;
 explain insert into table partunion1 partition(part1)
 select temps.* from (
 select 1 as id1, '2014' as part1 from dummy 
 union all 
 select 2 as id1, '2014' as part1 from dummy ) temps;
 insert into table partunion1 partition(part1)
 select temps.* from (
 select 1 as id1, '2014' as part1 from dummy 
 union all 
 select 2 as id1, '2014' as part1 from dummy ) temps;
 select * from partunion1;
 {code}
 fails.





[jira] [Updated] (HIVE-6791) Support variable substitution for Beeline shell command

2015-06-04 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-6791:
---
Attachment: HIVE-6791-beeline-cli.patch

Hi [~xuefuz], [~chinnalalam], could you review the patch? 
I have created HIVE-10943 to see whether it breaks other functionality.
Thank you!

 Support variable substitution for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu
 Attachments: HIVE-6791-beeline-cli.patch


 A follow-up task from HIVE-6694. Similar to HIVE-6570.





[jira] [Commented] (HIVE-10939) Make TestFileDump robust

2015-06-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573875#comment-14573875
 ] 

Gunther Hagleitner commented on HIVE-10939:
---

+1

 Make TestFileDump robust
 

 Key: HIVE-10939
 URL: https://issues.apache.org/jira/browse/HIVE-10939
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10939.patch


 It fails on Windows OS currently.





[jira] [Updated] (HIVE-10943) Beeline-cli: Enable precommit for beeline-cli branch

2015-06-04 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10943:

Attachment: HIVE-10943.patch

 Beeline-cli: Enable precommit for beeline-cli branch 
 

 Key: HIVE-10943
 URL: https://issues.apache.org/jira/browse/HIVE-10943
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor
 Attachments: HIVE-10943.patch


 NO PRECOMMIT TESTS





[jira] [Resolved] (HIVE-10935) LLAP: merge master to branch

2015-06-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10935.
-
   Resolution: Fixed
Fix Version/s: llap

Done. Needed this for recent commits and for the test patch.

 LLAP: merge master to branch
 

 Key: HIVE-10935
 URL: https://issues.apache.org/jira/browse/HIVE-10935
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap








[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573687#comment-14573687
 ] 

Sergey Shelukhin commented on HIVE-10761:
-

Hello. Immediately after integrating this, I am getting a non-stop stream of 
NPEs (several per second) in the log when running HS2:
{noformat}
2015-06-04 15:17:13,648 WARN  
[org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor@6fca5907()]: 
common.JvmPauseMonitor (JvmPauseMonitor.java:incrementMetricsCounter(205)) - 
Error Reporting JvmPauseMonitor to Metrics system
java.lang.NullPointerException
at 
org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor.incrementMetricsCounter(JvmPauseMonitor.java:203)
at 
org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:195)
at java.lang.Thread.run(Thread.java:745)
{noformat}
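From the trace alone, this looks like the monitor reporting before its metrics object is initialized (or when metrics are disabled). Below is a minimal sketch of the kind of null guard that would avoid the NPE; all class and method names here are illustrative stand-ins, not Hive's actual API:

```java
// Minimal null-guard sketch for an optional metrics sink. "Metrics" and
// "PauseReporter" are illustrative stand-ins for the real Hive classes.
public class PauseReporter {
    interface Metrics {
        void incrementCounter(String name);
    }

    private final Metrics metrics; // may legitimately be null when disabled

    public PauseReporter(Metrics metrics) {
        this.metrics = metrics;
    }

    public void reportPause(String counterName) {
        // Only report when a metrics sink was actually configured.
        if (metrics != null) {
            metrics.incrementCounter(counterName);
        }
    }
}
```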

 Create codahale-based metrics system for Hive
 -

 Key: HIVE-10761
 URL: https://issues.apache.org/jira/browse/HIVE-10761
 Project: Hive
  Issue Type: New Feature
  Components: Diagnosability
Reporter: Szehon Ho
Assignee: Szehon Ho
 Fix For: 1.3.0

 Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, 
 HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, 
 hms-metrics.json


 There is a current Hive metrics system that hooks up to JMX reporting, but 
 all its measurements and models are custom.
 This is to make another metrics system based on Codahale (i.e. 
 Yammer/Dropwizard), which has the following advantages:
 * Well-defined metric model for frequently-needed metrics (e.g. JVM metrics)
 * Well-defined measurements for all metrics (e.g. max, mean, stddev, 
 mean_rate, etc.)
 * Built-in reporting frameworks like JMX, Console, Log, and a JSON webserver
 It is used by many projects, including several Apache projects such as Oozie. 
 Overall, monitoring tools should find it easier to understand these common 
 metric, measurement, and reporting models.
 The existing metrics subsystem will be kept and can be enabled if backward 
 compatibility is desired.





[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-04 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573620#comment-14573620
 ] 

Thejas M Nair commented on HIVE-10761:
--

[~szehon] If it is committed only to master, the fix version should be 
2.0.0; if it is committed to branch-1 as well, the fix version should also 
be 1.3.0.





[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-04 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573621#comment-14573621
 ] 

Thejas M Nair commented on HIVE-10761:
--

[~szehon] If it is committed only to master, then the fix version should be 
2.0.0; if it's committed to branch-1 as well, the fix version should be 1.3.0 
as well.





[jira] [Commented] (HIVE-10551) OOM when running query_89 with vectorization on hybridgrace=false

2015-06-04 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573716#comment-14573716
 ] 

Vikram Dixit K commented on HIVE-10551:
---

[~mmccline] for your reference.

 OOM when running query_89 with vectorization on  hybridgrace=false
 ---

 Key: HIVE-10551
 URL: https://issues.apache.org/jira/browse/HIVE-10551
 Project: Hive
  Issue Type: Bug
Reporter: Rajesh Balamohan
 Attachments: HIVE-10551-explain-plan.log, hive-10551.png, 
 hive_10551.png


 - TPC-DS Query_89 @ 10 TB scale
 - Trunk version of Hive + Tez 0.7.0-SNAPSHOT
 - Additional settings: hive.vectorized.groupby.maxentries=1024, 
 tez.runtime.io.sort.factor=200, tez.runtime.io.sort.mb=1800, 
 hive.tez.container.size=4096, hive.mapjoin.hybridgrace.hashtable=false
 Will attach the profiler snapshot asap.





[jira] [Updated] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument

2015-06-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10427:
--
Labels: TODOC2.0  (was: TODOC1.3)

 collect_list() and collect_set() should accept struct types as argument
 ---

 Key: HIVE-10427
 URL: https://issues.apache.org/jira/browse/HIVE-10427
 Project: Hive
  Issue Type: Wish
  Components: UDF
Reporter: Alexander Behm
Assignee: Chao Sun
  Labels: TODOC2.0
 Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, 
 HIVE-10427.3.patch, HIVE-10427.4.patch


 The collect_list() and collect_set() functions currently only accept scalar 
 argument types. It would be very useful if these functions could also accept 
 struct argument types for creating nested data from flat data.
 For example, suppose I wanted to create a nested customers/orders table from 
 two flat tables, customers and orders. Then it'd be very convenient to write 
 something like this:
 {code}
 insert into table nested_customers_orders
 select c.*, collect_list(named_struct('oid', o.oid, 'order_date', o.date, ...))
 from customers c inner join orders o on (c.cid = o.oid)
 group by c.cid
 {code}
 Thank you for your consideration.





[jira] [Updated] (HIVE-10936) incorrect result set when hive.vectorized.execution.enabled = true with predicate casting to CHAR or VARCHAR

2015-06-04 Thread N Campbell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

N Campbell updated HIVE-10936:
--
Attachment: GO_TIME_DIM.zip

 incorrect result set when hive.vectorized.execution.enabled = true with 
 predicate casting to CHAR or VARCHAR
 

 Key: HIVE-10936
 URL: https://issues.apache.org/jira/browse/HIVE-10936
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 0.14.0
 Environment: In this case using HDP install of Hive - 0.14.0.2.2.4.2-2
Reporter: N Campbell
 Attachments: GO_TIME_DIM.zip


 The query returns data when hive.vectorized.execution.enabled = false, or 
 when the target of the CAST is STRING rather than CHAR/VARCHAR.
 set hive.vectorized.execution.enabled = true;
 select 
   `GO_TIME_DIM`.`day_key`
 from 
   `gosalesdw1021`.`go_time_dim` `GO_TIME_DIM` 
 where 
   CAST(`GO_TIME_DIM`.`current_year` AS CHAR(4)) = '2010' 
 group by 
   `GO_TIME_DIM`.`day_key`;
 create table GO_TIME_DIM ( DAY_KEY int , DAY_DATE timestamp , MONTH_KEY int , 
 CURRENT_MONTH smallint , MONTH_NUMBER int , QUARTER_KEY int , CURRENT_QUARTER 
 smallint , CURRENT_YEAR smallint , DAY_OF_WEEK smallint , DAY_OF_MONTH 
 smallint , DAYS_IN_MONTH smallint , DAY_OF_YEAR smallint , WEEK_OF_MONTH 
 smallint , WEEK_OF_QUARTER smallint , WEEK_OF_YEAR smallint , MONTH_EN string 
 , WEEKDAY_EN string , MONTH_DE string , WEEKDAY_DE string , MONTH_FR string , 
 WEEKDAY_FR string , MONTH_JA string , WEEKDAY_JA string , MONTH_AR string , 
 WEEKDAY_AR string , MONTH_CS string , WEEKDAY_CS string , MONTH_DA string , 
 WEEKDAY_DA string , MONTH_EL string , WEEKDAY_EL string , MONTH_ES string , 
 WEEKDAY_ES string , MONTH_FI string , WEEKDAY_FI string , MONTH_HR string , 
 WEEKDAY_HR string , MONTH_HU string , WEEKDAY_HU string , MONTH_ID string , 
 WEEKDAY_ID string , MONTH_IT string , WEEKDAY_IT string , MONTH_KK string , 
 WEEKDAY_KK string , MONTH_KO string , WEEKDAY_KO string , MONTH_MS string , 
 WEEKDAY_MS string , MONTH_NL string , WEEKDAY_NL string , MONTH_NO string , 
 WEEKDAY_NO string , MONTH_PL string , WEEKDAY_PL string , MONTH_PT string , 
 WEEKDAY_PT string , MONTH_RO string , WEEKDAY_RO string , MONTH_RU string , 
 WEEKDAY_RU string , MONTH_SC string , WEEKDAY_SC string , MONTH_SL string , 
 WEEKDAY_SL string , MONTH_SV string , WEEKDAY_SV string , MONTH_TC string , 
 WEEKDAY_TC string , MONTH_TH string , WEEKDAY_TH string , MONTH_TR string , 
 WEEKDAY_TR string )
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS TEXTFILE
 LOCATION '../GO_TIME_DIM';
 Then create an ORC equivalent table and load it
 insert overwrite table 
 GO_TIME_DIM
 select * from TEXT.GO_TIME_DIM
 ;





[jira] [Updated] (HIVE-10874) Fail in TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2.q due to duplicate column name

2015-06-04 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10874:
--
Fix Version/s: 1.2.1

 Fail in TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2.q due to 
 duplicate column name
 -

 Key: HIVE-10874
 URL: https://issues.apache.org/jira/browse/HIVE-10874
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.1

 Attachments: HIVE-10874.01.patch, HIVE-10874.patch


 Aggregate operators may derive row types with duplicate column names. The 
 reason is that the column names for grouping sets columns and aggregation 
 columns might be generated automatically, but we do not check whether the 
 column name already exists in the same row.
 This error can be reproduced by 
 TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2.q, which fails 
 with the following trace:
 {code}
 junit.framework.AssertionFailedError: Unexpected exception 
 java.lang.AssertionError: RecordType(BIGINT $f1, BIGINT $f1)
   at org.apache.calcite.rel.core.Project.isValid(Project.java:200)
   at org.apache.calcite.rel.core.Project.init(Project.java:85)
   at org.apache.calcite.rel.core.Project.init(Project.java:91)
   at 
 org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject.init(HiveProject.java:70)
   at 
 org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject.create(HiveProject.java:103)
   at 
 org.apache.hadoop.hive.ql.optimizer.calcite.translator.PlanModifierForASTConv.introduceDerivedTable(PlanModifierForASTConv.java:211)
   at 
 org.apache.hadoop.hive.ql.optimizer.calcite.translator.PlanModifierForASTConv.convertOpTree(PlanModifierForASTConv.java:67)
   at 
 org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:94)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:617)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:248)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10108)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
 ...
 {code}
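The assertion above fails because two generated columns both ended up named $f1. A sketch of the kind of name-uniquification such a check requires (an illustration only, not the fix committed for this issue):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ColumnNamer {
    // Appends a numeric suffix whenever a generated name is already taken,
    // so a derived row type never contains two columns called "$f1".
    static List<String> uniquify(List<String> proposed) {
        Set<String> seen = new HashSet<>();
        List<String> out = new ArrayList<>();
        for (String name : proposed) {
            String candidate = name;
            int suffix = 0;
            while (!seen.add(candidate)) {
                candidate = name + "_" + (++suffix);
            }
            out.add(candidate);
        }
        return out;
    }
}
```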





[jira] [Updated] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures

2015-06-04 Thread Richard Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Williams updated HIVE-10410:

Attachment: HIVE-10410.1.patch

 Apparent race condition in HiveServer2 causing intermittent query failures
 --

 Key: HIVE-10410
 URL: https://issues.apache.org/jira/browse/HIVE-10410
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.1
 Environment: CDH 5.3.3
 CentOS 6.4
Reporter: Richard Williams
 Attachments: HIVE-10410.1.patch


 On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC 
 occasionally trigger odd Thrift exceptions with messages such as "Read a 
 negative frame size (-2147418110)!" or "out of sequence response" in 
 HiveServer2's connections to the metastore. For certain metastore calls (for 
 example, showDatabases), these Thrift exceptions are converted to 
 MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient 
 from retrying these calls and thus causes the failure to bubble out to the 
 JDBC client.
 Note that as far as we can tell, this issue appears to affect only queries 
 that are submitted with the runAsync flag on TExecuteStatementReq set to true 
 (which, in practice, seems to mean all JDBC queries), and it appears to 
 manifest only when HiveServer2 is using the new HTTP transport mechanism. When 
 both of these conditions hold, we can fairly reliably reproduce the issue by 
 spawning about 100 simple, concurrent Hive queries (we have been using "show 
 databases"), two or three of which typically fail. However, when either of 
 these conditions does not hold, we are no longer able to reproduce the issue.
 Some example stack traces from the HiveServer2 logs:
 {noformat}
 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: 
 org.apache.thrift.transport.TTransportException Read a negative frame size 
 (-2147418110)!
 org.apache.thrift.transport.TTransportException: Read a negative frame size 
 (-2147418110)!
 at 
 org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435)
 at 
 org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414)
 at 
 org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at 
 org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
 at 
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837)
 at 
 org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
 at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
 at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139)
 at 
 org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445)
 at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:364)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
 at 
 org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
 at 

[jira] [Commented] (HIVE-9664) Hive add jar command should be able to download and add jars from a repository

2015-06-04 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573744#comment-14573744
 ] 

Anthony Hsu commented on HIVE-9664:
---

Thanks. Looks good.

 Hive add jar command should be able to download and add jars from a 
 repository
 

 Key: HIVE-9664
 URL: https://issues.apache.org/jira/browse/HIVE-9664
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Anant Nag
Assignee: Anant Nag
  Labels: TODOC1.2, hive, patch
 Fix For: 1.2.0

 Attachments: HIVE-9664.4.patch, HIVE-9664.5.patch, HIVE-9664.patch, 
 HIVE-9664.patch, HIVE-9664.patch


 Currently, Hive's add jar command takes a local path to the dependency jar. 
 This clutters the local file-system, as users may forget to remove the jar 
 later.
 It would be nice if Hive supported a Gradle-like notation to download the jar 
 from a repository.
 Example: add jar org:module:version
 It should also be backward compatible and take the jar from the local 
 file-system as well.
 RB: https://reviews.apache.org/r/31628/
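The proposed org:module:version notation is essentially a Maven/Gradle coordinate. A sketch of how the add jar argument could be told apart from a local path and split into its parts (`JarCoordinate` is a hypothetical helper, not code from the attached patch):

```java
class JarCoordinate {
    final String org, module, version;

    private JarCoordinate(String org, String module, String version) {
        this.org = org;
        this.module = module;
        this.version = version;
    }

    // Returns null for anything that is not an org:module:version coordinate,
    // keeping the existing "add jar /tmp/foo.jar" behavior backward compatible.
    static JarCoordinate parse(String arg) {
        String[] parts = arg.split(":");
        if (parts.length != 3 || arg.contains("/")) {
            return null; // treat as a local file-system path
        }
        return new JarCoordinate(parts[0], parts[1], parts[2]);
    }
}
```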





[jira] [Updated] (HIVE-10748) Replace StringBuffer with StringBuilder where possible

2015-06-04 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-10748:
---
Fix Version/s: 2.0.0

 Replace StringBuffer with StringBuilder where possible
 --

 Key: HIVE-10748
 URL: https://issues.apache.org/jira/browse/HIVE-10748
 Project: Hive
  Issue Type: Improvement
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
Priority: Minor
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-10748.1.patch, HIVE-10748.1.patch, 
 HIVE-10748.2.patch


 I found 40 places in Hive where "new StringBuffer(" is used.
 Per the Javadoc, "where possible, it is recommended that StringBuilder be 
 used in preference to StringBuffer as it will be faster under most 
 implementations":
 https://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html
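The replacement is mechanical wherever the buffer never escapes a single thread, which is the common case; a before/after sketch:

```java
class BufferExample {
    // Before: StringBuffer synchronizes every append, which is wasted work
    // when the buffer is confined to one thread.
    static String joinSynchronized(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (String p : parts) sb.append(p);
        return sb.toString();
    }

    // After: StringBuilder offers the same API without the locking overhead.
    static String join(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) sb.append(p);
        return sb.toString();
    }
}
```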





[jira] [Updated] (HIVE-10937) LLAP: make ObjectCache for plans work properly in the daemon

2015-06-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10937:

Fix Version/s: llap

 LLAP: make ObjectCache for plans work properly in the daemon
 

 Key: HIVE-10937
 URL: https://issues.apache.org/jira/browse/HIVE-10937
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap


 There's a perf hit otherwise, esp. when the planner creates 1009 reducers of 
 4 MB each.
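A daemon-wide plan cache amounts to memoizing deserialized plans by key while letting the JVM reclaim them under memory pressure. A minimal sketch using softly-referenced values (an illustration of the idea only; LLAP's actual ObjectCache implementation differs):

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Illustrative plan cache: values are softly referenced so the JVM may
// reclaim cached plans under memory pressure instead of running out of heap.
class PlanCache<T> {
    private final ConcurrentHashMap<String, SoftReference<T>> cache =
            new ConcurrentHashMap<>();

    T retrieve(String key, Supplier<T> loader) {
        SoftReference<T> ref = cache.get(key);
        T value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.get(); // rebuild the plan if absent or collected
            cache.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```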





[jira] [Commented] (HIVE-10907) Hive on Tez: Classcast exception in some cases with SMB joins

2015-06-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573591#comment-14573591
 ] 

Hive QA commented on HIVE-10907:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737668/HIVE-10907.4.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8998 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4179/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4179/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4179/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737668 - PreCommit-HIVE-TRUNK-Build

 Hive on Tez: Classcast exception in some cases with SMB joins
 -

 Key: HIVE-10907
 URL: https://issues.apache.org/jira/browse/HIVE-10907
 Project: Hive
  Issue Type: Bug
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10907.1.patch, HIVE-10907.2.patch, 
 HIVE-10907.3.patch, HIVE-10907.4.patch


 In cases where there is a mix of map-side work and reduce-side work, we get a 
 ClassCastException because we assume homogeneity in the code. We need to fix 
 this correctly; for now, this is a workaround.





[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-04 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573613#comment-14573613
 ] 

Mithun Radhakrishnan commented on HIVE-10761:
-

Hey, Sush, Szehon. I can confirm that Yahoo cares about HS2 metrics. :p 

I'm not familiar with Codahale, but if it works with JMX, that's cool. Lemme do 
some homework. Thanks for the heads-up and the nifty addition, chaps.



[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-04 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573617#comment-14573617
 ] 

Mithun Radhakrishnan commented on HIVE-10761:
-

Question: Are we proposing to deprecate the old metrics system on trunk? In 
which release are we considering deprecation and removal?



[jira] [Updated] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument

2015-06-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10427:
--
Labels: TODOC1.3  (was: )



[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.

2015-06-04 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573630#comment-14573630
 ] 

Alan Gates commented on HIVE-10165:
---

package.html:  
 * this is excellent documentation.  We may want to move much of this into the 
wiki for users.
 * Compactions are done by the metastore server, not HiveServer2.
 * "Currently, when issuing queries on streaming tables, query client must set 
hive.input.format = org.apache.hadoop.hive.ql.io.HiveInputFormat and 
hive.vectorized.execution.enabled = false.  The above client settings are a 
temporary requirement and the intention is to drop the need for them in the 
near future."  I don't believe either of those is true anymore (as of Hive 
0.14).
 
LockImpl:
 * internalAcquire: Why do you recreate the connection to the metastore each 
time through the loop?  This seems expensive.  The same comment applies to 
building the lock request; it shouldn't change as you go through the loop.
 * internalRelease: You've built in handling for releasing locks that are not 
part of transactions.  When do you envision users locking something that isn't 
part of a transaction?  Since this is doing write operations, I would assume 
you'll always have a transaction.

MutatorDestination: This appears to be a simple struct that records data about 
a table; why make it an interface with an impl?

TransactionImpl:
 * Why do commit() and abort() release the locks?  Since these locks are part 
of a transaction they will always be released when the transaction is committed 
or aborted.

MutatorClient:
 * Why is Lock external to this class?  It seems like Lock is a component of 
this class.  Or do you envision users using one Lock object to manage multiple 
MutatorClients? 

MutatorCoordinator:
 * Comments in the class javadoc: it's (origTxnId, bucketId) that controls the 
ordering, not lastTxnId, since origTxnId is immutable.
 * In the constructor, why are you passing in CreatePartitionHelper and 
SequenceValidator when there's only one instance of these?
 * resetMutator: this code closes the Mutator every time you switch Mutators.  
But if I understand correctly, that will write a footer in the ORC file, so 
you're going to end up with a thousand tiny stripes in your files.  That is 
not what you want.  You do need to make sure you don't have too many open at a 
time, to avoid OOMs and too-many-open-file-handles errors.  But rather than 
closing each one every time, you'll need to keep a list of which ones are open 
and close them on an LRU basis (or maybe pick the one with the most records, 
since that will give you the best stripe size) as you need to open more.  
[~owen.omalley] comments?

 CreatePartitionHelper:
 * createPartitionIfNotExists: Why are you running the Driver class here?  Why 
not call IMetaStoreClient.addPartition()?  That would be much lighter weight.

Hive doesn't currently have a deadlock detector ([~ekoifman] is working on 
fixing this as part of HIVE-9675).  As written, this could deadlock with other 
stream writers or with SQL users.  The code will eventually recover, since it 
only tries to lock so many times and then gives up.  I'm not sure there's 
anything to do about this for now, but it should be documented.
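The bounded-open-writers suggestion in the resetMutator comment maps naturally onto an access-ordered LinkedHashMap that evicts and closes the least-recently-used writer. A sketch under that assumption (`OpenWriterPool` is hypothetical; it is not part of the streaming API under review):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Keeps at most maxOpen writers open, closing the least-recently-used one
// instead of closing each writer every time the destination switches.
class OpenWriterPool<W> {
    private final LinkedHashMap<String, W> open;

    OpenWriterPool(int maxOpen, Consumer<W> closer) {
        // accessOrder=true makes iteration order least-recently-used first,
        // so removeEldestEntry sees the LRU writer.
        this.open = new LinkedHashMap<String, W>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, W> eldest) {
                if (size() > maxOpen) {
                    closer.accept(eldest.getValue()); // flush/close before eviction
                    return true;
                }
                return false;
            }
        };
    }

    W get(String partition) { return open.get(partition); }

    void put(String partition, W writer) { open.put(partition, writer); }

    int size() { return open.size(); }
}
```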

 Improve hive-hcatalog-streaming extensibility and support updates and deletes.
 --

 Key: HIVE-10165
 URL: https://issues.apache.org/jira/browse/HIVE-10165
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Elliot West
Assignee: Elliot West
  Labels: streaming_api
 Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, 
 HIVE-10165.5.patch, mutate-system-overview.png


 h3. Overview
 I'd like to extend the 
 [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest]
  API so that it also supports the writing of record updates and deletes in 
 addition to the already supported inserts.
 h3. Motivation
 We have many Hadoop processes outside of Hive that merge changed facts into 
 existing datasets. Traditionally we achieve this by: reading in a 
 ground-truth dataset and a modified dataset, grouping by a key, sorting by a 
 sequence and then applying a function to determine inserted, updated, and 
 deleted rows. However, in our current scheme we must rewrite all partitions 
 that may potentially contain changes. In practice the number of mutated 
 records is very small when compared with the records contained in a 
 partition. This approach results in a number of operational issues:
 * Excessive amount of write activity required for small data changes.
 * Downstream applications cannot robustly read these datasets while they are 
 being updated.
 * Due to scale of 

[jira] [Commented] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument

2015-06-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573788#comment-14573788
 ] 

Lefty Leverenz commented on HIVE-10427:
---

Doc note:  Adding TODOC2.0 label since this was committed to master today.  If 
it is also committed to branch-1, please replace TODOC2.0 with TODOC1.3.

Documentation for collect_list() and collect_set() is in the UDAF section on 
the UDFs page:

* [Built-in Aggregate Functions (UDAF) | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inAggregateFunctions(UDAF)]

 collect_list() and collect_set() should accept struct types as argument
 ---

 Key: HIVE-10427
 URL: https://issues.apache.org/jira/browse/HIVE-10427
 Project: Hive
  Issue Type: Wish
  Components: UDF
Reporter: Alexander Behm
Assignee: Chao Sun
  Labels: TODOC2.0
 Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, 
 HIVE-10427.3.patch, HIVE-10427.4.patch


 The collect_list() and collect_set() functions currently only accept scalar 
 argument types. It would be very useful if these functions could also accept 
 struct argument types for creating nested data from flat data.
 For example, suppose I wanted to create a nested customers/orders table from 
 two flat tables, customers and orders. Then it'd be very convenient to write 
 something like this:
 {code}
 insert into table nested_customers_orders
 select c.*, collect_list(named_struct(oid, o.oid, order_date: o.date...))
 from customers c inner join orders o on (c.cid = o.oid)
 group by c.cid
 {code}
 Thank you for your consideration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS

2015-06-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10910:
--
Attachment: HIVE-10910.patch

[~sushanth] or [~hagleitn] could you review please?

 Alter table drop partition queries in encrypted zone failing to remove data 
 from HDFS
 -

 Key: HIVE-10910
 URL: https://issues.apache.org/jira/browse/HIVE-10910
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 1.2.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Eugene Koifman
 Attachments: HIVE-10910.patch


 Alter table query trying to drop partition removes metadata of partition but 
 fails to remove the data from HDFS
 hive> create table table_1(name string, age int, gpa double) partitioned by 
 (b string) stored as textfile;
 OK
 Time taken: 0.732 seconds
 hive> alter table table_1 add partition (b='2010-10-10');
 OK
 Time taken: 0.496 seconds
 hive> show partitions table_1;
 OK
 b=2010-10-10
 Time taken: 0.781 seconds, Fetched: 1 row(s)
 hive> alter table table_1 drop partition (b='2010-10-10');
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException 
 Failed to move to trash: 
 hdfs://ip-address:8020/warehouse-dir/table_1/b=2010-10-10
 hive> show partitions table_1;
 OK
 Time taken: 0.622 seconds





[jira] [Commented] (HIVE-10872) LLAP: make sure tests pass

2015-06-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573558#comment-14573558
 ] 

Sergey Shelukhin commented on HIVE-10872:
-

Btw, I ran the main (non-itest) tests locally, and they all pass except some 
obscure Avro test failure that is probably just related to running on a Mac.

 LLAP: make sure tests pass
 --

 Key: HIVE-10872
 URL: https://issues.apache.org/jira/browse/HIVE-10872
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10872.01.patch, HIVE-10872.02.patch, 
 HIVE-10872.03.patch, HIVE-10872.patch








[jira] [Updated] (HIVE-10872) LLAP: make sure tests pass

2015-06-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10872:

Attachment: HIVE-10872.03.patch

This should build.

 LLAP: make sure tests pass
 --

 Key: HIVE-10872
 URL: https://issues.apache.org/jira/browse/HIVE-10872
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10872.01.patch, HIVE-10872.02.patch, 
 HIVE-10872.03.patch, HIVE-10872.patch








[jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader

2015-06-04 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573596#comment-14573596
 ] 

Mithun Radhakrishnan commented on HIVE-10754:
-

I see what we're trying to achieve, but I still need help understanding how 
this change fixes the problem. (Sorry. :/) 

Here's the relevant code from {{Job.java}} from Hadoop 2.6.

{code:java|title=Job.java|borderStyle=solid|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE}
  @Deprecated
  public Job(Configuration conf) throws IOException {
this(new JobConf(conf));
  }

  Job(JobConf conf) throws IOException {
super(conf, null);
// propagate existing user credentials to job
this.credentials.mergeAll(this.ugi.getCredentials());
this.cluster = null;
  }

 public static Job getInstance(Configuration conf) throws IOException {
// create with a null Cluster
JobConf jobConf = new JobConf(conf);
return new Job(jobConf);
  }
{code}

# The current implementation of {{HCatLoader.setLocation()}} calls {{new Job( 
Configuration )}}, which clones the {{JobConf}} inline and calls the private 
constructor {{Job(JobConf)}}.
# Your improved implementation of {{HCatLoader.setLocation()}} calls 
{{Job.getInstance()}}. This method clones the {{JobConf}} explicitly, and then 
calls the private constructor {{Job(jobConf)}}.

bq. These two are different (JobConf is not cloned when we call new Job(conf)).
Both of these seem identical in effect to me. :/ There's no way for 
{{HCatLoader.setLocation()}} to call the {{Job(JobConf)}} constructor, because 
it's package-private, right?
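
For illustration, here is a toy stand-in for the semantic question being debated -- whether the caller's configuration is shared with, or copied into, the new job. FakeJob and its fields are invented for this sketch and are not Hadoop's actual API; the point is only what "cloning the JobConf" buys the caller:

```java
import java.util.HashMap;
import java.util.Map;

public class JobCloneSketch {
    // Hypothetical stand-in for Hadoop's Job/JobConf pair. It shows why
    // copying the configuration at construction time isolates the job from
    // later mutations of the caller's conf (the HCatLoader concern).
    static class FakeJob {
        final Map<String, String> conf;
        FakeJob(Map<String, String> conf, boolean clone) {
            this.conf = clone ? new HashMap<>(conf) : conf;
        }
    }

    public static void main(String[] args) {
        Map<String, String> callerConf = new HashMap<>();
        callerConf.put("table", "tbl1");

        FakeJob shared = new FakeJob(callerConf, false); // aliases the map
        FakeJob cloned = new FakeJob(callerConf, true);  // copies the map

        callerConf.put("table", "tbl2"); // caller mutates conf afterwards

        // The shared job observes the mutation; the cloned one does not.
        System.out.println(shared.conf.get("table")); // tbl2
        System.out.println(cloned.conf.get("table")); // tbl1
    }
}
```

If both `new Job(conf)` and `Job.getInstance(conf)` wrap the configuration in a fresh `JobConf`, as the Hadoop 2.6 source quoted above suggests, both correspond to the `clone = true` branch here, which is the crux of the question.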


 Pig+Hcatalog doesn't work properly since we need to clone the Job instance in 
 HCatLoader
 

 Key: HIVE-10754
 URL: https://issues.apache.org/jira/browse/HIVE-10754
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10754.patch


 {noformat}
 Create table tbl1 (key string, value string) stored as rcfile;
 Create table tbl2 (key string, value string);
 insert into tbl1 values( '1', '111');
 insert into tbl2 values('1', '2');
 {noformat}
 Pig script:
 {noformat}
 src_tbl1 = FILTER tbl1 BY (key == '1');
 prj_tbl1 = FOREACH src_tbl1 GENERATE
key as tbl1_key,
value as tbl1_value,
'333' as tbl1_v1;

 src_tbl2 = FILTER tbl2 BY (key == '1');
 prj_tbl2 = FOREACH src_tbl2 GENERATE
key as tbl2_key,
value as tbl2_value;

 dump prj_tbl1;
 dump prj_tbl2;
 result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
 prj_result = FOREACH result 
   GENERATE  prj_tbl1::tbl1_key AS key1,
 prj_tbl1::tbl1_value AS value1,
 prj_tbl1::tbl1_v1 AS v1,
 prj_tbl2::tbl2_key AS key2,
 prj_tbl2::tbl2_value AS value2;

 dump prj_result;
 {noformat}
 The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2).  We 
 need to clone the job instance in HCatLoader.





[jira] [Assigned] (HIVE-10779) LLAP: Daemons should shutdown in case of fatal errors

2015-06-04 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned HIVE-10779:
-

Assignee: Siddharth Seth

 LLAP: Daemons should shutdown in case of fatal errors
 -

 Key: HIVE-10779
 URL: https://issues.apache.org/jira/browse/HIVE-10779
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: HIVE-10779.1.txt


 For example, the scheduler loop exiting. Currently they end up getting stuck 
 - while still accepting new work.





[jira] [Updated] (HIVE-10779) LLAP: Daemons should shutdown in case of fatal errors

2015-06-04 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-10779:
--
Attachment: HIVE-10779.1.txt

The patch adds an UncaughtExceptionHandler and a shutdown hook to stop services.
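
A minimal sketch of that pattern, assuming nothing about the actual patch (the class and field names here are invented for illustration, and a real daemon would call System.exit() from the handler rather than just record the event):

```java
public class DaemonShutdownSketch {
    static volatile boolean fatalSeen = false;

    public static void main(String[] args) throws InterruptedException {
        // Shutdown hook: runs on JVM exit so services can be stopped
        // cleanly (a stand-in for stopping the daemon's services).
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.out.println("stopping services")));

        // Default handler: a thread dying from an uncaught exception is
        // treated as fatal. A real daemon would trigger System.exit() here
        // so the shutdown hook fires; this sketch only records the event.
        Thread.setDefaultUncaughtExceptionHandler((t, e) ->
                fatalSeen = true);

        // Simulate the scheduler loop exiting with an error.
        Thread scheduler = new Thread(() -> {
            throw new IllegalStateException("scheduler loop died");
        }, "scheduler");
        scheduler.start();
        scheduler.join(); // the handler has run once join() returns
        System.out.println("fatal seen: " + fatalSeen);
    }
}
```

Without the handler, the failed thread dies silently and the rest of the daemon keeps accepting work, which matches the stuck behavior described in the issue.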

 LLAP: Daemons should shutdown in case of fatal errors
 -

 Key: HIVE-10779
 URL: https://issues.apache.org/jira/browse/HIVE-10779
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
 Attachments: HIVE-10779.1.txt


 For example, the scheduler loop exiting. Currently they end up getting stuck 
 - while still accepting new work.





[jira] [Commented] (HIVE-10934) Restore support for DROP PARTITION PURGE

2015-06-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573730#comment-14573730
 ] 

Hive QA commented on HIVE-10934:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737678/HIVE-10934.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9001 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4180/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4180/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4180/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737678 - PreCommit-HIVE-TRUNK-Build

 Restore support for DROP PARTITION PURGE
 

 Key: HIVE-10934
 URL: https://issues.apache.org/jira/browse/HIVE-10934
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10934.patch


 HIVE-9086 added support for PURGE in 
 {noformat}
 ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = 'sayonara') 
 IGNORE PROTECTION PURGE;
 {noformat}
 Looks like this was accidentally lost in HIVE-10228.





[jira] [Commented] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument

2015-06-04 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573812#comment-14573812
 ] 

Chao Sun commented on HIVE-10427:
-

Yes, it should also work for branch-1. I'll commit it to that branch later and 
update the tag.

 collect_list() and collect_set() should accept struct types as argument
 ---

 Key: HIVE-10427
 URL: https://issues.apache.org/jira/browse/HIVE-10427
 Project: Hive
  Issue Type: Wish
  Components: UDF
Reporter: Alexander Behm
Assignee: Chao Sun
  Labels: TODOC2.0
 Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, 
 HIVE-10427.3.patch, HIVE-10427.4.patch


 The collect_list() and collect_set() functions currently only accept scalar 
 argument types. It would be very useful if these functions could also accept 
 struct argument types for creating nested data from flat data.
 For example, suppose I wanted to create a nested customers/orders table from 
 two flat tables, customers and orders. Then it'd be very convenient to write 
 something like this:
 {code}
 insert into table nested_customers_orders
 select c.*, collect_list(named_struct(oid, o.oid, order_date: o.date...))
 from customers c inner join orders o on (c.cid = o.oid)
 group by c.cid
 {code}
 Thank you for your consideration.





[jira] [Commented] (HIVE-10934) Restore support for DROP PARTITION PURGE

2015-06-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573665#comment-14573665
 ] 

Gunther Hagleitner commented on HIVE-10934:
---

+1

 Restore support for DROP PARTITION PURGE
 

 Key: HIVE-10934
 URL: https://issues.apache.org/jira/browse/HIVE-10934
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10934.patch


 HIVE-9086 added support for PURGE in 
 {noformat}
 ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = 'sayonara') 
 IGNORE PROTECTION PURGE;
 {noformat}
 Looks like this was accidentally lost in HIVE-10228.





[jira] [Resolved] (HIVE-10779) LLAP: Daemons should shutdown in case of fatal errors

2015-06-04 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-10779.
---
   Resolution: Fixed
Fix Version/s: llap

Committed to the llap branch.

 LLAP: Daemons should shutdown in case of fatal errors
 -

 Key: HIVE-10779
 URL: https://issues.apache.org/jira/browse/HIVE-10779
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Fix For: llap

 Attachments: HIVE-10779.1.txt


 For example, the scheduler loop exiting. Currently they end up getting stuck 
 - while still accepting new work.





[jira] [Commented] (HIVE-10934) Restore support for DROP PARTITION PURGE

2015-06-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573757#comment-14573757
 ] 

Gunther Hagleitner commented on HIVE-10934:
---

Test failures are unrelated.

 Restore support for DROP PARTITION PURGE
 

 Key: HIVE-10934
 URL: https://issues.apache.org/jira/browse/HIVE-10934
 Project: Hive
  Issue Type: Bug
  Components: Parser
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10934.patch


 HIVE-9086 added support for PURGE in 
 {noformat}
 ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = 'sayonara') 
 IGNORE PROTECTION PURGE;
 {noformat}
 Looks like this was accidentally lost in HIVE-10228.





[jira] [Updated] (HIVE-10551) OOM when running query_89 with vectorization on hybridgrace=false

2015-06-04 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10551:
--
Assignee: Matt McCline

 OOM when running query_89 with vectorization on  hybridgrace=false
 ---

 Key: HIVE-10551
 URL: https://issues.apache.org/jira/browse/HIVE-10551
 Project: Hive
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Matt McCline
 Attachments: HIVE-10551-explain-plan.log, hive-10551.png, 
 hive_10551.png


 - TPC-DS Query_89 @ 10 TB scale
 - Trunk version of Hive + Tez 0.7.0-SNAPSHOT
 - Additional settings: hive.vectorized.groupby.maxentries=1024, 
 tez.runtime.io.sort.factor=200, tez.runtime.io.sort.mb=1800, 
 hive.tez.container.size=4096, hive.mapjoin.hybridgrace.hashtable=false
 Will attach the profiler snapshot ASAP.





[jira] [Updated] (HIVE-7292) Hive on Spark

2015-06-04 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-7292:
-
Assignee: Xuefu Zhang  (was: dutianmin)

 Hive on Spark
 -

 Key: HIVE-7292
 URL: https://issues.apache.org/jira/browse/HIVE-7292
 Project: Hive
  Issue Type: Improvement
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
  Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5
 Attachments: Hive-on-Spark.pdf


 Spark as an open-source data analytics cluster computing framework has gained 
 significant momentum recently. Many Hive users already have Spark installed 
 as their computing backbone. To take advantage of Hive, they still need to 
 have either MapReduce or Tez on their cluster. This initiative will provide 
 users a new alternative so that they can consolidate their backends. 
 Secondly, providing such an alternative further increases Hive's adoption, as 
 it exposes Spark users to a viable, feature-rich, de facto standard SQL tool 
 on Hadoop.
 Finally, allowing Hive to run on Spark also has performance benefits. Hive 
 queries, especially those involving multiple reducer stages, will run faster, 
 thus improving the user experience, as with Tez.
 This is an umbrella JIRA which will cover many coming subtasks. The design doc 
 will be attached here shortly, and will be on the wiki as well. Feedback from 
 the community is greatly appreciated!





[jira] [Commented] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728

2015-06-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573177#comment-14573177
 ] 

Ashutosh Chauhan commented on HIVE-10932:
-

+1

 Unit test udf_nondeterministic failure due to HIVE-10728
 

 Key: HIVE-10932
 URL: https://issues.apache.org/jira/browse/HIVE-10932
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.3.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10932.patch


 The test udf_nondeterministic.q failed due to the change in HIVE-10728, in 
 which unix_timestamp() is now marked as deterministic.





[jira] [Updated] (HIVE-10736) HiveServer2 shutdown of cached tez app-masters is not clean

2015-06-04 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10736:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: HIVE-7926)

 HiveServer2 shutdown of cached tez app-masters is not clean
 ---

 Key: HIVE-10736
 URL: https://issues.apache.org/jira/browse/HIVE-10736
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Gopal V
Assignee: Vikram Dixit K
 Attachments: HIVE-10736.1.patch, HIVE-10736.2.patch


 The shutdown process throws concurrent modification exceptions and fails to 
 clean up the app masters per queue.
 {code}
 2015-05-17 20:24:00,464 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:OperationManager is stopped.
 2015-05-17 20:24:00,464 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:SessionManager is stopped.
 2015-05-17 20:24:00,464 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:CLIService is stopped.
 2015-05-17 20:24:00,465 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:HiveServer2 is stopped.
 2015-05-17 20:24:00,465 INFO  [Thread-6()]: tez.TezSessionState 
 (TezSessionState.java:close(332)) - Closing Tez Session
 2015-05-17 20:24:00,466 INFO  [Thread-6()]: client.TezClient 
 (TezClient.java:stop(495)) - Shutting down Tez Session, 
 sessionName=HIVE-94cc629d-63bc-490a-a135-af85c0cc0f2e, 
 applicationId=application_1431919257083_0012
 2015-05-17 20:24:00,570 ERROR [Thread-6()]: server.HiveServer2 
 (HiveServer2.java:stop(322)) - Tez session pool manager stop had an error 
 during stop of HiveServer2. Shutting down HiveServer2 anyway.
 java.util.ConcurrentModificationException
 at 
 java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
 at java.util.LinkedList$ListItr.next(LinkedList.java:888)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:187)
 at 
 org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:320)
 at 
 org.apache.hive.service.server.HiveServer2$1.run(HiveServer2.java:107)
 {code}
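
The stack trace above points at iterating the pool's session list while sessions are being closed. A self-contained sketch (with hypothetical names, not Hive's actual classes) reproducing the comodification failure, and the iterator-based removal that avoids it:

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class PoolShutdownSketch {
    // Hypothetical stand-in for the pool's LinkedList of open sessions.
    static final List<String> sessions = new LinkedList<>();
    static boolean cmeThrown = false;

    public static void main(String[] args) {
        sessions.add("default-1");
        sessions.add("default-2");
        sessions.add("default-3");

        // Unsafe: removing elements from the LinkedList while a for-each
        // iterator is live fails the comodification check on the next next().
        try {
            for (String s : sessions) {
                sessions.remove(s); // structural modification behind the iterator
            }
        } catch (ConcurrentModificationException e) {
            cmeThrown = true;
        }

        // Safe: remove through the iterator itself, so the iterator's
        // bookkeeping stays consistent with the list.
        for (Iterator<String> it = sessions.iterator(); it.hasNext(); ) {
            it.next();
            it.remove();
        }
        System.out.println("CME thrown: " + cmeThrown
                + ", remaining: " + sessions.size());
    }
}
```

Copying the list before iterating, or using a concurrent collection, are the other common ways out when the close() call itself must remove entries.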





[jira] [Commented] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values

2015-06-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573267#comment-14573267
 ] 

Gunther Hagleitner commented on HIVE-10929:
---

+1, patch looks good. Test failures udaf_histogram_numeric, 
ql_rewrite_gbtoidx_cbo_2, and udf_nondeterministic are unrelated. The other 
differences look like they might actually be correct ... can you validate 
what the stats should be here?

 In Tez mode,dynamic partitioning query with union all fails at 
 moveTask,Invalid partition key  values
 --

 Key: HIVE-10929
 URL: https://issues.apache.org/jira/browse/HIVE-10929
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10929.1.patch


 {code}
 create table dummy(i int);
 insert into table dummy values (1);
 select * from dummy;
 create table partunion1(id1 int) partitioned by (part1 string);
 set hive.exec.dynamic.partition.mode=nonstrict;
 set hive.execution.engine=tez;
 explain insert into table partunion1 partition(part1)
 select temps.* from (
 select 1 as id1, '2014' as part1 from dummy 
 union all 
 select 2 as id1, '2014' as part1 from dummy ) temps;
 insert into table partunion1 partition(part1)
 select temps.* from (
 select 1 as id1, '2014' as part1 from dummy 
 union all 
 select 2 as id1, '2014' as part1 from dummy ) temps;
 select * from partunion1;
 {code}
 fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak

2015-06-04 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573347#comment-14573347
 ] 

Vaibhav Gumashta commented on HIVE-10925:
-

Test failures are unrelated.

 Non-static threadlocals in metastore code can potentially cause memory leak
 ---

 Key: HIVE-10925
 URL: https://issues.apache.org/jira/browse/HIVE-10925
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Attachments: HIVE-10925.1.patch


 There are many places where non-static threadlocals are used. I can't seem to 
 find a good logic for using them. However, they can potentially result in 
 leaking objects if for example they are created in a long running thread 
 every time the thread handles a new session.
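
A sketch of the leak pattern described above, with invented class names (not the metastore's actual code): each non-static ThreadLocal instance adds its own entry to the long-lived handler thread's ThreadLocalMap, so per-session instances accumulate for the life of the thread.

```java
public class ThreadLocalLeakSketch {
    // Anti-pattern: a ThreadLocal field on a per-session object. Every new
    // session creates a fresh ThreadLocal instance, and each one adds a
    // distinct entry (plus its value) to the worker thread's ThreadLocalMap.
    static class SessionHandler {
        private final ThreadLocal<byte[]> buffer =
                ThreadLocal.withInitial(() -> new byte[1024]);
        int handle() {
            return buffer.get().length; // populates this thread's map
        }
    }

    // Fix: one static ThreadLocal shared by all sessions -- a long-lived
    // worker thread holds at most one entry, and remove() clears it.
    static final ThreadLocal<byte[]> SHARED =
            ThreadLocal.withInitial(() -> new byte[1024]);

    public static void main(String[] args) {
        // One long-running thread handling many sessions: with the
        // anti-pattern, 1000 stale ThreadLocalMap entries are left behind.
        for (int i = 0; i < 1000; i++) {
            new SessionHandler().handle();
        }
        SHARED.get();    // a single entry regardless of session count
        SHARED.remove(); // explicit cleanup when the session ends
        System.out.println("handled 1000 sessions");
    }
}
```

Making the ThreadLocal static (and calling remove() when a session ends) bounds the map at one entry per thread, which matches the fix direction the issue describes.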





[jira] [Commented] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument

2015-06-04 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573363#comment-14573363
 ] 

Alexander Pivovarov commented on HIVE-10427:


+1

 collect_list() and collect_set() should accept struct types as argument
 ---

 Key: HIVE-10427
 URL: https://issues.apache.org/jira/browse/HIVE-10427
 Project: Hive
  Issue Type: Wish
  Components: UDF
Reporter: Alexander Behm
Assignee: Chao Sun
 Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, 
 HIVE-10427.3.patch, HIVE-10427.4.patch


 The collect_list() and collect_set() functions currently only accept scalar 
 argument types. It would be very useful if these functions could also accept 
 struct argument types for creating nested data from flat data.
 For example, suppose I wanted to create a nested customers/orders table from 
 two flat tables, customers and orders. Then it'd be very convenient to write 
 something like this:
 {code}
 insert into table nested_customers_orders
 select c.*, collect_list(named_struct(oid, o.oid, order_date: o.date...))
 from customers c inner join orders o on (c.cid = o.oid)
 group by c.cid
 {code}
 Thank you for your consideration.





[jira] [Commented] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728

2015-06-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573238#comment-14573238
 ] 

Hive QA commented on HIVE-10932:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737565/HIVE-10932.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8998 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4177/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4177/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4177/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737565 - PreCommit-HIVE-TRUNK-Build

 Unit test udf_nondeterministic failure due to HIVE-10728
 

 Key: HIVE-10932
 URL: https://issues.apache.org/jira/browse/HIVE-10932
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.3.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10932.patch


 The test udf_nondeterministic.q failed due to the change in HIVE-10728, in 
 which unix_timestamp() is now marked as deterministic.





[jira] [Updated] (HIVE-10736) HiveServer2 shutdown of cached tez app-masters is not clean

2015-06-04 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10736:
--
Summary: HiveServer2 shutdown of cached tez app-masters is not clean  (was: 
LLAP: HiveServer2 shutdown of cached tez app-masters is not clean)

 HiveServer2 shutdown of cached tez app-masters is not clean
 ---

 Key: HIVE-10736
 URL: https://issues.apache.org/jira/browse/HIVE-10736
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Reporter: Gopal V
Assignee: Vikram Dixit K
 Attachments: HIVE-10736.1.patch, HIVE-10736.2.patch


 The shutdown process throws concurrent modification exceptions and fails to 
 clean up the app masters per queue.
 {code}
 2015-05-17 20:24:00,464 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:OperationManager is stopped.
 2015-05-17 20:24:00,464 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:SessionManager is stopped.
 2015-05-17 20:24:00,464 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:CLIService is stopped.
 2015-05-17 20:24:00,465 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:HiveServer2 is stopped.
 2015-05-17 20:24:00,465 INFO  [Thread-6()]: tez.TezSessionState 
 (TezSessionState.java:close(332)) - Closing Tez Session
 2015-05-17 20:24:00,466 INFO  [Thread-6()]: client.TezClient 
 (TezClient.java:stop(495)) - Shutting down Tez Session, 
 sessionName=HIVE-94cc629d-63bc-490a-a135-af85c0cc0f2e, 
 applicationId=application_1431919257083_0012
 2015-05-17 20:24:00,570 ERROR [Thread-6()]: server.HiveServer2 
 (HiveServer2.java:stop(322)) - Tez session pool manager stop had an error 
 during stop of HiveServer2. Shutting down HiveServer2 anyway.
 java.util.ConcurrentModificationException
 at 
 java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
 at java.util.LinkedList$ListItr.next(LinkedList.java:888)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:187)
 at 
 org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:320)
 at 
 org.apache.hive.service.server.HiveServer2$1.run(HiveServer2.java:107)
 {code}





[jira] [Updated] (HIVE-10869) fold_case.q failing on trunk

2015-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10869:

Fix Version/s: 1.2.1

 fold_case.q failing on trunk
 

 Key: HIVE-10869
 URL: https://issues.apache.org/jira/browse/HIVE-10869
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 1.3.0, 1.2.1

 Attachments: HIVE-10869.patch


 Race condition of commits between HIVE-10716 & HIVE-10812



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10869) fold_case.q failing on trunk

2015-06-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10869:

Affects Version/s: (was: 1.3.0)
   1.2.1

 fold_case.q failing on trunk
 

 Key: HIVE-10869
 URL: https://issues.apache.org/jira/browse/HIVE-10869
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 1.3.0, 1.2.1

 Attachments: HIVE-10869.patch


 Race condition of commits between HIVE-10716 & HIVE-10812



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10920) LLAP: elevator reads some useless data even if all RGs are eliminated by SARG

2015-06-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10920.
-
Resolution: Fixed

committed to branch... 
{noformat}
TezTaskRunner_attempt_1431919257083_3541_2_00_000803_0(attempt_1431919257083_3541_2_00_000803_0)] INFO org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl: Fragment counters for [cn041-10.l42scl.hortonworks.com/172.19.128.41, tpch_orc_snappy_1000.lineitem, 7895722, 3,2]: [ NUM_VECTOR_BATCHES=0, NUM_DECODED_BATCHES=0, SELECTED_ROWGROUPS=0, NUM_ERRORS=0, ROWS_EMITTED=0, METADATA_CACHE_HIT=3, METADATA_CACHE_MISS=0, CACHE_HIT_BYTES=0, CACHE_MISS_BYTES=0, ALLOCATED_BYTES=0, ALLOCATED_USED_BYTES=0, TOTAL_IO_TIME_US=284922, DECODE_TIME_US=0, HDFS_TIME_US=0, CONSUMER_TIME_US=934 ]
{noformat}

 LLAP: elevator reads some useless data even if all RGs are eliminated by SARG
 -

 Key: HIVE-10920
 URL: https://issues.apache.org/jira/browse/HIVE-10920
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-10920) LLAP: elevator reads some useless data even if all RGs are eliminated by SARG

2015-06-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573336#comment-14573336
 ] 

Sergey Shelukhin edited comment on HIVE-10920 at 6/4/15 6:31 PM:
-

committed to branch... 
{noformat}
TezTaskRunner_attempt_1431919257083_3541_2_00_000803_0(attempt_1431919257083_3541_2_00_000803_0)] INFO org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl: Fragment counters for [snip, tpch_orc_snappy_1000.lineitem, 7895722, 3,2]: [ NUM_VECTOR_BATCHES=0, NUM_DECODED_BATCHES=0, SELECTED_ROWGROUPS=0, NUM_ERRORS=0, ROWS_EMITTED=0, METADATA_CACHE_HIT=3, METADATA_CACHE_MISS=0, CACHE_HIT_BYTES=0, CACHE_MISS_BYTES=0, ALLOCATED_BYTES=0, ALLOCATED_USED_BYTES=0, TOTAL_IO_TIME_US=284922, DECODE_TIME_US=0, HDFS_TIME_US=0, CONSUMER_TIME_US=934 ]
{noformat}


was (Author: sershe):
committed to branch... 
{noformat}
TezTaskRunner_attempt_1431919257083_3541_2_00_000803_0(attempt_1431919257083_3541_2_00_000803_0)] INFO org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl: Fragment counters for [cn041-10.l42scl.hortonworks.com/172.19.128.41, tpch_orc_snappy_1000.lineitem, 7895722, 3,2]: [ NUM_VECTOR_BATCHES=0, NUM_DECODED_BATCHES=0, SELECTED_ROWGROUPS=0, NUM_ERRORS=0, ROWS_EMITTED=0, METADATA_CACHE_HIT=3, METADATA_CACHE_MISS=0, CACHE_HIT_BYTES=0, CACHE_MISS_BYTES=0, ALLOCATED_BYTES=0, ALLOCATED_USED_BYTES=0, TOTAL_IO_TIME_US=284922, DECODE_TIME_US=0, HDFS_TIME_US=0, CONSUMER_TIME_US=934 ]
{noformat}

 LLAP: elevator reads some useless data even if all RGs are eliminated by SARG
 -

 Key: HIVE-10920
 URL: https://issues.apache.org/jira/browse/HIVE-10920
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10907) Hive on Tez: Classcast exception in some cases with SMB joins

2015-06-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573367#comment-14573367
 ] 

Gunther Hagleitner commented on HIVE-10907:
---

I think the check is too restrictive? (i.e. all sides need to have same size of 
rs) - the commented out code looks better :-)

 Hive on Tez: Classcast exception in some cases with SMB joins
 -

 Key: HIVE-10907
 URL: https://issues.apache.org/jira/browse/HIVE-10907
 Project: Hive
  Issue Type: Bug
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10907.1.patch, HIVE-10907.2.patch, 
 HIVE-10907.3.patch


 In cases where there is a mix of Map side work and reduce side work, we get a 
 classcast exception because we assume homogeneity in the code. We need to fix 
 this correctly. For now this is a workaround.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10736) HiveServer2 shutdown of cached tez app-masters is not clean

2015-06-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573274#comment-14573274
 ] 

Gunther Hagleitner commented on HIVE-10736:
---

+1 branch-1.2 as well?

 HiveServer2 shutdown of cached tez app-masters is not clean
 ---

 Key: HIVE-10736
 URL: https://issues.apache.org/jira/browse/HIVE-10736
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Gopal V
Assignee: Vikram Dixit K
 Attachments: HIVE-10736.1.patch, HIVE-10736.2.patch


 The shutdown process throws concurrent modification exceptions and fails to 
 clean up the app masters per queue.
 {code}
 2015-05-17 20:24:00,464 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:OperationManager is stopped.
 2015-05-17 20:24:00,464 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:SessionManager is stopped.
 2015-05-17 20:24:00,464 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-9()]: tez.TezSessionPoolManager 
 (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
 2015-05-17 20:24:00,465 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:CLIService is stopped.
 2015-05-17 20:24:00,465 INFO  [Thread-6()]: service.AbstractService 
 (AbstractService.java:stop(125)) - Service:HiveServer2 is stopped.
 2015-05-17 20:24:00,465 INFO  [Thread-6()]: tez.TezSessionState 
 (TezSessionState.java:close(332)) - Closing Tez Session
 2015-05-17 20:24:00,466 INFO  [Thread-6()]: client.TezClient 
 (TezClient.java:stop(495)) - Shutting down Tez Session, 
 sessionName=HIVE-94cc629d-63bc-490a-a135-af85c0cc0f2e, 
 applicationId=application_1431919257083_0012
 2015-05-17 20:24:00,570 ERROR [Thread-6()]: server.HiveServer2 
 (HiveServer2.java:stop(322)) - Tez session pool manager stop had an error 
 during stop of HiveServer2. Shutting down HiveServer2 anyway.
 java.util.ConcurrentModificationException
 at 
 java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
 at java.util.LinkedList$ListItr.next(LinkedList.java:888)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:187)
 at 
 org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:320)
 at 
 org.apache.hive.service.server.HiveServer2$1.run(HiveServer2.java:107)
 {code}
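The stack trace above points at {{TezSessionPoolManager.stop()}} iterating its session {{LinkedList}} while {{close()}} removes entries from the same list. A minimal sketch of that failure mode and the usual snapshot-iteration fix — the {{PoolStopDemo}} names are hypothetical, not the actual Hive code:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;

public class PoolStopDemo {
    // Buggy pattern: removing from the list that is currently being iterated.
    static boolean stopBuggy(List<String> sessions) {
        try {
            for (String s : sessions) {
                sessions.remove(s); // "close()" removes from the same list
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false; // shutdown dies mid-loop; remaining sessions never closed
        }
    }

    // Fix: iterate over a snapshot, then clear the live list.
    static boolean stopSafe(List<String> sessions) {
        for (String s : new ArrayList<>(sessions)) {
            // close s here
        }
        sessions.clear();
        return true;
    }

    public static void main(String[] args) {
        List<String> pool = new LinkedList<>(List.of("s1", "s2", "s3"));
        System.out.println(stopBuggy(pool)); // false: CME thrown on the second next()
        pool = new LinkedList<>(List.of("s1", "s2", "s3"));
        System.out.println(stopSafe(pool));  // true: all sessions closed, list emptied
    }
}
```

Iterating over a copy leaves the live list's modCount untouched during traversal, so the shutdown loop cannot die halfway through the pool.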



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive

2015-06-04 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573309#comment-14573309
 ] 

Sushanth Sowmyan commented on HIVE-10761:
-

This looks very useful, thanks!

Also, I would suggest deprecating the current metrics system targeting removal 
in a couple of releases. I don't think it has been used much outside of Yahoo - 
[~mithun] can clarify if they still care about it.

 Create codahale-based metrics system for Hive
 -

 Key: HIVE-10761
 URL: https://issues.apache.org/jira/browse/HIVE-10761
 Project: Hive
  Issue Type: New Feature
  Components: Diagnosability
Reporter: Szehon Ho
Assignee: Szehon Ho
 Fix For: 1.3.0

 Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, 
 HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, 
 hms-metrics.json


 There is a current Hive metrics system that hooks up to JMX reporting, but 
 all its measurements and models are custom.
 This is to make another metrics system based on Codahale (i.e. 
 yammer, dropwizard), which has the following advantages:
 * Well-defined metric model for frequently-needed metrics (i.e. JVM metrics)
 * Well-defined measurements for all metrics (i.e. max, mean, stddev, mean_rate, etc.)
 * Built-in reporting frameworks like JMX, Console, Log, JSON webserver
 It is used by many projects, including several Apache projects like Oozie.
 Overall, monitoring tools should find it easier to understand these common 
 metric, measurement, and reporting models.
 The existing metric subsystem will be kept and can be enabled if backward 
 compatibility is desired.
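For illustration, a stdlib-only sketch of the "standard measurements per metric" idea the description refers to — this is not the Codahale API itself (there it would be {{MetricRegistry}}, {{Timer}}, and friends from {{com.codahale.metrics}}); the {{MiniMetrics}} names are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MiniMetrics {
    // Every metric carries the same well-defined measurements (count, mean, max),
    // instead of each subsystem inventing its own ad-hoc model.
    static final class Meter {
        private long count, total, max;

        synchronized void update(long value) {
            count++;
            total += value;
            if (value > max) max = value;
        }

        synchronized long count() { return count; }
        synchronized long max() { return max; }
        synchronized double mean() { return count == 0 ? 0.0 : (double) total / count; }
    }

    private final Map<String, Meter> registry = new ConcurrentHashMap<>();

    // Named lookup, creating the meter on first use (like a metrics registry).
    Meter meter(String name) {
        return registry.computeIfAbsent(name, k -> new Meter());
    }

    public static void main(String[] args) {
        MiniMetrics m = new MiniMetrics();
        Meter q = m.meter("query.latency.ms");
        q.update(120); q.update(80); q.update(100);
        System.out.println(q.count() + " " + q.mean() + " " + q.max()); // 3 100.0 120
    }
}
```

A reporter (JMX, console, JSON) then only needs to understand one measurement model for every metric in the registry, which is the point of the proposal.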



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10914) LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build

2015-06-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10914.
-
Resolution: Fixed

the 2nd part committed to branch... hadoop-1 build now works for me even with 
clean maven repo

 LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build
 -

 Key: HIVE-10914
 URL: https://issues.apache.org/jira/browse/HIVE-10914
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap


 LLAP won't ever work with hadoop 1, so no point in building it



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak

2015-06-04 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573361#comment-14573361
 ] 

Vaibhav Gumashta commented on HIVE-10925:
-

cc [~ekoifman] [~alangates] 

I'm making the transaction handler a static threadlocal in this patch. Can you 
review that change?

 Non-static threadlocals in metastore code can potentially cause memory leak
 ---

 Key: HIVE-10925
 URL: https://issues.apache.org/jira/browse/HIVE-10925
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Attachments: HIVE-10925.1.patch


 There are many places where non-static threadlocals are used. I can't see a 
 good reason for using them. However, they can potentially leak objects if, 
 for example, they are created in a long-running thread every time the thread 
 handles a new session.
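A sketch of why the non-static form leaks in pooled threads: each handler instance owns its own ThreadLocal key, so a long-lived worker thread accumulates one ThreadLocalMap entry per instance, while a static key caps it at one value per thread. The {{LeakyHandler}}/{{SafeHandler}} names are hypothetical stand-ins, not the metastore code:

```java
public class ThreadLocalLeakDemo {
    // Leaky pattern: every handler instance carries its own ThreadLocal key, so a
    // pooled thread that services many sessions piles up one stale entry per instance.
    static class LeakyHandler {
        private final ThreadLocal<byte[]> conn = ThreadLocal.withInitial(() -> new byte[1 << 10]);
        byte[] get() { return conn.get(); }
    }

    // Fixed pattern: one static key, hence at most one value per thread,
    // no matter how many handler instances come and go.
    static class SafeHandler {
        private static final ThreadLocal<byte[]> conn = ThreadLocal.withInitial(() -> new byte[1 << 10]);
        byte[] get() { return conn.get(); }
    }

    public static void main(String[] args) {
        // Simulate a long-lived worker thread creating a new handler per session:
        // each iteration leaves a distinct entry in this thread's ThreadLocalMap.
        for (int i = 0; i < 1000; i++) {
            new LeakyHandler().get();
        }
        byte[] a = new SafeHandler().get();
        byte[] b = new SafeHandler().get();
        System.out.println(a == b); // true: static key shares one value per thread
    }
}
```

The stale entries in the leaky variant are only reclaimed opportunistically after the keys are garbage collected, which is why the values can pile up for the lifetime of a pooled thread.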



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader

2015-06-04 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573232#comment-14573232
 ] 

Mithun Radhakrishnan commented on HIVE-10754:
-

Hello, Aihua. I'm all for switching from the deprecated {{Job}} constructor to 
using {{Job.getInstance()}}.

But I am unable to understand how this changes/fixes anything. Both {{new 
Job(Configuration)}} and {{Job.getInstance(Configuration)}} seem to eventually 
use the package-private {{Job(JobConf)}} constructor, and no later references 
to {{clone}} or {{job}} in {{HCatLoader.setLocation()}} have been modified.

Could you please explain your intention?

 Pig+Hcatalog doesn't work properly since we need to clone the Job instance in 
 HCatLoader
 

 Key: HIVE-10754
 URL: https://issues.apache.org/jira/browse/HIVE-10754
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10754.patch


 {noformat}
 Create table tbl1 (key string, value string) stored as rcfile;
 Create table tbl2 (key string, value string);
 insert into tbl1 values( '1', '111');
 insert into tbl2 values('1', '2');
 {noformat}
 Pig script:
 {noformat}
 src_tbl1 = FILTER tbl1 BY (key == '1');
 prj_tbl1 = FOREACH src_tbl1 GENERATE
key as tbl1_key,
value as tbl1_value,
'333' as tbl1_v1;

 src_tbl2 = FILTER tbl2 BY (key == '1');
 prj_tbl2 = FOREACH src_tbl2 GENERATE
key as tbl2_key,
value as tbl2_value;

 dump prj_tbl1;
 dump prj_tbl2;
 result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
 prj_result = FOREACH result 
   GENERATE  prj_tbl1::tbl1_key AS key1,
 prj_tbl1::tbl1_value AS value1,
 prj_tbl1::tbl1_v1 AS v1,
 prj_tbl2::tbl2_key AS key2,
 prj_tbl2::tbl2_value AS value2;

 dump prj_result;
 {noformat}
 The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2).  We 
 need to clone the job instance in HCatLoader.
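A sketch of the cross-talk that cloning prevents, using {{java.util.Properties}} as a stand-in for Hadoop's {{Configuration}} (hypothetical names and structure; per the issue title, the actual fix is to clone the {{Job}} instance inside {{HCatLoader}}):

```java
import java.util.Properties;

public class SharedConfDemo {
    // Two loaders each write their table's schema into a config during setLocation().
    // If they share one config object, the second write clobbers the first, and
    // tbl1's read path later sees tbl2's columns -- the (1,2,...) vs (1,111,...) mix-up.
    static String tbl1ColsSeenAtRuntime(boolean cloneConf) {
        Properties shared = new Properties();

        Properties forTbl1 = cloneConf ? (Properties) shared.clone() : shared;
        forTbl1.setProperty("columns", "key,value,v1"); // tbl1 loader's setLocation

        Properties forTbl2 = cloneConf ? (Properties) shared.clone() : shared;
        forTbl2.setProperty("columns", "key,value");    // tbl2 loader's setLocation

        return forTbl1.getProperty("columns");          // what tbl1's reader sees
    }

    public static void main(String[] args) {
        System.out.println(tbl1ColsSeenAtRuntime(false)); // key,value      (cross-talk)
        System.out.println(tbl1ColsSeenAtRuntime(true));  // key,value,v1   (isolated)
    }
}
```

Defensively copying the shared mutable configuration per loader is the general pattern; whether the deprecated constructor versus {{Job.getInstance()}} achieves that is exactly what the comment above questions.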



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728

2015-06-04 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573255#comment-14573255
 ] 

Aihua Xu commented on HIVE-10932:
-

Those failures are unrelated to the patch.

 Unit test udf_nondeterministic failure due to HIVE-10728
 

 Key: HIVE-10932
 URL: https://issues.apache.org/jira/browse/HIVE-10932
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.3.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10932.patch


 The test udf_nondeterministic.q failed due to the change in HIVE-10728, in 
 which unix_timestamp() is now marked as deterministic.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10869) fold_case.q failing on trunk

2015-06-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573307#comment-14573307
 ] 

Ashutosh Chauhan commented on HIVE-10869:
-

Cherry-picked on 1.2 as well.

 fold_case.q failing on trunk
 

 Key: HIVE-10869
 URL: https://issues.apache.org/jira/browse/HIVE-10869
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 1.2.1
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 1.3.0, 1.2.1

 Attachments: HIVE-10869.patch


 Race condition of commits between HIVE-10716 & HIVE-10812



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-10914) LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build

2015-06-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reopened HIVE-10914:
-

I should clean my maven repo before testing this locally :(

 LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build
 -

 Key: HIVE-10914
 URL: https://issues.apache.org/jira/browse/HIVE-10914
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap


 LLAP won't ever work with hadoop 1, so no point in building it



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10933) Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()

2015-06-04 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-10933:
--

Assignee: Chaoyu Tang

 Hive 0.13 returns precision 0 for varchar(32) from 
 DatabaseMetadata.getColumns()
 

 Key: HIVE-10933
 URL: https://issues.apache.org/jira/browse/HIVE-10933
 Project: Hive
  Issue Type: Bug
  Components: API
Affects Versions: 0.13.0
Reporter: Son Nguyen
Assignee: Chaoyu Tang

 DatabaseMetadata.getColumns() returns COLUMN_SIZE as 0 for a column defined 
 as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns 
 the correct value, 32.
 Here is the segment program that reproduces the issue.
 {code}
 try {
   statement = connection.createStatement();
   statement.execute("drop table if exists son_table");
   statement.execute("create table son_table( col1 varchar(32) )");
   statement.close();
 } catch (Exception e) {
   return;
 }

 // get column info using metadata
 try {
   DatabaseMetaData dmd = null;
   ResultSet resultSet = null;

   dmd = connection.getMetaData();
   resultSet = dmd.getColumns(null, null, "son_table", "col1");

   if (resultSet.next()) {
     String tabName = resultSet.getString("TABLE_NAME");
     String colName = resultSet.getString("COLUMN_NAME");
     String dataType = resultSet.getString("DATA_TYPE");
     String typeName = resultSet.getString("TYPE_NAME");
     int precision = resultSet.getInt("COLUMN_SIZE");

     // output is: colName = col1, dataType = 12, typeName = VARCHAR, precision = 0.
     System.out.format("colName = %s, dataType = %s, typeName = %s, precision = %d.",
         colName, dataType, typeName, precision);
   }
 } catch (Exception e) {
   return;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

2015-06-04 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573342#comment-14573342
 ] 

Alexander Pivovarov commented on HIVE-10841:


Better to put code/sql/plan to \{code\}...\{code\} blocks. It will be easier to 
read

 [WHERE col is not null] does not work sometimes for queries with many JOIN 
 statements
 -

 Key: HIVE-10841
 URL: https://issues.apache.org/jira/browse/HIVE-10841
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor
Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
 Attachments: HIVE-10841.patch


 The result from the following SELECT query is 3 rows but it should be 1 row.
 I checked it in MySQL - it returned 1 row.
 To reproduce the issue in Hive
 1. prepare tables
 {code}
 drop table if exists L;
 drop table if exists LA;
 drop table if exists FR;
 drop table if exists A;
 drop table if exists PI;
 drop table if exists acct;
 create table L as select 4436 id;
 create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
 create table FR as select 4436 loan_id;
 create table A as select 4748 id;
 create table PI as select 4415 id;
 create table acct as select 4748 aid, 10 acc_n, 122 brn;
 insert into table acct values(4748, null, null);
 insert into table acct values(4748, null, null);
 {code}
 2. run SELECT query
 {code}
 select
   acct.ACC_N,
   acct.brn
 FROM L
 JOIN LA ON L.id = LA.loan_id
 JOIN FR ON L.id = FR.loan_id
 JOIN A ON LA.aid = A.id
 JOIN PI ON PI.id = LA.pi_id
 JOIN acct ON A.id = acct.aid
 WHERE
   L.id = 4436
   and acct.brn is not null;
 {code}
 the result is 3 rows
 {code}
 10    122
 NULL  NULL
 NULL  NULL
 {code}
 but it should be 1 row
 {code}
 10    122
 {code}
 2.1 explain select ... output for hive-1.3.0 MR
 {code}
 STAGE DEPENDENCIES:
   Stage-12 is a root stage
   Stage-9 depends on stages: Stage-12
   Stage-0 depends on stages: Stage-9
 STAGE PLANS:
   Stage: Stage-12
 Map Reduce Local Work
   Alias - Map Local Tables:
 a 
   Fetch Operator
 limit: -1
 acct 
   Fetch Operator
 limit: -1
 fr 
   Fetch Operator
 limit: -1
 l 
   Fetch Operator
 limit: -1
 pi 
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 a 
   TableScan
 alias: a
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: id is not null (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 _col5 (type: int)
   1 id (type: int)
   2 aid (type: int)
 acct 
   TableScan
 alias: acct
 Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE 
 Column stats: NONE
 Filter Operator
   predicate: aid is not null (type: boolean)
   Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 _col5 (type: int)
   1 id (type: int)
   2 aid (type: int)
 fr 
   TableScan
 alias: fr
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: (loan_id = 4436) (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 4436 (type: int)
   1 4436 (type: int)
   2 4436 (type: int)
 l 
   TableScan
 alias: l
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: (id = 4436) (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 4436 (type: int)
   1 4436 (type: int)
   2 4436 (type: int)
 pi 
   TableScan
 alias: pi
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: id is not null (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 

[jira] [Updated] (HIVE-10919) Windows: create table with JsonSerDe failed via beeline unless you add hcatalog core jar to classpath

2015-06-04 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10919:
-
Fix Version/s: 1.2.1

 Windows: create table with JsonSerDe failed via beeline unless you add 
 hcatalog core jar to classpath
 -

 Key: HIVE-10919
 URL: https://issues.apache.org/jira/browse/HIVE-10919
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 1.3.0, 1.2.1

 Attachments: HIVE-10919.1.patch


 NO PRECOMMIT TESTS
 Before we run HiveServer2 tests, we create table via beeline.
 And 'create table' with JsonSerDe failed on Windows, while it works on Linux:
 {noformat}
 0: jdbc:hive2://localhost:10001> create external table all100kjson(
 0: jdbc:hive2://localhost:10001> s string,
 0: jdbc:hive2://localhost:10001> i int,
 0: jdbc:hive2://localhost:10001> d double,
 0: jdbc:hive2://localhost:10001> m map<string, string>,
 0: jdbc:hive2://localhost:10001> bb array<struct<a: int, b: string>>,
 0: jdbc:hive2://localhost:10001> t timestamp)
 0: jdbc:hive2://localhost:10001> row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
 0: jdbc:hive2://localhost:10001> WITH SERDEPROPERTIES ('timestamp.formats'='yyyy-MM-dd\'T\'HH:mm:ss')
 0: jdbc:hive2://localhost:10001> STORED AS TEXTFILE
 0: jdbc:hive2://localhost:10001> location '/user/hcat/tests/data/all100kjson';
 Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe (state=08S01,code=1)
 {noformat}
 hive.log shows:
 {noformat}
 2015-05-21 21:59:17,004 ERROR operation.Operation 
 (SQLOperation.java:run(209)) - Error running hive query: 
 org.apache.hive.service.cli.HiveSQLException: Error while processing 
 statement: FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: 
 org.apache.hive.hcatalog.data.JsonSerDe
   at 
 org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:156)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
   at 
 org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at 
 org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot validate 
 serde: org.apache.hive.hcatalog.data.JsonSerDe
   at 
 org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3871)
   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4011)
   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
   ... 11 more
 Caused by: java.lang.ClassNotFoundException: Class 
 org.apache.hive.hcatalog.data.JsonSerDe not found
   at 
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
   at 
 org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3865)
   ... 21 more
 {noformat}
 If you do add the hcatalog jar to classpath, it works:
 {noformat}0: jdbc:hive2://localhost:10001 add jar 
 hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar;
 INFO  : converting to local 
 hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar
 INFO  : Added 
 

[jira] [Commented] (HIVE-10919) Windows: create table with JsonSerDe failed via beeline unless you add hcatalog core jar to classpath

2015-06-04 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573499#comment-14573499
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10919:
--

Committed to branch-1.2 as well.

 Windows: create table with JsonSerDe failed via beeline unless you add 
 hcatalog core jar to classpath
 -

 Key: HIVE-10919
 URL: https://issues.apache.org/jira/browse/HIVE-10919
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 1.3.0, 1.2.1

 Attachments: HIVE-10919.1.patch


 NO PRECOMMIT TESTS
 Before we run HiveServer2 tests, we create table via beeline.
 And 'create table' with JsonSerDe failed on Windows, while it works on Linux:
 {noformat}
 0: jdbc:hive2://localhost:10001> create external table all100kjson(
 0: jdbc:hive2://localhost:10001> s string,
 0: jdbc:hive2://localhost:10001> i int,
 0: jdbc:hive2://localhost:10001> d double,
 0: jdbc:hive2://localhost:10001> m map<string, string>,
 0: jdbc:hive2://localhost:10001> bb array<struct<a: int, b: string>>,
 0: jdbc:hive2://localhost:10001> t timestamp)
 0: jdbc:hive2://localhost:10001> row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
 0: jdbc:hive2://localhost:10001> WITH SERDEPROPERTIES ('timestamp.formats'='yyyy-MM-dd\'T\'HH:mm:ss')
 0: jdbc:hive2://localhost:10001> STORED AS TEXTFILE
 0: jdbc:hive2://localhost:10001> location '/user/hcat/tests/data/all100kjson';
 Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe (state=08S01,code=1)
 {noformat}
 hive.log shows:
 {noformat}
 2015-05-21 21:59:17,004 ERROR operation.Operation 
 (SQLOperation.java:run(209)) - Error running hive query: 
 org.apache.hive.service.cli.HiveSQLException: Error while processing 
 statement: FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: 
 org.apache.hive.hcatalog.data.JsonSerDe
   at 
 org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:156)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
   at 
 org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at 
 org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot validate 
 serde: org.apache.hive.hcatalog.data.JsonSerDe
   at 
 org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3871)
   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4011)
   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
   ... 11 more
 Caused by: java.lang.ClassNotFoundException: Class 
 org.apache.hive.hcatalog.data.JsonSerDe not found
   at 
 org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
   at 
 org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3865)
   ... 21 more
 {noformat}
If you do add the HCatalog jar to the classpath, it works:
{noformat}
0: jdbc:hive2://localhost:10001 add jar 
hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar;
 INFO  : converting to local 
 hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar
 INFO  : 

[jira] [Assigned] (HIVE-10935) LLAP: merge master to branch

2015-06-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-10935:
---

Assignee: Sergey Shelukhin

 LLAP: merge master to branch
 

 Key: HIVE-10935
 URL: https://issues.apache.org/jira/browse/HIVE-10935
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10929) In Tez mode, dynamic partitioning query with union all fails at moveTask, Invalid partition key values

2015-06-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573072#comment-14573072
 ] 

Hive QA commented on HIVE-10929:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737504/HIVE-10929.1.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_leftsemi_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_multi_insert
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4174/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4174/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4174/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737504 - PreCommit-HIVE-TRUNK-Build

 In Tez mode, dynamic partitioning query with union all fails at 
 moveTask, Invalid partition key values
 --

 Key: HIVE-10929
 URL: https://issues.apache.org/jira/browse/HIVE-10929
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10929.1.patch


 {code}
 create table dummy(i int);
 insert into table dummy values (1);
 select * from dummy;
 create table partunion1(id1 int) partitioned by (part1 string);
 set hive.exec.dynamic.partition.mode=nonstrict;
 set hive.execution.engine=tez;
 explain insert into table partunion1 partition(part1)
 select temps.* from (
 select 1 as id1, '2014' as part1 from dummy 
 union all 
 select 2 as id1, '2014' as part1 from dummy ) temps;
 insert into table partunion1 partition(part1)
 select temps.* from (
 select 1 as id1, '2014' as part1 from dummy 
 union all 
 select 2 as id1, '2014' as part1 from dummy ) temps;
 select * from partunion1;
 {code}
 fails.





[jira] [Commented] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures

2015-06-04 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572981#comment-14572981
 ] 

Chaoyu Tang commented on HIVE-10410:


I think not only the Hive object, but also the SessionState and HiveConf shared 
across child threads, may cause the race issue.
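To make the failure mode concrete, here is a JDK-only sketch (all names invented for illustration; this is not Hive or Thrift code): a client that writes each request as a header frame followed by a body frame corrupts its stream when two threads use it at once, which is the same shape as the framing errors quoted in this issue. Serializing calls, or giving each pooled thread its own client as the patch effectively does, keeps the frames paired:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SharedClientDemo {
    // Fake wire: a correct exchange is "H<id>" immediately followed by "B<id>".
    static final List<String> wire = Collections.synchronizedList(new ArrayList<>());

    // Safe variant: one lock spans both frames of a call. Without the lock,
    // two threads could interleave their header/body frames on the wire.
    static synchronized void call(int id) {
        wire.add("H" + id);
        wire.add("B" + id);
    }

    static boolean framesIntact() {
        for (int i = 0; i + 1 < wire.size(); i += 2) {
            String h = wire.get(i), b = wire.get(i + 1);
            if (!h.startsWith("H") || !b.equals("B" + h.substring(1))) return false;
        }
        return true;
    }

    static void run(int threads, int callsPerThread) {
        wire.clear();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int base = t * callsPerThread;
            Thread w = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) call(base + i);
            });
            workers.add(w);
            w.start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        run(8, 100);
        System.out.println(wire.size() + " frames written, intact: " + framesIntact());
    }
}
```

With per-thread clients the same pairing holds trivially, since no two threads ever touch the same stream.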

 Apparent race condition in HiveServer2 causing intermittent query failures
 --

 Key: HIVE-10410
 URL: https://issues.apache.org/jira/browse/HIVE-10410
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.1
 Environment: CDH 5.3.3
 CentOS 6.4
Reporter: Richard Williams

 On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC 
 occasionally trigger odd Thrift exceptions with messages such as "Read a 
 negative frame size (-2147418110)!" or "out of sequence response" in 
 HiveServer2's connections to the metastore. For certain metastore calls (for 
 example, showDatabases), these Thrift exceptions are converted to 
 MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient 
 from retrying these calls and thus causes the failure to bubble out to the 
 JDBC client.
 Note that as far as we can tell, this issue appears to only affect queries 
 that are submitted with the runAsync flag on TExecuteStatementReq set to true 
 (which, in practice, seems to mean all JDBC queries), and it appears to only 
 manifest when HiveServer2 is using the new HTTP transport mechanism. When 
 both these conditions hold, we are able to fairly reliably reproduce the 
 issue by spawning about 100 simple, concurrent Hive queries (we have been 
 using "show databases"), two or three of which typically fail. However, when 
 either of these conditions do not hold, we are no longer able to reproduce 
 the issue.
 Some example stack traces from the HiveServer2 logs:
 {noformat}
 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: 
 org.apache.thrift.transport.TTransportException Read a negative frame size 
 (-2147418110)!
 org.apache.thrift.transport.TTransportException: Read a negative frame size 
 (-2147418110)!
 at 
 org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435)
 at 
 org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414)
 at 
 org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
 at 
 org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
 at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
 at 
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837)
 at 
 org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
 at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
 at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139)
 at 
 org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445)
 at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:364)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
 at 
 org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
 at 

[jira] [Updated] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader

2015-06-04 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10754:

Description: 
{noformat}
Create table tbl1 (key string, value string) stored as rcfile;
Create table tbl2 (key string, value string);
insert into tbl1 values( '1', '111');
insert into tbl2 values('1', '2');
{noformat}

Pig script:
{noformat}
src_tbl1 = FILTER tbl1 BY (key == '1');
prj_tbl1 = FOREACH src_tbl1 GENERATE
   key as tbl1_key,
   value as tbl1_value,
   '333' as tbl1_v1;
   
src_tbl2 = FILTER tbl2 BY (key == '1');
prj_tbl2 = FOREACH src_tbl2 GENERATE
   key as tbl2_key,
   value as tbl2_value;
   
dump prj_tbl1;
dump prj_tbl2;
result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
prj_result = FOREACH result 
  GENERATE  prj_tbl1::tbl1_key AS key1,
prj_tbl1::tbl1_value AS value1,
prj_tbl1::tbl1_v1 AS v1,
prj_tbl2::tbl2_key AS key2,
prj_tbl2::tbl2_value AS value2;
   
dump prj_result;
{noformat}

The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2).  We 
need to clone the job instance in HCatLoader.
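As a toy illustration of why the clone matters (a JDK-only analogue with invented names, not the actual HCatLoader code): when two loaders stash their output schemas in one shared, mutable job configuration, the second overwrites the first, which is the same shape as getting (1,2,...) instead of (1,111,...) above. Copying the configuration per loader keeps them independent:

```java
import java.util.HashMap;
import java.util.Map;

public class JobCloneDemo {
    // Each "loader" stashes its table schema in the job configuration it is given.
    static Map<String, String> setupLoader(Map<String, String> jobConf, String schema) {
        jobConf.put("hcat.output.schema", schema);
        return jobConf;
    }

    // Both loaders share one mutable conf: the second schema clobbers the first.
    static String sharedConfSchema() {
        Map<String, String> shared = new HashMap<>();
        Map<String, String> tbl1Conf = setupLoader(shared, "key,value,v1");
        setupLoader(shared, "key,value");
        return tbl1Conf.get("hcat.output.schema");
    }

    // Each loader gets its own copy of the conf: schemas stay independent.
    static String clonedConfSchema() {
        Map<String, String> base = new HashMap<>();
        Map<String, String> tbl1Conf = setupLoader(new HashMap<>(base), "key,value,v1");
        setupLoader(new HashMap<>(base), "key,value");
        return tbl1Conf.get("hcat.output.schema");
    }

    public static void main(String[] args) {
        System.out.println("shared: " + sharedConfSchema()); // tbl1's schema was lost
        System.out.println("cloned: " + clonedConfSchema()); // tbl1's schema survives
    }
}
```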


 Pig+Hcatalog doesn't work properly since we need to clone the Job instance in 
 HCatLoader
 

 Key: HIVE-10754
 URL: https://issues.apache.org/jira/browse/HIVE-10754
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10754.patch


 {noformat}
 Create table tbl1 (key string, value string) stored as rcfile;
 Create table tbl2 (key string, value string);
 insert into tbl1 values( '1', '111');
 insert into tbl2 values('1', '2');
 {noformat}
 Pig script:
 {noformat}
 src_tbl1 = FILTER tbl1 BY (key == '1');
 prj_tbl1 = FOREACH src_tbl1 GENERATE
key as tbl1_key,
value as tbl1_value,
'333' as tbl1_v1;

 src_tbl2 = FILTER tbl2 BY (key == '1');
 prj_tbl2 = FOREACH src_tbl2 GENERATE
key as tbl2_key,
value as tbl2_value;

 dump prj_tbl1;
 dump prj_tbl2;
 result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
 prj_result = FOREACH result 
   GENERATE  prj_tbl1::tbl1_key AS key1,
 prj_tbl1::tbl1_value AS value1,
 prj_tbl1::tbl1_v1 AS v1,
 prj_tbl2::tbl2_key AS key2,
 prj_tbl2::tbl2_value AS value2;

 dump prj_result;
 {noformat}
 The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2).  We 
 need to clone the job instance in HCatLoader.





[jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader

2015-06-04 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573060#comment-14573060
 ] 

Aihua Xu commented on HIVE-10754:
-

[~mithun] Can you help review the change? Thanks.

 Pig+Hcatalog doesn't work properly since we need to clone the Job instance in 
 HCatLoader
 

 Key: HIVE-10754
 URL: https://issues.apache.org/jira/browse/HIVE-10754
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10754.patch


 {noformat}
 Create table tbl1 (key string, value string) stored as rcfile;
 Create table tbl2 (key string, value string);
 insert into tbl1 values( '1', '111');
 insert into tbl2 values('1', '2');
 {noformat}
 Pig script:
 {noformat}
 src_tbl1 = FILTER tbl1 BY (key == '1');
 prj_tbl1 = FOREACH src_tbl1 GENERATE
key as tbl1_key,
value as tbl1_value,
'333' as tbl1_v1;

 src_tbl2 = FILTER tbl2 BY (key == '1');
 prj_tbl2 = FOREACH src_tbl2 GENERATE
key as tbl2_key,
value as tbl2_value;

 dump prj_tbl1;
 dump prj_tbl2;
 result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
 prj_result = FOREACH result 
   GENERATE  prj_tbl1::tbl1_key AS key1,
 prj_tbl1::tbl1_value AS value1,
 prj_tbl1::tbl1_v1 AS v1,
 prj_tbl2::tbl2_key AS key2,
 prj_tbl2::tbl2_value AS value2;

 dump prj_result;
 {noformat}
 The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2).  We 
 need to clone the job instance in HCatLoader.





[jira] [Commented] (HIVE-10921) Change trunk pom version to reflect the branch-1 split

2015-06-04 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572961#comment-14572961
 ] 

Alan Gates commented on HIVE-10921:
---

+1

 Change trunk pom version to reflect the branch-1 split
 --

 Key: HIVE-10921
 URL: https://issues.apache.org/jira/browse/HIVE-10921
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 2.0.0

 Attachments: HIVE-10921.patch








[jira] [Commented] (HIVE-10880) The bucket number is not respected in insert overwrite.

2015-06-04 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572804#comment-14572804
 ] 

Yongzhi Chen commented on HIVE-10880:
-

The implementation of the private static method replaceTaskId(String taskId, int 
bucketNum) does not look right. Since that code has been in the source for a 
while, I am not very confident about that. 
The attached patch 3 fixes that issue too. If the tests pass, we should use 
patch 3; otherwise keep patch 2. 
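For context, here is a hedged sketch of what a replaceTaskId(taskId, bucketNum)-style helper is generally expected to do (illustrative only; NOT Hive's actual implementation): substitute the bucket number for the numeric task-id portion of a bucket file name, preserving the zero padding and any suffix such as "_0":

```java
public class TaskIdDemo {
    static String replaceTaskId(String taskId, int bucketNum) {
        int end = 0;
        while (end < taskId.length() && Character.isDigit(taskId.charAt(end))) {
            end++;                               // span of the leading digits
        }
        String digits = taskId.substring(0, end);
        String bucket = String.valueOf(bucketNum);
        StringBuilder out = new StringBuilder();
        for (int i = bucket.length(); i < digits.length(); i++) {
            out.append('0');                     // keep the original zero padding
        }
        out.append(bucket);
        return out + taskId.substring(end);      // re-attach suffix, e.g. "_0"
    }

    public static void main(String[] args) {
        System.out.println(replaceTaskId("000001_0", 3));  // 000003_0
        System.out.println(replaceTaskId("000001_0", 12)); // 000012_0
    }
}
```

A bug in padding or suffix handling here would make bucket file names stop matching the table's declared bucket count, which is consistent with the "number of buckets ... whereas the number of files" error below.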

 The bucket number is not respected in insert overwrite.
 ---

 Key: HIVE-10880
 URL: https://issues.apache.org/jira/browse/HIVE-10880
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
Priority: Blocker
 Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, 
 HIVE-10880.3.patch


 When hive.enforce.bucketing is true, the bucket number defined in the table 
 is no longer respected in current master and 1.2. This is a regression.
 Reproduce:
 {noformat}
 CREATE TABLE IF NOT EXISTS buckettestinput( 
 data string 
 ) 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 CREATE TABLE IF NOT EXISTS buckettestoutput1( 
 data string 
 )CLUSTERED BY(data) 
 INTO 2 BUCKETS 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 CREATE TABLE IF NOT EXISTS buckettestoutput2( 
 data string 
 )CLUSTERED BY(data) 
 INTO 2 BUCKETS 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 Then I inserted the following data into the buckettestinput table
 firstinsert1 
 firstinsert2 
 firstinsert3 
 firstinsert4 
 firstinsert5 
 firstinsert6 
 firstinsert7 
 firstinsert8 
 secondinsert1 
 secondinsert2 
 secondinsert3 
 secondinsert4 
 secondinsert5 
 secondinsert6 
 secondinsert7 
 secondinsert8
 set hive.enforce.bucketing = true; 
 set hive.enforce.sorting=true;
 insert overwrite table buckettestoutput1 
 select * from buckettestinput where data like 'first%';
 set hive.auto.convert.sortmerge.join=true; 
 set hive.optimize.bucketmapjoin = true; 
 set hive.optimize.bucketmapjoin.sortedmerge = true; 
 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data);
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use 
 bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number 
 of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 
 (state=42000,code=10141)
 {noformat}
 The debug information related to the insert overwrite:
 {noformat}
 0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 
 select * from buckettestinput where data like 'first%';
 INFO  : Number of reduce tasks determined at compile time: 2
 INFO  : In order to change the average load for a reducer (in bytes):
 INFO  :   set hive.exec.reducers.bytes.per.reducer=number
 INFO  : In order to limit the maximum number of reducers:
 INFO  :   set hive.exec.reducers.max=number
 INFO  : In order to set a constant number of reducers:
 INFO  :   set mapred.reduce.tasks=number
 INFO  : Job running in-process (local Hadoop)
 INFO  : 2015-06-01 11:09:29,650 Stage-1 map = 86%,  reduce = 100%
 INFO  : Ended Job = job_local107155352_0001
 INFO  : Loading data to table default.buckettestoutput1 from 
 file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1
 INFO  : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, 
 totalSize=52, rawDataSize=48]
 No rows affected (1.692 seconds)
 {noformat}





[jira] [Updated] (HIVE-10880) The bucket number is not respected in insert overwrite.

2015-06-04 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-10880:

Attachment: HIVE-10880.3.patch

 The bucket number is not respected in insert overwrite.
 ---

 Key: HIVE-10880
 URL: https://issues.apache.org/jira/browse/HIVE-10880
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
Priority: Blocker
 Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, 
 HIVE-10880.3.patch


 When hive.enforce.bucketing is true, the bucket number defined in the table 
 is no longer respected in current master and 1.2. This is a regression.
 Reproduce:
 {noformat}
 CREATE TABLE IF NOT EXISTS buckettestinput( 
 data string 
 ) 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 CREATE TABLE IF NOT EXISTS buckettestoutput1( 
 data string 
 )CLUSTERED BY(data) 
 INTO 2 BUCKETS 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 CREATE TABLE IF NOT EXISTS buckettestoutput2( 
 data string 
 )CLUSTERED BY(data) 
 INTO 2 BUCKETS 
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
 Then I inserted the following data into the buckettestinput table
 firstinsert1 
 firstinsert2 
 firstinsert3 
 firstinsert4 
 firstinsert5 
 firstinsert6 
 firstinsert7 
 firstinsert8 
 secondinsert1 
 secondinsert2 
 secondinsert3 
 secondinsert4 
 secondinsert5 
 secondinsert6 
 secondinsert7 
 secondinsert8
 set hive.enforce.bucketing = true; 
 set hive.enforce.sorting=true;
 insert overwrite table buckettestoutput1 
 select * from buckettestinput where data like 'first%';
 set hive.auto.convert.sortmerge.join=true; 
 set hive.optimize.bucketmapjoin = true; 
 set hive.optimize.bucketmapjoin.sortedmerge = true; 
 select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data);
 Error: Error while compiling statement: FAILED: SemanticException [Error 
 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use 
 bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number 
 of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 
 (state=42000,code=10141)
 {noformat}
 The debug information related to the insert overwrite:
 {noformat}
 0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 
 select * from buckettestinput where data like 'first%';
 INFO  : Number of reduce tasks determined at compile time: 2
 INFO  : In order to change the average load for a reducer (in bytes):
 INFO  :   set hive.exec.reducers.bytes.per.reducer=number
 INFO  : In order to limit the maximum number of reducers:
 INFO  :   set hive.exec.reducers.max=number
 INFO  : In order to set a constant number of reducers:
 INFO  :   set mapred.reduce.tasks=number
 INFO  : Job running in-process (local Hadoop)
 INFO  : 2015-06-01 11:09:29,650 Stage-1 map = 86%,  reduce = 100%
 INFO  : Ended Job = job_local107155352_0001
 INFO  : Loading data to table default.buckettestoutput1 from 
 file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1
 INFO  : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, 
 totalSize=52, rawDataSize=48]
 No rows affected (1.692 seconds)
 {noformat}





[jira] [Commented] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak

2015-06-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572905#comment-14572905
 ] 

Hive QA commented on HIVE-10925:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737489/HIVE-10925.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8998 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4173/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4173/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4173/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737489 - PreCommit-HIVE-TRUNK-Build

 Non-static threadlocals in metastore code can potentially cause memory leak
 ---

 Key: HIVE-10925
 URL: https://issues.apache.org/jira/browse/HIVE-10925
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Attachments: HIVE-10925.1.patch


 There are many places where non-static ThreadLocals are used. I cannot find a 
 good reason for using them. Moreover, they can potentially leak objects if, 
 for example, they are created in a long-running thread every time the thread 
 handles a new session.
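A JDK-only sketch of the difference (illustrative; names invented): with a ThreadLocal field per handler instance, a long-lived worker thread accumulates one value per handler it has ever served, and those values linger in the thread's ThreadLocalMap. A single static ThreadLocal caps it at one value per thread, however many handler instances come and go:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalDemo {
    static final AtomicInteger allocations = new AtomicInteger();

    static class Buffer {
        final byte[] data = new byte[1024];
        Buffer() { allocations.incrementAndGet(); }
    }

    // Anti-pattern: non-static ThreadLocal -- one per handler instance.
    static class PerInstanceHandler {
        private final ThreadLocal<Buffer> buf = ThreadLocal.withInitial(Buffer::new);
        void handle() { buf.get(); }
    }

    // Fix: static ThreadLocal shared by all handler instances.
    static class StaticHandler {
        private static final ThreadLocal<Buffer> BUF = ThreadLocal.withInitial(Buffer::new);
        void handle() { BUF.get(); }
    }

    // Runs `sessions` handler instances on ONE long-lived worker thread and
    // reports how many Buffers were allocated on it.
    static int allocationsFor(boolean perInstance, int sessions) {
        allocations.set(0);
        Thread worker = new Thread(() -> {
            for (int i = 0; i < sessions; i++) {
                if (perInstance) new PerInstanceHandler().handle();
                else new StaticHandler().handle();
            }
        });
        worker.start();
        try { worker.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return allocations.get();
    }

    public static void main(String[] args) {
        System.out.println(allocationsFor(true, 100));  // one value per session
        System.out.println(allocationsFor(false, 100)); // one value per thread
    }
}
```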





[jira] [Updated] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728

2015-06-04 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10932:

Attachment: HIVE-10932.patch

 Unit test udf_nondeterministic failure due to HIVE-10728
 

 Key: HIVE-10932
 URL: https://issues.apache.org/jira/browse/HIVE-10932
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10932.patch


 The test udf_nondeterministic.q failed due to the change in HIVE-10728, in 
 which unix_timestamp() is now marked as deterministic.





[jira] [Commented] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728

2015-06-04 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572882#comment-14572882
 ] 

Aihua Xu commented on HIVE-10932:
-

[~ashutoshc] Can you help review the test code?

 Unit test udf_nondeterministic failure due to HIVE-10728
 

 Key: HIVE-10932
 URL: https://issues.apache.org/jira/browse/HIVE-10932
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.3.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10932.patch


 The test udf_nondeterministic.q failed due to the change in HIVE-10728, in 
 which unix_timestamp() is now marked as deterministic.





[jira] [Commented] (HIVE-10922) In HS2 doAs=false mode, file system related errors in one query causes other failures

2015-06-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572287#comment-14572287
 ] 

Hive QA commented on HIVE-10922:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737416/HIVE-10922.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8992 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4167/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4167/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4167/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737416 - PreCommit-HIVE-TRUNK-Build

 In HS2 doAs=false mode, file system related errors in one query causes other 
 failures
 -

 Key: HIVE-10922
 URL: https://issues.apache.org/jira/browse/HIVE-10922
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0, 1.2.0, 1.1.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-10922.1.patch


 The Warehouse class has a few methods that close the filesystem object on 
 errors. With doAs=false, since all queries use the same HS2 UGI, the 
 filesystem object is shared across queries/threads. When close() is called on 
 the filesystem object from one query, the filesystem objects used in other 
 threads also get closed, and any files registered for deletion on exit get 
 deleted. There is also no close being done on the happy code path.
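A toy analogue of the cache behavior described above (illustrative only; the real cache lives in org.apache.hadoop.fs.FileSystem): a get()-style lookup hands every caller with the same key the same cached handle, so close() in one query's error path invalidates the handle all other queries are using, whereas a newInstance()-style private copy is isolated:

```java
import java.util.HashMap;
import java.util.Map;

public class FsCacheDemo {
    static class Handle {
        private boolean closed;
        void close() { closed = true; }
        boolean isOpen() { return !closed; }
    }

    static final Map<String, Handle> cache = new HashMap<>();

    // Like FileSystem.get(): cached per key, shared by all callers.
    static Handle get(String key) {
        return cache.computeIfAbsent(key, k -> new Handle());
    }

    // Like FileSystem.newInstance(): bypasses the cache entirely.
    static Handle newInstance(String key) {
        return new Handle();
    }

    // Query 2's cached handle dies when query 1 closes "its" handle.
    static boolean survivesCachedClose() {
        cache.clear();
        Handle q1 = get("hdfs://nn:8020@hive");
        Handle q2 = get("hdfs://nn:8020@hive"); // same object as q1
        q1.close();
        return q2.isOpen();
    }

    static boolean survivesPrivateClose() {
        Handle q1 = newInstance("hdfs://nn:8020@hive");
        Handle q2 = newInstance("hdfs://nn:8020@hive");
        q1.close();
        return q2.isOpen();
    }

    public static void main(String[] args) {
        System.out.println("cached handle survives close:  " + survivesCachedClose());
        System.out.println("private handle survives close: " + survivesPrivateClose());
    }
}
```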





[jira] [Commented] (HIVE-10922) In HS2 doAs=false mode, file system related errors in one query causes other failures

2015-06-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572327#comment-14572327
 ] 

Gunther Hagleitner commented on HIVE-10922:
---

Test failures are unrelated.

 In HS2 doAs=false mode, file system related errors in one query causes other 
 failures
 -

 Key: HIVE-10922
 URL: https://issues.apache.org/jira/browse/HIVE-10922
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0, 1.2.0, 1.1.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-10922.1.patch


 The Warehouse class has a few methods that close the filesystem object on 
 errors. With doAs=false, since all queries use the same HS2 UGI, the 
 filesystem object is shared across queries/threads. When close() is called on 
 the filesystem object from one query, the filesystem objects used in other 
 threads also get closed, and any files registered for deletion on exit get 
 deleted. There is also no close being done on the happy code path.





[jira] [Commented] (HIVE-10904) Use beeline-log4j.properties for migrated CLI [beeline-cli Branch]

2015-06-04 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572241#comment-14572241
 ] 

Chinna Rao Lalam commented on HIVE-10904:
-

Thanks [~leftylev], Linked this issue to HIVE-10810

 Use beeline-log4j.properties for migrated CLI [beeline-cli Branch]
 --

 Key: HIVE-10904
 URL: https://issues.apache.org/jira/browse/HIVE-10904
 Project: Hive
  Issue Type: Sub-task
Affects Versions: beeline-cli-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10904.patch


 Updated CLI printing logs on the console. Use beeline-log4j.properties for 
 redirecting to file.





[jira] [Commented] (HIVE-10555) Improve windowing spec of range based windowing to support additional range formats

2015-06-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572250#comment-14572250
 ] 

Lefty Leverenz commented on HIVE-10555:
---

Doc note:  Subtasks that need documentation have been marked with TODOC1.3 
labels.

 Improve windowing spec of range based windowing to support additional range 
 formats
 ---

 Key: HIVE-10555
 URL: https://issues.apache.org/jira/browse/HIVE-10555
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Affects Versions: 1.3.0
Reporter: Aihua Xu
Assignee: Aihua Xu
 Fix For: 1.3.0


 Currently windowing function only supports the formats of {{x preceding and 
 current}}, {{x preceding and y following}}, {{current and y following}}. 
 Windowing of {{x preceding and y preceding}} and {{x following and y 
 following}} doesn't work properly.
 The following functions should be supported.
 First_value(), last_value(), sum(), avg() , count(), min(), max() 





[jira] [Updated] (HIVE-10928) Concurrent Beeline Connections can not work on different databases

2015-06-04 Thread chirag aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chirag aggarwal updated HIVE-10928:
---
Summary: Concurrent Beeline Connections can not work on different databases 
 (was: Concurrent Beeline Connections can not work different databases)

 Concurrent Beeline Connections can not work on different databases
 --

 Key: HIVE-10928
 URL: https://issues.apache.org/jira/browse/HIVE-10928
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 0.14.0
Reporter: chirag aggarwal

 Concurrent Beeline connections cannot work on different databases. If one 
 connection runs 'use abc', then all connections start working on database 
 'abc'.
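
The failure mode described above — one connection's 'use' leaking into every other connection — comes down to shared mutable session state. A minimal, hypothetical sketch (the class and field names below are illustrative, not Hive's actual code):

```java
// Hypothetical sketch: a shared session object vs. per-connection sessions.
class SessionState {
    private String currentDatabase = "default";
    void useDatabase(String db) { currentDatabase = db; }
    String currentDatabase() { return currentDatabase; }
}

public class SharedSessionDemo {
    // Buggy shape: every connection sees the same SessionState instance.
    static final SessionState shared = new SessionState();

    static String sharedAfterUse(String db) {
        shared.useDatabase(db);           // connection 1 runs "use abc"
        return shared.currentDatabase();  // connection 2 now also sees "abc"
    }

    // Fixed shape: each connection owns its own SessionState, so one
    // connection's "use" cannot affect another.
    static String isolatedAfterOtherUse(String db) {
        SessionState conn1 = new SessionState();
        SessionState conn2 = new SessionState();
        conn1.useDatabase(db);
        return conn2.currentDatabase();   // still "default"
    }

    public static void main(String[] args) {
        System.out.println(sharedAfterUse("abc"));        // abc
        System.out.println(isolatedAfterOtherUse("abc")); // default
    }
}
```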





[jira] [Updated] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak

2015-06-04 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10925:

Affects Version/s: (was: 0.12.0)
   (was: 0.11.0)

 Non-static threadlocals in metastore code can potentially cause memory leak
 ---

 Key: HIVE-10925
 URL: https://issues.apache.org/jira/browse/HIVE-10925
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Attachments: HIVE-10925.1.patch


 There are many places where non-static threadlocals are used. I can't find a 
 good reason for using them, and they can leak objects if, for example, they 
 are created each time a long-running thread handles a new session.
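
The leak pattern can be sketched in isolation. With a non-static ThreadLocal, every new owning instance allocates a fresh ThreadLocal, and a long-lived worker thread accumulates one stale map entry per instance it has touched; a static ThreadLocal bounds that at one entry per thread. The class names here are hypothetical, not the actual metastore code:

```java
public class ThreadLocalDemo {
    // Non-static: each instance carries its own ThreadLocal, so a thread that
    // serves many instances holds many distinct per-thread values.
    static class PerInstance {
        final ThreadLocal<int[]> buf = ThreadLocal.withInitial(() -> new int[1024]);
    }

    // Static: one ThreadLocal for the whole class, so a thread holds at most
    // one value no matter how many handler instances it runs.
    static class PerClass {
        static final ThreadLocal<int[]> buf = ThreadLocal.withInitial(() -> new int[1024]);
    }

    // Two PerInstance objects hold two distinct values on the same thread.
    static boolean perInstanceValuesDistinct() {
        PerInstance a = new PerInstance();
        PerInstance b = new PerInstance();
        return a.buf.get() != b.buf.get();
    }

    // The static form hands back the same value on repeated access.
    static boolean perClassValueShared() {
        return PerClass.buf.get() == PerClass.buf.get();
    }

    public static void main(String[] args) {
        System.out.println(perInstanceValuesDistinct()); // true
        System.out.println(perClassValueShared());       // true
    }
}
```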





[jira] [Commented] (HIVE-10904) Use beeline-log4j.properties for migrated CLI [beeline-cli Branch]

2015-06-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572231#comment-14572231
 ] 

Lefty Leverenz commented on HIVE-10904:
---

Should this be documented?  If so, please link it to HIVE-10810 (Document 
Beeline/CLI changes).

 Use beeline-log4j.properties for migrated CLI [beeline-cli Branch]
 --

 Key: HIVE-10904
 URL: https://issues.apache.org/jira/browse/HIVE-10904
 Project: Hive
  Issue Type: Sub-task
Affects Versions: beeline-cli-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10904.patch


 The updated CLI prints its logs on the console. Use beeline-log4j.properties 
 to redirect them to a file.





[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

2015-06-04 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572263#comment-14572263
 ] 

Laljo John Pullokkaran commented on HIVE-10841:
---

[~apivovarov] I see that the predicate is being pushed down with the patch. See 
the attached explain below:
hive> explain select acct.ACC_N, acct.brn FROM L JOIN LA ON L.id = LA.loan_id 
JOIN FR ON L.id = FR.loan_id JOIN A ON LA.aid = A.id JOIN PI ON PI.id = 
LA.pi_id JOIN acct ON A.id = acct.aid and acct.brn is not null WHERE L.id = 
4436; 
OK
STAGE DEPENDENCIES:
  Stage-12 is a root stage
  Stage-9 depends on stages: Stage-12
  Stage-0 depends on stages: Stage-9

STAGE PLANS:
  Stage: Stage-12
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
acct 
  Fetch Operator
limit: -1
fr 
  Fetch Operator
limit: -1
l 
  Fetch Operator
limit: -1
pi 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
filterExpr: id is not null (type: boolean)
Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
stats: NONE
Filter Operator
  predicate: id is not null (type: boolean)
  Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
stats: NONE
  HashTable Sink Operator
keys:
  0 _col5 (type: int)
  1 id (type: int)
  2 aid (type: int)
acct 
  TableScan
alias: acct
filterExpr: (brn is not null and aid is not null) (type: boolean)
Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE Column 
stats: NONE
Filter Operator
  predicate: (brn is not null and aid is not null) (type: boolean)
  Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE 
Column stats: NONE
  HashTable Sink Operator
keys:
  0 _col5 (type: int)
  1 id (type: int)
  2 aid (type: int)
fr 
  TableScan
alias: fr
filterExpr: (loan_id = 4436) (type: boolean)
Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
stats: NONE
Filter Operator
  predicate: (loan_id = 4436) (type: boolean)
  Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
stats: NONE
  HashTable Sink Operator
keys:
  0 4436 (type: int)
  1 4436 (type: int)
  2 4436 (type: int)
l 
  TableScan
alias: l
filterExpr: (id = 4436) (type: boolean)
Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
stats: NONE
Filter Operator
  predicate: (id = 4436) (type: boolean)
  Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
stats: NONE
  HashTable Sink Operator
keys:
  0 4436 (type: int)
  1 4436 (type: int)
  2 4436 (type: int)
pi 
  TableScan
alias: pi
filterExpr: id is not null (type: boolean)
Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
stats: NONE
Filter Operator
  predicate: id is not null (type: boolean)
  Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
stats: NONE
  HashTable Sink Operator
keys:
  0 _col6 (type: int)
  1 id (type: int)

  Stage: Stage-9
Map Reduce
  Map Operator Tree:
  TableScan
alias: la
filterExpr: (((loan_id is not null and aid is not null) and pi_id 
is not null) and (loan_id = 4436)) (type: boolean)
Statistics: Num rows: 1 Data size: 14 Basic stats: COMPLETE Column 
stats: NONE
Filter Operator
  predicate: (((loan_id is not null and aid is not null) and pi_id 
is not null) and (loan_id = 4436)) (type: boolean)
  Statistics: Num rows: 1 Data size: 14 Basic stats: COMPLETE 
Column stats: NONE
  Map Join Operator
condition map:
 Inner Join 0 to 1
 Inner Join 0 to 2
keys:
  0 4436 (type: int)
  1 4436 (type: int)
  2 4436 (type: int)
outputColumnNames: _col5, _col6
Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE 
Column stats: NONE
Filter Operator
  predicate: _col5 is not null (type: boolean)
  

[jira] [Updated] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak

2015-06-04 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10925:

Attachment: HIVE-10925.1.patch

cc [~thejas] [~sushanth]

 Non-static threadlocals in metastore code can potentially cause memory leak
 ---

 Key: HIVE-10925
 URL: https://issues.apache.org/jira/browse/HIVE-10925
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.11.0, 0.12.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Attachments: HIVE-10925.1.patch


 There are many places where non-static threadlocals are used. I can't find a 
 good reason for using them, and they can leak objects if, for example, they 
 are created each time a long-running thread handles a new session.





[jira] [Updated] (HIVE-10826) Support min()/max() functions over x preceding and y preceding windowing

2015-06-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10826:
--
Labels: TODOC1.3  (was: )

 Support min()/max() functions over x preceding and y preceding windowing 
 -

 Key: HIVE-10826
 URL: https://issues.apache.org/jira/browse/HIVE-10826
 Project: Hive
  Issue Type: Sub-task
  Components: PTF-Windowing
Reporter: Aihua Xu
Assignee: Aihua Xu
  Labels: TODOC1.3
 Fix For: 1.3.0

 Attachments: HIVE-10826.patch


 Currently the query 
 {noformat}
 select key, value, min(value) over (partition by key order by value rows 
 between 1 preceding and 1 preceding) from small;
 {noformat}
 doesn't work. It failed with 
 {noformat}
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing row (tag=0) 
 {key:{reducesinkkey0:2},value:{_col0:500}}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
 at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row (tag=0) 
 {key:{reducesinkkey0:2},value:{_col0:500}}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
 ... 3 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
 cannot generate all output rows for a Partition
 at 
 org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:520)
 at 
 org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
 at 
 org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
 {noformat}
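
Independent of the failing execution path in the stack trace above, the intended semantics of {{rows between x preceding and y preceding}} are easy to state: for the row at position i within a partition, aggregate over positions [i-x, i-y], which is empty at the start of the partition. A sketch of that semantics for min() — this is an illustration of the expected behavior, not the Hive implementation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowWindowMin {
    // min over "rows between xPreceding and yPreceding" within one partition;
    // null where the window lies entirely before the partition start.
    static List<Integer> minOverPrecedingWindow(List<Integer> part,
                                                int xPreceding, int yPreceding) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < part.size(); i++) {
            int lo = Math.max(0, i - xPreceding);
            int hi = i - yPreceding;              // inclusive upper bound
            Integer min = null;
            for (int j = lo; j <= hi; j++) {
                if (min == null || part.get(j) < min) min = part.get(j);
            }
            out.add(min);
        }
        return out;
    }

    public static void main(String[] args) {
        // "rows between 1 preceding and 1 preceding": each row sees only its
        // immediate predecessor, so the first row's window is empty.
        System.out.println(minOverPrecedingWindow(Arrays.asList(5, 3, 4), 1, 1));
        // [null, 5, 3]
    }
}
```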





[jira] [Commented] (HIVE-10834) Support First_value()/last_value() over x preceding and y preceding windowing

2015-06-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572246#comment-14572246
 ] 

Lefty Leverenz commented on HIVE-10834:
---

Doc note:  This needs to be documented in the wiki for the 1.3.0 release.

* [Windowing and Analytics -- WINDOW clause | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-WINDOWclause]

 Support First_value()/last_value() over x preceding and y preceding windowing
 -

 Key: HIVE-10834
 URL: https://issues.apache.org/jira/browse/HIVE-10834
 Project: Hive
  Issue Type: Sub-task
  Components: PTF-Windowing
Reporter: Aihua Xu
Assignee: Aihua Xu
  Labels: TODOC1.3
 Fix For: 1.3.0

 Attachments: HIVE-10834.patch


 Currently the following query
 {noformat}
 select ts, f, first_value(f) over (partition by ts order by t rows between 2 
 preceding and 1 preceding) from over10k limit 100;
 {noformat}
 throws exception:
 {noformat}
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing row (tag=0) 
 {key:{reducesinkkey0:2013-03-01 
 09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
 at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row (tag=0) {key:{reducesinkkey0:2013-03-01 
 09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
 ... 3 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
 cannot generate all output rows for a Partition
 at 
 org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:519)
 at 
 org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
 at 
 org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
 {noformat}





[jira] [Updated] (HIVE-10834) Support First_value()/last_value() over x preceding and y preceding windowing

2015-06-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10834:
--
Labels: TODOC1.3  (was: )

 Support First_value()/last_value() over x preceding and y preceding windowing
 -

 Key: HIVE-10834
 URL: https://issues.apache.org/jira/browse/HIVE-10834
 Project: Hive
  Issue Type: Sub-task
  Components: PTF-Windowing
Reporter: Aihua Xu
Assignee: Aihua Xu
  Labels: TODOC1.3
 Fix For: 1.3.0

 Attachments: HIVE-10834.patch


 Currently the following query
 {noformat}
 select ts, f, first_value(f) over (partition by ts order by t rows between 2 
 preceding and 1 preceding) from over10k limit 100;
 {noformat}
 throws exception:
 {noformat}
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing row (tag=0) 
 {key:{reducesinkkey0:2013-03-01 
 09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
 at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row (tag=0) {key:{reducesinkkey0:2013-03-01 
 09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
 ... 3 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: 
 cannot generate all output rows for a Partition
 at 
 org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:519)
 at 
 org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
 at 
 org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
 {noformat}





[jira] [Commented] (HIVE-10885) with vectorization enabled join operation involving interval_day_time fails

2015-06-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572292#comment-14572292
 ] 

Lefty Leverenz commented on HIVE-10885:
---

Note:  The commits have the wrong Jira number -- they say HIVE-10855 instead of 
HIVE-10885.

* Commit to master:  09100831adff7589ee48e735a4beac6ebb25cb3e
* Commit to branch-1.2:  f3ab5fda6af57afff31c29ad048d906fd095d5fb

 with vectorization enabled join operation involving interval_day_time fails
 ---

 Key: HIVE-10885
 URL: https://issues.apache.org/jira/browse/HIVE-10885
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Jagruti Varia
Assignee: Matt McCline
 Fix For: 1.2.1

 Attachments: HIVE-10885.01.patch, HIVE-10885.02.patch, 
 HIVE-10885.03.patch


 When vectorization is on, join operation involving interval_day_time type 
 throws following error:
 {noformat}
 Status: Failed
 Vertex failed, vertexName=Map 2, vertexId=vertex_1432858236614_0247_1_01, 
 diagnostics=[Task failed, taskId=task_1432858236614_0247_1_01_00, 
 diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator 
 initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.RuntimeException: Map operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
   ... 14 more
 Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for 
 interval_day_time
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
   ... 15 more
 ], TaskAttempt 1 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator 
 initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
   at 

[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

2015-06-04 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572310#comment-14572310
 ] 

Laljo John Pullokkaran commented on HIVE-10841:
---

Never mind, that's the wrong query. I think I can report the filter not getting 
into the mapper with the original query.

 [WHERE col is not null] does not work sometimes for queries with many JOIN 
 statements
 -

 Key: HIVE-10841
 URL: https://issues.apache.org/jira/browse/HIVE-10841
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor
Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
 Attachments: HIVE-10841.patch


 The result from the following SELECT query is 3 rows but it should be 1 row.
 I checked it in MySQL - it returned 1 row.
 To reproduce the issue in Hive
 1. prepare tables
 {code}
 drop table if exists L;
 drop table if exists LA;
 drop table if exists FR;
 drop table if exists A;
 drop table if exists PI;
 drop table if exists acct;
 create table L as select 4436 id;
 create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
 create table FR as select 4436 loan_id;
 create table A as select 4748 id;
 create table PI as select 4415 id;
 create table acct as select 4748 aid, 10 acc_n, 122 brn;
 insert into table acct values(4748, null, null);
 insert into table acct values(4748, null, null);
 {code}
 2. run SELECT query
 {code}
 select
   acct.ACC_N,
   acct.brn
 FROM L
 JOIN LA ON L.id = LA.loan_id
 JOIN FR ON L.id = FR.loan_id
 JOIN A ON LA.aid = A.id
 JOIN PI ON PI.id = LA.pi_id
 JOIN acct ON A.id = acct.aid
 WHERE
   L.id = 4436
   and acct.brn is not null;
 {code}
 the result is 3 rows
 {code}
 10122
 NULL  NULL
 NULL  NULL
 {code}
 but it should be 1 row
 {code}
 10122
 {code}
 2.1 explain select ... output for hive-1.3.0 MR
 {code}
 STAGE DEPENDENCIES:
   Stage-12 is a root stage
   Stage-9 depends on stages: Stage-12
   Stage-0 depends on stages: Stage-9
 STAGE PLANS:
   Stage: Stage-12
 Map Reduce Local Work
   Alias - Map Local Tables:
 a 
   Fetch Operator
 limit: -1
 acct 
   Fetch Operator
 limit: -1
 fr 
   Fetch Operator
 limit: -1
 l 
   Fetch Operator
 limit: -1
 pi 
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 a 
   TableScan
 alias: a
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: id is not null (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 _col5 (type: int)
   1 id (type: int)
   2 aid (type: int)
 acct 
   TableScan
 alias: acct
 Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE 
 Column stats: NONE
 Filter Operator
   predicate: aid is not null (type: boolean)
   Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 _col5 (type: int)
   1 id (type: int)
   2 aid (type: int)
 fr 
   TableScan
 alias: fr
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: (loan_id = 4436) (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 4436 (type: int)
   1 4436 (type: int)
   2 4436 (type: int)
 l 
   TableScan
 alias: l
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: (id = 4436) (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 4436 (type: int)
   1 4436 (type: int)
   2 4436 (type: int)
 pi 
   TableScan
 alias: pi
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: id is not null (type: boolean)
   Statistics: Num rows: 1 Data 

[jira] [Commented] (HIVE-9664) Hive add jar command should be able to download and add jars from a repository

2015-06-04 Thread Anant Nag (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572528#comment-14572528
 ] 

Anant Nag commented on HIVE-9664:
-

Yes, this can be done. I'll update the wiki to make it clearer. 

 Hive add jar command should be able to download and add jars from a 
 repository
 

 Key: HIVE-9664
 URL: https://issues.apache.org/jira/browse/HIVE-9664
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Anant Nag
Assignee: Anant Nag
  Labels: TODOC1.2, hive, patch
 Fix For: 1.2.0

 Attachments: HIVE-9664.4.patch, HIVE-9664.5.patch, HIVE-9664.patch, 
 HIVE-9664.patch, HIVE-9664.patch


 Currently Hive's add jar command takes a local path to the dependency jar. 
 This clutters the local file-system, as users may forget to remove the jar 
 later.
 It would be nice if Hive supported a Gradle like notation to download the jar 
 from a repository.
 Example:  add jar org:module:version
 
 It should also be backward compatible and accept a jar from the local 
 file-system as well. 
 RB:  https://reviews.apache.org/r/31628/
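
The Gradle-style coordinate in the description maps onto the conventional Maven repository layout in the obvious way. A sketch of that mapping — the helper name is hypothetical, and the actual resolution logic lives in the patch, not here:

```java
public class CoordinateToPath {
    // Maps "org:module:version" to the conventional Maven repo layout:
    // org/with/slashes/module/version/module-version.jar
    static String toRepoPath(String coordinate) {
        String[] parts = coordinate.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "expected org:module:version, got " + coordinate);
        }
        String org = parts[0].replace('.', '/');
        String module = parts[1];
        String version = parts[2];
        return org + "/" + module + "/" + version + "/"
             + module + "-" + version + ".jar";
    }

    public static void main(String[] args) {
        System.out.println(toRepoPath("org.apache.commons:commons-lang3:3.4"));
        // org/apache/commons/commons-lang3/3.4/commons-lang3-3.4.jar
    }
}
```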





[jira] [Commented] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak

2015-06-04 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573437#comment-14573437
 ] 

Alan Gates commented on HIVE-10925:
---

Changes making the transaction handler thread local static look good.

 Non-static threadlocals in metastore code can potentially cause memory leak
 ---

 Key: HIVE-10925
 URL: https://issues.apache.org/jira/browse/HIVE-10925
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Attachments: HIVE-10925.1.patch


 There are many places where non-static threadlocals are used. I can't find a 
 good reason for using them, and they can leak objects if, for example, they 
 are created each time a long-running thread handles a new session.





[jira] [Updated] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS

2015-06-04 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10910:
--
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-8065

 Alter table drop partition queries in encrypted zone failing to remove data 
 from HDFS
 -

 Key: HIVE-10910
 URL: https://issues.apache.org/jira/browse/HIVE-10910
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 1.2.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Eugene Koifman

 The alter table query that drops a partition removes the partition's metadata 
 but fails to remove the data from HDFS:
 hive> create table table_1(name string, age int, gpa double) partitioned by 
 (b string) stored as textfile;
 OK
 Time taken: 0.732 seconds
 hive> alter table table_1 add partition (b='2010-10-10');
 OK
 Time taken: 0.496 seconds
 hive> show partitions table_1;
 OK
 b=2010-10-10
 Time taken: 0.781 seconds, Fetched: 1 row(s)
 hive> alter table table_1 drop partition (b='2010-10-10');
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException 
 Failed to move to trash: 
 hdfs://ip-address:8020/warehouse-dir/table_1/b=2010-10-10
 hive> show partitions table_1;
 OK
 Time taken: 0.622 seconds




