[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns

2015-04-15 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496203#comment-14496203
 ] 

Aihua Xu commented on HIVE-9580:


Attached a new patch to fix the testCliDriver_mapjoin_decimal unit test failure. 
The other failures seem unrelated.

 Server returns incorrect result from JOIN ON VARCHAR columns
 

 Key: HIVE-9580
 URL: https://issues.apache.org/jira/browse/HIVE-9580
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Mike
Assignee: Aihua Xu
 Attachments: HIVE-9580.patch


 The database erroneously returns rows when joining two tables which each 
 contain a VARCHAR column and the join's ON condition uses the equality 
 operator on the VARCHAR columns.
 The following JDBC method exhibits the problem:
   static void joinIssue() 
   throws SQLException {
   
   String sql;
   int rowsAffected;
   ResultSet rs;
   Statement stmt = con.createStatement();
   String table1_Name = "blahtab1";
   String table1A_Name = "blahtab1A";
   String table1B_Name = "blahtab1B";
   String table2_Name = "blahtab2";
   
   try {
   sql = "drop table " + table1_Name;
   System.out.println("\nsql=" + sql);
   rowsAffected = stmt.executeUpdate(sql);
   }
   catch (SQLException se) {
   System.out.println("Drop table error: " + se.getMessage());
   }
   try {
   sql = "CREATE TABLE " + table1_Name + " (" +
   "VCHARCOL VARCHAR(10) " +
   ",INTEGERCOL INT " +
   ")";
   System.out.println("\nsql=" + sql);
   rowsAffected = stmt.executeUpdate(sql);
   }
   catch (SQLException se) {
   System.out.println("create table error: " + se.getMessage());
   }
   
   sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
   System.out.println("\nsql=" + sql);
   stmt.executeUpdate(sql);
   
   System.out.println("===");
   
   try {
   sql = "drop table " + table1A_Name;
   System.out.println("\nsql=" + sql);
   rowsAffected = stmt.executeUpdate(sql);
   }
   catch (SQLException se) {
   System.out.println("Drop table error: " + se.getMessage());
   }
   try {
   sql = "CREATE TABLE " + table1A_Name + " (" +
   "VCHARCOL VARCHAR(10) " +
   ")";
   System.out.println("\nsql=" + sql);
   rowsAffected = stmt.executeUpdate(sql);
   }
   catch (SQLException se) {
   System.out.println("create table error: " + se.getMessage());
   }
   
   sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
   System.out.println("\nsql=" + sql);
   stmt.executeUpdate(sql);
   
   System.out.println("===");
   
   try {
   sql = "drop table " + table1B_Name;
   System.out.println("\nsql=" + sql);
   rowsAffected = stmt.executeUpdate(sql);
   }
   catch (SQLException se) {
   System.out.println("Drop table error: " + se.getMessage());
   }
   try {
   sql = "CREATE TABLE " + table1B_Name + " (" +
   "VCHARCOL VARCHAR(11) " +
   ",INTEGERCOL INT " +
   ")";
   System.out.println("\nsql=" + sql);
   rowsAffected = stmt.executeUpdate(sql);
   }
   catch (SQLException se) {
   System.out.println("create table error: " + se.getMessage());
   }
   
   sql = "insert into " + table1B_Name + " values ('jklmnopqrs', 99)";
 

[jira] [Commented] (HIVE-9252) Linking custom SerDe jar to table definition.

2015-04-15 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496092#comment-14496092
 ] 

Niels Basjes commented on HIVE-9252:


After the initial patch I no longer see anything happening. What is the status?

 Linking custom SerDe jar to table definition.
 -

 Key: HIVE-9252
 URL: https://issues.apache.org/jira/browse/HIVE-9252
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Niels Basjes
Assignee: Ferdinand Xu
 Attachments: HIVE-9252.1.patch


 In HIVE-6047 the option was created that a jar file can be hooked to the 
 definition of a function. (See: [Language Manual DDL: Permanent 
 Functions|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PermanentFunctions]
  )
 I propose to add something similar that can be used when defining an external 
 table that relies on a custom SerDe (I expect usually only the Deserializer 
 will be needed).
 Something like this:
 {code}
 CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
 ...
 STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] 
 [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
 {code}
 Using this, you can define (and share!) a Hive table on top of a custom 
 file format without needing the IT operations people to deploy a custom 
 SerDe jar file on all nodes.
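 For illustration, a table definition under the proposed grammar might look 
 like this (the table, handler class, and jar path are made-up examples of the 
 not-yet-implemented syntax):
 {code}
 CREATE EXTERNAL TABLE weblogs (line STRING)
 STORED BY 'com.example.hive.WeblogStorageHandler'
 USING JAR 'hdfs:///libs/weblog-serde.jar';
 {code}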



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496097#comment-14496097
 ] 

Hive QA commented on HIVE-10306:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725477/HIVE-10306.4.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8694 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
TestOperationLoggingAPIBase - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3441/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3441/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3441/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725477 - PreCommit-HIVE-TRUNK-Build

 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 -

 Key: HIVE-10306
 URL: https://issues.apache.org/jira/browse/HIVE-10306
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, 
 HIVE-10306.3.patch, HIVE-10306.4.patch


 We need to print the Tez summary when hive.server2.logging.level = 
 PERFORMANCE. We introduced this parameter via HIVE-10119.
 The logging-level param is only relevant to HS2, so for hive-cli users 
 hive.tez.exec.print.summary still makes sense. We can check the log-level 
 param as well in the places where we check the value of 
 hive.tez.exec.print.summary, i.e., treat hive.tez.exec.print.summary=true if 
 log.level = PERFORMANCE.
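 As a sketch, the intended behavior could be exercised like this (property 
 names are taken from the description above; the exact names in the final 
 patch may differ):
 {code}
 -- Beeline/HS2 session: request performance-level logging;
 -- the Tez summary should then print without the explicit flag.
 set hive.server2.logging.level=PERFORMANCE;

 -- hive-cli: the existing flag still controls the summary.
 set hive.tez.exec.print.summary=true;
 {code}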



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-9917:
---
Attachment: HIVE-9917.patch

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting 
 int to timestamp. Since customers have relied on the incorrect behavior for 
 so long, it is better to make the conversion configurable: in one release it 
 will default to the old/inconsistent way, and the next release will default 
 to the new/consistent way. After that we can deprecate the old behavior.
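 As a hypothetical illustration of such a switch (the property name here is 
 illustrative, not necessarily what the patch introduces):
 {code}
 -- Release N: defaults to the old, inconsistent conversion
 set hive.int.timestamp.conversion.in.seconds=false;

 -- Release N+1: defaults to the consistent, seconds-based conversion
 set hive.int.timestamp.conversion.in.seconds=true;
 select cast(1240000000 as timestamp);  -- value interpreted as seconds since epoch
 {code}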



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10036) Writing ORC format big table causes OOM - too many fixed sized stream buffers

2015-04-15 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-10036:

Labels: orcfile  (was: )

 Writing ORC format big table causes OOM - too many fixed sized stream buffers
 -

 Key: HIVE-10036
 URL: https://issues.apache.org/jira/browse/HIVE-10036
 Project: Hive
  Issue Type: Improvement
Reporter: Selina Zhang
Assignee: Selina Zhang
  Labels: orcfile
 Attachments: HIVE-10036.1.patch, HIVE-10036.2.patch, 
 HIVE-10036.3.patch, HIVE-10036.5.patch, HIVE-10036.6.patch


 The ORC writer keeps multiple output streams for each column. Each output 
 stream is allocated a fixed-size ByteBuffer (configurable, default 256K). For 
 a big table, the memory cost is unbearable, especially when HCatalog dynamic 
 partitioning is involved: several hundred files may be open and writing at 
 the same time (the same problem affects FileSinkOperator). 
 The global ORC memory manager controls the buffer size, but it only kicks in 
 at 5000-row intervals. An enhancement could be made there, but the problem is 
 that reducing the buffer size causes worse compression and more IO in the 
 read path. Sacrificing read performance is never a good choice. 
 I changed the fixed-size ByteBuffer to a dynamically growing buffer bounded 
 by the existing configurable buffer size. Most streams do not need a large 
 buffer, so performance improved significantly. Compared to Facebook's 
 hive-dwrf, I measured a 2x performance gain with this fix. 
 Solving OOM for ORC completely may take a lot of effort, but this is 
 definitely low-hanging fruit. 
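 A minimal sketch of the idea (an illustrative class, not Hive's actual ORC
 writer code): start each stream's buffer small and let it double on demand,
 capped at the configured buffer size, so idle streams stop pinning 256K each:

```java
import java.nio.ByteBuffer;

// Illustrative sketch: a per-stream buffer that grows geometrically on
// demand, bounded by the configured ORC buffer size.
class GrowableStreamBuffer {
    private final int maxCapacity;  // e.g. the configured 256K ORC buffer size
    private ByteBuffer buffer;

    GrowableStreamBuffer(int initialCapacity, int maxCapacity) {
        this.maxCapacity = maxCapacity;
        this.buffer = ByteBuffer.allocate(initialCapacity);
    }

    void write(byte[] data) {
        if (buffer.remaining() < data.length) {
            int needed = buffer.position() + data.length;
            if (needed > maxCapacity) {
                // In the real writer this is where the stream would spill/flush.
                throw new IllegalStateException("buffer full; flush required");
            }
            // Double the capacity (at least enough for the write), capped.
            int newCap = Math.min(Math.max(buffer.capacity() * 2, needed), maxCapacity);
            ByteBuffer bigger = ByteBuffer.allocate(newCap);
            buffer.flip();
            bigger.put(buffer);
            buffer = bigger;
        }
        buffer.put(data);
    }

    int capacity() { return buffer.capacity(); }
    int size() { return buffer.position(); }
}
```

 Streams that only ever hold a few bytes stay at the initial capacity; only
 the hot streams grow toward the configured bound.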



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496153#comment-14496153
 ] 

Hive QA commented on HIVE-10331:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725481/HIVE-10331.02.patch

{color:red}ERROR:{color} -1 due to 50 failed/errored test(s), 8688 tests 
executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.ql.io.orc.TestColumnStatistics.testHasNull
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter2
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.test1[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.test1[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testReadFormat_0_11[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testReadFormat_0_11[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testStringAndBinaryStatistics[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testStringAndBinaryStatistics[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcNullOptimization.testMultiStripeWithoutNull
org.apache.hadoop.hive.ql.io.orc.TestOrcSerDeStats.testOrcSerDeStatsComplex
org.apache.hadoop.hive.ql.io.orc.TestOrcSerDeStats.testOrcSerDeStatsComplexOldFormat
org.apache.hadoop.hive.ql.io.orc.TestOrcSerDeStats.testSerdeStatsOldFormat
org.apache.hadoop.hive.ql.io.orc.TestOrcSerDeStats.testStringAndBinaryStatistics
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testBetween
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDateWritableEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDateWritableInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDecimalEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDecimalInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDoubleEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDoubleInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testEquals
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testIn
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testIntEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testIntInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testIsNull
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testLessThan
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testLessThanEquals
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testNullsInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testStringEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testStringInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testTimestampEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testTimestampInBloomFilter

[jira] [Updated] (HIVE-10342) Nested parenthesis for derived table in from clause - is not working

2015-04-15 Thread sanjiv singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sanjiv singh updated HIVE-10342:

Description: 
Hi All,

Nested parentheses around a derived table in the FROM clause are not working.

The following query with a derived table works perfectly in Hive:

select count ( * ) 
from (select distinct em_last_name, em_first_name, em_d_date
   from employee
   UNION ALL
  select distinct cu_last_name, cu_first_name, cu_d_date
   from customer
   UNION ALL
  select distinct cl_last_name, cl_first_name, cl_d_date
   from client
) cool_cust;

When I added additional parentheses enclosing each derived table, it failed in 
parsing. It seems the Hive ANTLR grammar does not accept such syntax. 

Failed Query :
###

select count ( * ) 
from ((select distinct em_last_name, em_first_name, em_d_date
   from employee)
   UNION ALL
  (select distinct cu_last_name, cu_first_name, cu_d_date
   from customer)
   UNION ALL
  (select distinct cl_last_name, cl_first_name, cl_d_date
   from client)
) cool_cust;

Exception  :
##

NoViableAltException(283@[147:5: ( ( Identifier LPAREN )=> 
partitionedTableFunction | tableSource | subQuerySource | virtualTableSource )])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:144)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:3625)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1814)
at 
org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1471)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:42804)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.singleSelectStatement(HiveParser.java:40229)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:39914)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39851)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38904)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38780)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1514)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1052)
at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:389)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:303)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1067)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1129)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:201)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:153)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:364)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:712)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:631)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:570)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:41 cannot recognize input near '(' '(' 'SELECT' 
in from source
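
A common workaround until the grammar accepts the extra parentheses is to wrap 
each parenthesized branch in an explicitly aliased subquery (untested sketch 
against the tables above):

{code}
select count(*)
from (select * from (select distinct em_last_name, em_first_name, em_d_date
                     from employee) e
      UNION ALL
      select * from (select distinct cu_last_name, cu_first_name, cu_d_date
                     from customer) c
      UNION ALL
      select * from (select distinct cl_last_name, cl_first_name, cl_d_date
                     from client) cl
) cool_cust;
{code}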



[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns

2015-04-15 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496208#comment-14496208
 ] 

Aihua Xu commented on HIVE-9580:


[~szehon] Can you help review the code change?


[jira] [Commented] (HIVE-10288) Cannot call permanent UDFs

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496252#comment-14496252
 ] 

Hive QA commented on HIVE-10288:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725492/HIVE-10288.1.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8689 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3443/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3443/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3443/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725492 - PreCommit-HIVE-TRUNK-Build

 Cannot call permanent UDFs
 --

 Key: HIVE-10288
 URL: https://issues.apache.org/jira/browse/HIVE-10288
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Nezih Yigitbasi
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10288.1.patch, HIVE-10288.patch


 Just pulled the trunk and built the Hive binary. If I create a permanent UDF, 
 exit the CLI, then reopen the CLI and call the UDF, it fails with the 
 exception below. However, the call succeeds if I call the UDF right after 
 registering it (without exiting the CLI). The call also succeeds with the 
 apache-hive-1.0.0 release.
 {code}
 2015-04-13 17:04:54,004 INFO  org.apache.hadoop.hive.ql.log.PerfLogger 
 (PerfLogger.java:PerfLogEnd(148)) - /PERFLOG method=parse 
 start=1428969893115 end=1428969894004 duration=889 
 from=org.apache.hadoop.hive.ql.Driver
 2015-04-13 17:04:54,007 DEBUG org.apache.hadoop.hive.ql.Driver 
 (Driver.java:recordValidTxns(939)) - Encoding valid txns info 
 9223372036854775807:
 2015-04-13 17:04:54,007 INFO  org.apache.hadoop.hive.ql.log.PerfLogger 
 (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=semanticAnalyze 
 from=org.apache.hadoop.hive.ql.Driver
 2015-04-13 17:04:54,052 INFO  org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:analyzeInternal(9997)) - Starting Semantic Analysis
 2015-04-13 17:04:54,053 DEBUG org.apache.hadoop.hive.ql.exec.FunctionRegistry 
 (FunctionRegistry.java:getGenericUDAFResolver(942)) - Looking up GenericUDAF: 
 hour_now
 2015-04-13 17:04:54,053 INFO  org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:genResolvedParseTree(9980)) - Completed phase 1 of 
 Semantic Analysis
 2015-04-13 17:04:54,053 INFO  org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:getMetaData(1530)) - Get metadata for source tables
 2015-04-13 17:04:54,054 INFO  

[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-15 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496376#comment-14496376
 ] 

Sushanth Sowmyan commented on HIVE-10228:
-

RB link : https://reviews.apache.org/r/33224/

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the absence of a table, or a table being a 
 view/offline table/non-native table, is not considered an error; instead, it 
 will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to handle all 
 the possible permutations of export dumps. Also, import will have to ensure 
 that it updates the object only if the update being imported is not older 
 than the state of the object. Also, import currently does not work with the 
 dbname.tablename kind of specification; this should be fixed.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the drop should not actually 
 drop the table if the table is newer than the specified event id.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.
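 Putting the proposed clauses together, example statements might look like 
 this (paths, table names, comments, and event ids are made up):
 {code}
 EXPORT TABLE sales TO '/repl/staging/sales' FOR METADATA REPLICATION('alter diff');
 IMPORT TABLE sales FROM '/repl/staging/sales';
 DROP TABLE sales FOR REPLICATION('451');
 ALTER TABLE sales DROP PARTITION (dt='2015-04-15') FOR REPLICATION('451');
 {code}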



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496425#comment-14496425
 ] 

Hive QA commented on HIVE-10324:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725504/HIVE-10324.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8688 tests 
executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.metastore.TestHiveMetaTool.testUpdateFSRootLocation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3444/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3444/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3444/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725504 - PreCommit-HIVE-TRUNK-Build

 Hive metatool should take table_param_key to allow for changes to avro 
 serde's schema url key
 -

 Key: HIVE-10324
 URL: https://issues.apache.org/jira/browse/HIVE-10324
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.1.0
Reporter: Szehon Ho
Assignee: Ferdinand Xu
 Attachments: HIVE-10324.patch, HIVE-10324.patch.WIP


 HIVE-3443 added support to change the serdeParams via the 'metatool 
 updateLocation' command.
 However, in avro it is possible to specify the schema via the tableParams:
 {noformat}
 CREATE  TABLE `testavro`(
   `test` string COMMENT 'from deserializer')
 ROW FORMAT SERDE 
   'org.apache.hadoop.hive.serde2.avro.AvroSerDe' 
 STORED AS INPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' 
 OUTPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
 TBLPROPERTIES (
   'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 
   'kite.compression.type'='snappy', 
   'transient_lastDdlTime'='1427996456')
 {noformat}
 Hence, for those tables 'metatool updateLocation' will not help.
 This is necessary in cases like upgrading the NameNode to HA, where the 
 absolute paths have changed.
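As a hedged sketch of what a table_param_key-aware metatool update could do, the following rewrites the authority of a URI stored in a table parameter; the helper name and the `nameservice1` authority are assumptions for illustration:

```python
from urllib.parse import urlparse, urlunparse

def update_param_location(tbl_params, param_key, old_authority, new_authority):
    """Rewrite the host:port authority of a URI held in a table parameter,
    e.g. avro.schema.url after a NameNode HA migration."""
    url = tbl_params.get(param_key)
    if url is None:
        return tbl_params
    parts = urlparse(url)
    if parts.netloc == old_authority:
        tbl_params[param_key] = urlunparse(parts._replace(netloc=new_authority))
    return tbl_params

params = {"avro.schema.url": "hdfs://namenode:8020/tmp/test.avsc"}
update_param_location(params, "avro.schema.url", "namenode:8020", "nameservice1")
# params["avro.schema.url"] is now "hdfs://nameservice1/tmp/test.avsc"
```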





[jira] [Updated] (HIVE-10307) Support to use number literals in partition column

2015-04-15 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-10307:
---
Attachment: HIVE-10307.1.patch

Fixed for failed tests.

 Support to use number literals in partition column
 --

 Key: HIVE-10307
 URL: https://issues.apache.org/jira/browse/HIVE-10307
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 1.0.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
 Attachments: HIVE-10307.1.patch, HIVE-10307.patch


 Data types like TinyInt, SmallInt, BigInt or Decimal can be expressed as 
 literals with a postfix like Y, S, L, or BD appended to the number. These 
 literals work in most Hive queries, but not when they are used as a 
 partition column value. For a partitioned table like:
 create table partcoltypenum (key int, value string) partitioned by (tint 
 tinyint, sint smallint, bint bigint);
 insert into partcoltypenum partition (tint=100Y, sint=1S, 
 bint=1000L) select key, value from src limit 30;
 Queries like select, describe, and drop partition do not work. For example,
 select * from partcoltypenum where tint=100Y and sint=1S and 
 bint=1000L;
 does not return any rows.
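One way to picture the needed normalization is to strip the type suffix before comparing partition values; this helper is a hypothetical sketch, not Hive's parser logic:

```python
def strip_numeric_suffix(literal):
    """Normalize a suffixed Hive number literal (100Y, 1S, 1000L, 0.1BD)
    to its bare numeric text, as a partition-value comparison would need.
    Check the two-letter BD suffix before the one-letter ones."""
    for suffix in ("BD", "Y", "S", "L"):
        if literal.upper().endswith(suffix):
            return literal[: -len(suffix)]
    return literal

assert strip_numeric_suffix("100Y") == "100"
assert strip_numeric_suffix("1000L") == "1000"
assert strip_numeric_suffix("0.1BD") == "0.1"
assert strip_numeric_suffix("42") == "42"   # unsuffixed literals pass through
```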





[jira] [Commented] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-15 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496403#comment-14496403
 ] 

Sushanth Sowmyan commented on HIVE-5672:


Adding additional doc note here : 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Writingdataintothefilesystemfromqueries
 needs to be updated to note that delimiters are not currently supported for 
non-LOCAL writes, and once this patch goes in, we should note which version 
fixed that in that doc.

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch


 https://issues.apache.org/jira/browse/HIVE-3682 is great, but non-local 
 directories don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}





[jira] [Updated] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-15 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10228:

Description: 
We need to update a couple of hive commands to support replication semantics. 
To wit, we need the following:

EXPORT ... [FOR [METADATA] REPLICATION("comment")]

Export will now support an extra optional clause to tell it that this export is 
being prepared for the purpose of replication. There is also an additional 
optional clause here, that allows for the export to be a metadata-only export, 
to handle cases of capturing the diff for alter statements, for example.

Also, if done for replication, the non-presence of a table, or a table being a 
view/offline table/non-native table is not considered an error, and instead, 
will result in a successful no-op.

IMPORT ... (as normal) – but handles new semantics 

No syntax changes for import, but import will have to change to be able to 
handle all the permutations of export dumps possible. Also, import will have to 
ensure that it should update the object only if the update being imported is 
not older than the state of the object. Also, import currently does not work 
with the dbname.tablename style of specification; this should be fixed to work.

DROP TABLE ... FOR REPLICATION('eventid')

Drop Table now has an additional clause, to specify that this drop table is 
being done for replication purposes, and that the drop should not actually drop 
the table if the table is newer than the event id specified.

ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')

Similarly, Drop Partition also has an equivalent change to Drop Table.

=

In addition, we introduce a new property repl.last.id, which when tagged on 
to table properties or partition properties on a replication-destination, holds 
the effective state identifier of the object.

  was:
We need to update a couple of hive commands to support replication semantics. 
To wit, we need the following:

EXPORT ... [FOR [METADATA] REPLICATION("comment")]

Export will now support an extra optional clause to tell it that this export is 
being prepared for the purpose of replication. There is also an additional 
optional clause here, that allows for the export to be a metadata-only export, 
to handle cases of capturing the diff for alter statements, for example.

Also, if done for replication, the non-presence of a table, or a table being a 
view/offline table/non-native table is not considered an error, and instead, 
will result in a successful no-op.

IMPORT ... (as normal) – but handles new semantics 

No syntax changes for import, but import will have to change to be able to 
handle all the permutations of export dumps possible. Also, import will have to 
ensure that it should update the object only if the update being imported is 
not older than the state of the object.

DROP TABLE ... FOR REPLICATION('eventid')

Drop Table now has an additional clause, to specify that this drop table is 
being done for replication purposes, and that the drop should not actually drop 
the table if the table is newer than the event id specified.

ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')

Similarly, Drop Partition also has an equivalent change to Drop Table.

=

In addition, we introduce a new property repl.last.id, which when tagged on 
to table properties or partition properties on a replication-destination, holds 
the effective state identifier of the object.


 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION("comment")]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 

[jira] [Updated] (HIVE-10310) Support GROUPING() in HIVE

2015-04-15 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10310:
--
Summary: Support GROUPING() in HIVE  (was: Support GROUPING() and 
GROUP_ID() in HIVE)

 Support GROUPING() in HIVE
 --

 Key: HIVE-10310
 URL: https://issues.apache.org/jira/browse/HIVE-10310
 Project: Hive
  Issue Type: New Feature
  Components: Parser, SQL
Reporter: sanjiv singh
Priority: Minor

 I have lots of queries using the GROUPING() function that fail on Hive, just 
 because GROUPING() is not supported in Hive. See the query below:
 SELECT fact_1_id,
fact_2_id,
GROUPING(fact_1_id) AS f1g, 
GROUPING(fact_2_id) AS f2g
 FROM   dimension_tab
 GROUP BY CUBE (fact_1_id, fact_2_id)
 ORDER BY fact_1_id, fact_2_id;
 To run all such queries in Hive, they need to be transformed to Hive 
 syntax. See the transformed query below, which is Hive-compatible; the 
 equivalent has been derived using a CASE statement.
 SELECT fact_1_id,
fact_2_id,
        (case when (GROUPING__ID & 1) = 0 then 1 else 0 end) as f1g,
        (case when (GROUPING__ID & 2) = 0 then 1 else 0 end) as f2g
 FROM   dimension_tab
 GROUP BY fact_1_id, fact_2_id WITH CUBE
 ORDER BY fact_1_id, fact_2_id;
 It would be great if GROUPING() were implemented in Hive. I see two ways to 
 do it:
 1) Handle it at the parser level.
 2) Add a GROUPING() aggregate function to Hive (recommended).
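The CASE rewrite above tests bits of GROUPING__ID; a minimal sketch of that bit test, mirroring the ticket's mapping (bit clear yields 1), is:

```python
def grouping_from_id(grouping_id, column_bit):
    """Mirror the CASE rewrite from the report: the emulated GROUPING(col)
    is 1 when the column's bit in GROUPING__ID is 0, else 0. The bit
    semantics here follow the ticket's workaround, not a general guarantee
    across Hive versions."""
    return 1 if (grouping_id & column_bit) == 0 else 0

# fact_1_id maps to bit 1 and fact_2_id to bit 2, as in the rewritten query.
assert grouping_from_id(0b10, 1) == 1   # bit 1 clear -> 1
assert grouping_from_id(0b01, 1) == 0   # bit 1 set   -> 0
```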





[jira] [Commented] (HIVE-10288) Cannot call permanent UDFs

2015-04-15 Thread Nezih Yigitbasi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496468#comment-14496468
 ] 

Nezih Yigitbasi commented on HIVE-10288:


Thanks [~jdere] and [~chinnalalam] for the quick turnaround. I also verified 
the patch with several tests and it seems to solve this issue.

 Cannot call permanent UDFs
 --

 Key: HIVE-10288
 URL: https://issues.apache.org/jira/browse/HIVE-10288
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Nezih Yigitbasi
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10288.1.patch, HIVE-10288.patch


 Just pulled the trunk and built the hive binary. If I create a permanent udf 
 and exit the cli, and then open the cli and try calling the udf it fails with 
 the exception below. However, the call succeeds if I call the udf right after 
 registering the permanent udf (without exiting the cli). The call also 
 succeeds with the apache-hive-1.0.0 release.
 {code}
 2015-04-13 17:04:54,004 INFO  org.apache.hadoop.hive.ql.log.PerfLogger 
 (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=parse 
 start=1428969893115 end=1428969894004 duration=889 
 from=org.apache.hadoop.hive.ql.Driver
 2015-04-13 17:04:54,007 DEBUG org.apache.hadoop.hive.ql.Driver 
 (Driver.java:recordValidTxns(939)) - Encoding valid txns info 
 9223372036854775807:
 2015-04-13 17:04:54,007 INFO  org.apache.hadoop.hive.ql.log.PerfLogger 
 (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=semanticAnalyze 
 from=org.apache.hadoop.hive.ql.Driver
 2015-04-13 17:04:54,052 INFO  org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:analyzeInternal(9997)) - Starting Semantic Analysis
 2015-04-13 17:04:54,053 DEBUG org.apache.hadoop.hive.ql.exec.FunctionRegistry 
 (FunctionRegistry.java:getGenericUDAFResolver(942)) - Looking up GenericUDAF: 
 hour_now
 2015-04-13 17:04:54,053 INFO  org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:genResolvedParseTree(9980)) - Completed phase 1 of 
 Semantic Analysis
 2015-04-13 17:04:54,053 INFO  org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:getMetaData(1530)) - Get metadata for source tables
 2015-04-13 17:04:54,054 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore 
 (HiveMetaStore.java:logInfo(744)) - 0: get_table : db=default tbl=test_table
 2015-04-13 17:04:54,054 INFO  
 org.apache.hadoop.hive.metastore.HiveMetaStore.audit 
 (HiveMetaStore.java:logAuditEvent(369)) - ugi=nyigitbasi   ip=unknown-ip-addr 
  cmd=get_table : db=default tbl=test_table
 2015-04-13 17:04:54,054 DEBUG org.apache.hadoop.hive.metastore.ObjectStore 
 (ObjectStore.java:debugLog(6776)) - Open transaction: count = 1, isActive = 
 true at:
   
 org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:927)
 2015-04-13 17:04:54,054 DEBUG org.apache.hadoop.hive.metastore.ObjectStore 
 (ObjectStore.java:debugLog(6776)) - Open transaction: count = 2, isActive = 
 true at:
   
 org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:990)
 2015-04-13 17:04:54,104 DEBUG org.apache.hadoop.hive.metastore.ObjectStore 
 (ObjectStore.java:debugLog(6776)) - Commit transaction: count = 1, isActive = 
 true at:
   
 org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:998)
 2015-04-13 17:04:54,232 DEBUG org.apache.hadoop.hive.metastore.ObjectStore 
 (ObjectStore.java:debugLog(6776)) - Commit transaction: count = 0, isActive = 
 true at:
   
 org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:929)
 2015-04-13 17:04:54,242 INFO  org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:getMetaData(1682)) - Get metadata for subqueries
 2015-04-13 17:04:54,247 INFO  org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:getMetaData(1706)) - Get metadata for destination 
 tables
 2015-04-13 17:04:54,256 INFO  org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:genResolvedParseTree(9984)) - Completed getting 
 MetaData in Semantic Analysis
 2015-04-13 17:04:54,259 INFO  
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer 
 (CalcitePlanner.java:canHandleAstForCbo(369)) - Not invoking CBO because the 
 statement has too few joins
 2015-04-13 17:04:54,344 DEBUG 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe 
 (LazySimpleSerDe.java:initialize(135)) - 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: 
 columnNames=[_c0, _c1] columnTypes=[int, int] separator=[[B@6e6d4780] 
 nullstring=\N lastColumnTakesRest=false timestampFormats=null
 2015-04-13 17:04:54,406 DEBUG org.apache.hadoop.hive.ql.parse.CalcitePlanner 
 (SemanticAnalyzer.java:genTablePlan(9458)) - Created Table Plan for 
 test_table TS[0]
 2015-04-13 17:04:54,410 DEBUG 

[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL

2015-04-15 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10239:
---
Attachment: HIVE-10239.0.patch

Re-uploading patch to start jenkins tests.

 Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and 
 PostgreSQL
 

 Key: HIVE-10239
 URL: https://issues.apache.org/jira/browse/HIVE-10239
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, 
 HIVE-10239.0.patch, HIVE-10239.patch


 Need to create DB-implementation specific scripts to use the framework 
 introduced in HIVE-9800 to have any metastore schema changes tested across 
 all supported databases.





[jira] [Updated] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases

2015-04-15 Thread Nezih Yigitbasi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nezih Yigitbasi updated HIVE-10319:
---
Attachment: HIVE-10319.patch

 Hive CLI startup takes a long time with a large number of databases
 ---

 Key: HIVE-10319
 URL: https://issues.apache.org/jira/browse/HIVE-10319
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 1.0.0
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-10319.patch


 The Hive CLI takes a long time to start when there is a large number of 
 databases in the DW. I think the root cause is the way permanent UDFs are 
 loaded from the metastore. When I looked at the logs and the source code I 
 see that at startup Hive first gets all the databases from the metastore and 
 then for each database it makes a metastore call to get the permanent 
 functions for that database [see Hive.java | 
 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185].
  So the number of metastore calls made is on the order of the number of 
 databases. In production we have several hundred databases, so Hive makes 
 several hundred RPC calls during startup, taking 30+ seconds.
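A toy model of the startup pattern described above, with a stand-in `get_functions` RPC, shows why the cost scales with the number of databases:

```python
def load_permanent_udfs(databases, get_functions):
    """One metastore RPC per database: cost grows linearly with the
    number of databases, which is the startup slowdown described above."""
    funcs = {}
    for db in databases:          # N databases -> N round trips
        funcs[db] = get_functions(db)
    return funcs

calls = []
def fake_get_functions(db):
    """Stand-in for the metastore call; records each invocation."""
    calls.append(db)
    return []

load_permanent_udfs([f"db{i}" for i in range(300)], fake_get_functions)
assert len(calls) == 300  # several hundred RPCs at CLI startup
```

A single aggregated metastore call returning all functions at once would make the count constant instead of linear.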





[jira] [Resolved] (HIVE-10341) CBO (Calcite Return Path): TraitSets not correctly propagated in HiveSortExchange causes Assertion error

2015-04-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-10341.
-
Resolution: Fixed

Committed to branch. Thanks, Jesus!

 CBO (Calcite Return Path): TraitSets not correctly propagated in 
 HiveSortExchange causes Assertion error
 

 Key: HIVE-10341
 URL: https://issues.apache.org/jira/browse/HIVE-10341
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: cbo-branch

 Attachments: HIVE-10341.cbo.patch


 When return path is on ({{hive.cbo.returnpath.hiveop=true}}), the TraitSets 
 are not correctly set up by HiveSortExchange. For instance, 
 correlationoptimizer14.q produces the following exception:
 {noformat}
 Unexpected exception java.lang.AssertionError: traits=NONE.[], collation=[0]
  at org.apache.calcite.rel.core.SortExchange.init(SortExchange.java:63)
  at 
 org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveSortExchange.init(HiveSortExchange.java:18)
  at 
 org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveSortExchange.create(HiveSortExchange.java:39)
  at 
 org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveInsertExchange4JoinRule.onMatch(HiveInsertExchange4JoinRule.java:95)
  at 
 org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:326)
  at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:515)
  at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:392)
  at 
 org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:255)
  at 
 org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:125)
  at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:207)
  at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:194)
 ...
 {noformat}





[jira] [Commented] (HIVE-10270) Cannot use Decimal constants less than 0.1BD

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496529#comment-14496529
 ] 

Hive QA commented on HIVE-10270:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725506/HIVE-10270.4.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8690 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams
org.apache.hadoop.hive.serde2.binarysortable.TestBinarySortableFast.testBinarySortableFast
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3445/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3445/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3445/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725506 - PreCommit-HIVE-TRUNK-Build

 Cannot use Decimal constants less than 0.1BD
 

 Key: HIVE-10270
 URL: https://issues.apache.org/jira/browse/HIVE-10270
 Project: Hive
  Issue Type: Bug
  Components: Types
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10270.1.patch, HIVE-10270.2.patch, 
 HIVE-10270.3.patch, HIVE-10270.4.patch


 {noformat}
 hive> select 0.09765625BD;
 FAILED: IllegalArgumentException Decimal scale must be less than or equal to 
 precision
 {noformat}
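One plausible reading of the failing invariant (not necessarily Hive's exact computation) is that a naive precision count drops the leading fractional zeros that the scale still includes:

```python
def naive_precision_scale(text):
    """Derive (precision, scale) from a decimal literal the naive way:
    scale = digits after the point, precision = significant digits.
    For values below 0.1 the leading fractional zeros inflate scale
    past precision, tripping the 'scale <= precision' check."""
    int_part, _, frac_part = text.partition(".")
    scale = len(frac_part)
    significant = (int_part + frac_part).lstrip("0") or "0"
    precision = len(significant)
    return precision, scale

p, s = naive_precision_scale("0.09765625")
assert (p, s) == (7, 8)   # scale > precision -> rejected
p, s = naive_precision_scale("0.5")
assert (p, s) == (1, 1)   # fine
```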





[jira] [Commented] (HIVE-9923) No clear message when from is missing

2015-04-15 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496451#comment-14496451
 ] 

Yongzhi Chen commented on HIVE-9923:


The NullPointerException stack is:
{noformat}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:40882)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40059)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:39929)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1574)
at 
org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1093)
at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:396)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1116)
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
... 27 more

{noformat}
It is from HiveParser.java:
{noformat}
   if ( state.backtracking==0 ) 
{(s!=null?((CommonTree)s.tree):null).getChild(1).replaceChildren(0, 0, 
(i!=null?((CommonTree)i.tree):null));}
{noformat}
When there is no FROM keyword, getChild(1) returns null and the exception is 
thrown.
When inserting with a select statement, a FROM clause should be required, not 
optional. Change the parser to error out before reaching getChild(1).
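The proposed guard can be sketched as follows; the function and its tree representation are hypothetical simplifications of HiveParser's generated code:

```python
def rewrite_insert(select_children, insert_node):
    """Sketch of the guard: the original code calls select.getChild(1)
    .replaceChildren(...) unconditionally; with no FROM clause there is
    no second child, so raise a clear parse error instead of an NPE."""
    if len(select_children) < 2 or select_children[1] is None:
        raise SyntaxError("FROM clause is required in INSERT ... SELECT")
    select_children[1] = insert_node
    return select_children

# Without a FROM subtree the rewrite now fails with a readable message.
try:
    rewrite_insert(["select-exprs"], "from-clause")
except SyntaxError as e:
    assert "FROM" in str(e)
```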


 No clear message when from is missing
 ---

 Key: HIVE-9923
 URL: https://issues.apache.org/jira/browse/HIVE-9923
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Jeff Zhang
Assignee: Yongzhi Chen
 Attachments: HIVE-9923.1.patch


 For the following sql, from is missing but it throw NPE which is not clear 
 for user.
 {code}
 hive> insert overwrite directory '/tmp/hive-3' select sb1.name, sb2.age 
 student_bucketed sb1 join student_bucketed sb2 on sb1.name=sb2.name;
 FAILED: NullPointerException null
 {code}





[jira] [Updated] (HIVE-9923) No clear message when from is missing

2015-04-15 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-9923:
---
Attachment: HIVE-9923.1.patch

 No clear message when from is missing
 ---

 Key: HIVE-9923
 URL: https://issues.apache.org/jira/browse/HIVE-9923
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Jeff Zhang
Assignee: Yongzhi Chen
 Attachments: HIVE-9923.1.patch


 For the following sql, from is missing but it throw NPE which is not clear 
 for user.
 {code}
 hive> insert overwrite directory '/tmp/hive-3' select sb1.name, sb2.age 
 student_bucketed sb1 join student_bucketed sb2 on sb1.name=sb2.name;
 FAILED: NullPointerException null
 {code}





[jira] [Commented] (HIVE-8136) Reduce table locking

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496651#comment-14496651
 ] 

Hive QA commented on HIVE-8136:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725511/HIVE-8136.1.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8689 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3446/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3446/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3446/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725511 - PreCommit-HIVE-TRUNK-Build

 Reduce table locking
 

 Key: HIVE-8136
 URL: https://issues.apache.org/jira/browse/HIVE-8136
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Attachments: HIVE-8136.1.patch, HIVE-8136.patch


 When using ZK for concurrency control, some statements require an exclusive 
 table lock when they are atomic, such as setting a table's location.
 This JIRA is to analyze the scope of statements like ALTER TABLE and see if 
 we can reduce the locking required.





[jira] [Updated] (HIVE-10329) Hadoop reflectionutils has issues

2015-04-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10329:

Summary: Hadoop reflectionutils has issues  (was: LLAP: Hadoop 
reflectionutils has issues)

 Hadoop reflectionutils has issues
 -

 Key: HIVE-10329
 URL: https://issues.apache.org/jira/browse/HIVE-10329
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10329.patch


 1) The constructor cache leaks classes and their attendant static overhead 
 forever.
 2) The class cache inside the conf, used when getting JobConfigurable 
 classes, has an epic lock.
 Both bugs are filed in Hadoop but will hardly ever be fixed at this rate. 
 This version avoids both problems.





[jira] [Commented] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format

2015-04-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496571#comment-14496571
 ] 

Prasanth Jayachandran commented on HIVE-10331:
--

Actually there is more to this issue: you might need to set the hasNull default 
back to false, because setNull() explicitly changes it to true whenever a null 
is encountered for a column, which is correct. The wrong part is not the 
initialization but the condition when hasNull is missing.
Can you change the initialization of hasNull back to the old one, and add an 
else branch with a hasHasNull() check that returns true when the hasNull 
protobuf field is missing?
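The suggested handling can be sketched in a self-contained form. `ColumnStats` below is only a stand-in for ORC's protobuf `ColumnStatistics`, and the method names mimic (but are not) the real generated API:

```java
// Stand-in for ORC's protobuf ColumnStatistics; a null field models a
// missing optional protobuf field (as written by old ORC versions).
class ColumnStats {
    private final Boolean hasNull;

    ColumnStats(Boolean hasNull) { this.hasNull = hasNull; }

    boolean hasHasNull() { return hasNull != null; }
    boolean getHasNull() { return hasNull; }
}

class HasNullSketch {
    // If the hasNull field is present, trust the writer; if it is
    // missing (old ORC format), conservatively assume nulls may exist
    // so IS NULL predicates do not filter out the row group.
    static boolean mayContainNull(ColumnStats stats) {
        return stats.hasHasNull() ? stats.getHasNull() : true;
    }

    public static void main(String[] args) {
        System.out.println(mayContainNull(new ColumnStats(null)));  // old writer
        System.out.println(mayContainNull(new ColumnStats(false))); // new writer
    }
}
```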

 ORC : Is null SARG filters out all row groups written in old ORC format
 ---

 Key: HIVE-10331
 URL: https://issues.apache.org/jira/browse/HIVE-10331
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch


 Queries are returning wrong results as all row groups gets filtered out and 
 no rows get scanned.
 {code}
 SELECT 
   count(*)
 FROM
 store_sales
 WHERE
 ss_addr_sk IS NULL
 {code}
 With hive.optimize.index.filter disabled we get the correct results.
 In pickRowGroups, stats show that hasNull_ is false, while the row group 
 actually has nulls.
 The same query runs fine for newly loaded ORC tables.





[jira] [Commented] (HIVE-10340) Enable ORC test for timezone reading from old format

2015-04-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496632#comment-14496632
 ] 

Sergey Shelukhin commented on HIVE-10340:
-

+1

 Enable ORC test for timezone reading from old format
 

 Key: HIVE-10340
 URL: https://issues.apache.org/jira/browse/HIVE-10340
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Trivial
 Attachments: HIVE-10340.1.patch


 As a part of HIVE-8746 I added a test for reading timezone data from old ORC 
 format that was unintentionally disabled. Re-enable the test.





[jira] [Commented] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.

2015-04-15 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496690#comment-14496690
 ] 

Vaibhav Gumashta commented on HIVE-9710:


+1. 

Thanks for patiently iterating [~hsubramaniyan].

 HiveServer2 should support cookie based authentication, when using HTTP 
 transport.
 --

 Key: HIVE-9710
 URL: https://issues.apache.org/jira/browse/HIVE-9710
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 1.2.0
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, 
 HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch, HIVE-9710.7.patch, 
 HIVE-9710.8.patch


 HiveServer2 should generate cookies and validate the client cookie send to it 
 so that it need not perform User/Password or a Kerberos based authentication 
 on each HTTP request. 





[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns

2015-04-15 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496681#comment-14496681
 ] 

Szehon Ho commented on HIVE-9580:
-

Hi Aihua, looks like this works: you are making all the varchars (and even 
chars) in the join comparison use the maximum length to avoid this issue.

But I'm not too familiar with this code; I think [~jdere] is the varchar 
expert, so forwarding to him to take a look as well.

 Server returns incorrect result from JOIN ON VARCHAR columns
 

 Key: HIVE-9580
 URL: https://issues.apache.org/jira/browse/HIVE-9580
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Mike
Assignee: Aihua Xu
 Attachments: HIVE-9580.patch


 The database erroneously returns rows when joining two tables which each 
 contain a VARCHAR column and the join's ON condition uses the equality 
 operator on the VARCHAR columns.
 The following JDBC method exhibits the problem:
   static void joinIssue() throws SQLException {
       String sql;
       int rowsAffected;
       ResultSet rs;
       Statement stmt = con.createStatement();
       String table1_Name = "blahtab1";
       String table1A_Name = "blahtab1A";
       String table1B_Name = "blahtab1B";
       String table2_Name = "blahtab2";

       try {
           sql = "drop table " + table1_Name;
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("Drop table error:" + se.getMessage());
       }
       try {
           sql = "CREATE TABLE " + table1_Name + "(" +
                 "VCHARCOL VARCHAR(10) " +
                 ",INTEGERCOL INT " +
                 ") ";
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("create table error:" + se.getMessage());
       }

       sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
       System.out.println("\nsql=" + sql);
       stmt.executeUpdate(sql);

       System.out.println("===");

       try {
           sql = "drop table " + table1A_Name;
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("Drop table error:" + se.getMessage());
       }
       try {
           sql = "CREATE TABLE " + table1A_Name + "(" +
                 "VCHARCOL VARCHAR(10) " +
                 ") ";
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("create table error:" + se.getMessage());
       }

       sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
       System.out.println("\nsql=" + sql);
       stmt.executeUpdate(sql);

       System.out.println("===");

       try {
           sql = "drop table " + table1B_Name;
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("Drop table error:" + se.getMessage());
       }
       try {
           sql = "CREATE TABLE " + table1B_Name + "(" +
                 "VCHARCOL VARCHAR(11) " +
                 ",INTEGERCOL INT " +
                 ") ";
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
   

[jira] [Updated] (HIVE-10269) HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor

2015-04-15 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10269:

Fix Version/s: 1.2.0

 HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor
 -

 Key: HIVE-10269
 URL: https://issues.apache.org/jira/browse/HIVE-10269
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.2.0
Reporter: Gabor Liptak
Assignee: Ferdinand Xu
 Fix For: 1.2.0

 Attachments: HIVE-10269.patch


 Compiling trunk fails when building based on instructions in
 https://cwiki.apache.org/confluence/display/Hive/HowToContribute
 $ git status
 On branch trunk
 Your branch is up-to-date with 'origin/trunk'.
 nothing to commit, working directory clean
 $ mvn clean install -DskipTests -Phadoop-1
 ...[ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-metastore: Compilation failure: Compilation failure:
 [ERROR] 
 /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6089,29]
  cannot find symbol
 [ERROR] symbol:   class JvmPauseMonitor
 [ERROR] location: package org.apache.hadoop.util
 [ERROR] 
 /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6090,35]
  cannot find symbol
 [ERROR] symbol:   class JvmPauseMonitor
 [ERROR] location: package org.apache.hadoop.util
 [ERROR] -> [Help 1]
 [ERROR] 
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR] 
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
 [ERROR] 
 [ERROR] After correcting the problems, you can resume the build with the 
 command
 [ERROR]   mvn <goals> -rf :hive-metastore





[jira] [Updated] (HIVE-10269) HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor

2015-04-15 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-10269:

Affects Version/s: 1.2.0

 HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor
 -

 Key: HIVE-10269
 URL: https://issues.apache.org/jira/browse/HIVE-10269
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.2.0
Reporter: Gabor Liptak
Assignee: Ferdinand Xu
 Fix For: 1.2.0

 Attachments: HIVE-10269.patch


 Compiling trunk fails when building based on instructions in
 https://cwiki.apache.org/confluence/display/Hive/HowToContribute
 $ git status
 On branch trunk
 Your branch is up-to-date with 'origin/trunk'.
 nothing to commit, working directory clean
 $ mvn clean install -DskipTests -Phadoop-1
 ...[ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-metastore: Compilation failure: Compilation failure:
 [ERROR] 
 /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6089,29]
  cannot find symbol
 [ERROR] symbol:   class JvmPauseMonitor
 [ERROR] location: package org.apache.hadoop.util
 [ERROR] 
 /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6090,35]
  cannot find symbol
 [ERROR] symbol:   class JvmPauseMonitor
 [ERROR] location: package org.apache.hadoop.util
 [ERROR] -> [Help 1]
 [ERROR] 
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR] 
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
 [ERROR] 
 [ERROR] After correcting the problems, you can resume the build with the 
 command
 [ERROR]   mvn <goals> -rf :hive-metastore





[jira] [Updated] (HIVE-10344) CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function

2015-04-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10344:
---
Summary: CBO (Calcite Return Path): Use newInstance to create 
ExprNodeGenericFuncDesc rather than construction function  (was: Use 
newInstance to create ExprNodeGenericFuncDesc rather than construction function)

 CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc 
 rather than construction function
 --

 Key: HIVE-10344
 URL: https://issues.apache.org/jira/browse/HIVE-10344
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.2.0


 ExprNodeGenericFuncDesc is now created using a constructor, which skips the 
 initialization step genericUDF.initializeAndFoldConstants compared with 
 using the newInstance method. If the initialization step is skipped, some 
 configuration parameters are not included in the serialization, which 
 produces wrong results/errors.
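The constructor-vs-factory distinction described above can be illustrated generically; this is not Hive's actual class, just the pattern:

```java
// Generic sketch: a factory method guarantees the initialization step
// that a bare constructor lets callers skip.
class FuncDesc {
    boolean initialized = false;

    FuncDesc() { }                    // bare constructor: no initialization

    static FuncDesc newInstance() {   // factory: always initializes
        FuncDesc d = new FuncDesc();
        d.initialize();               // stands in for initializeAndFoldConstants
        return d;
    }

    void initialize() { initialized = true; }

    public static void main(String[] args) {
        System.out.println(new FuncDesc().initialized);         // step skipped
        System.out.println(FuncDesc.newInstance().initialized); // step ran
    }
}
```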





[jira] [Updated] (HIVE-10332) CBO (Calcite Return Path): Use SortExchange rather than LogicalExchange for HiveOpConverter

2015-04-15 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10332:
---
Summary: CBO (Calcite Return Path): Use SortExchange rather than 
LogicalExchange for HiveOpConverter  (was: Use SortExchange rather than 
LogicalExchange for HiveOpConverter)

 CBO (Calcite Return Path): Use SortExchange rather than LogicalExchange for 
 HiveOpConverter
 ---

 Key: HIVE-10332
 URL: https://issues.apache.org/jira/browse/HIVE-10332
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: cbo-branch

 Attachments: HIVE-10332.01.patch


 Right now HiveSortExchange extends SortExchange extends Exchange. 
 LogicalExchange extends Exchange. LogicalExchange is expected in 
 HiveOpConverter but HiveSortExchange is created. After discussion, we plan to 
 change LogicalExchange to HiveSortExchange.





[jira] [Commented] (HIVE-10344) CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function

2015-04-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496884#comment-14496884
 ] 

Pengcheng Xiong commented on HIVE-10344:


[~jpullokkaran], after this patch, cbo_simple_select passes when the return 
path is turned on.

 CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc 
 rather than construction function
 --

 Key: HIVE-10344
 URL: https://issues.apache.org/jira/browse/HIVE-10344
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.2.0

 Attachments: HIVE-10344.01.patch


 ExprNodeGenericFuncDesc is now created using a constructor, which skips the 
 initialization step genericUDF.initializeAndFoldConstants compared with 
 using the newInstance method. If the initialization step is skipped, some 
 configuration parameters are not included in the serialization, which 
 produces wrong results/errors.





[jira] [Updated] (HIVE-10270) Cannot use Decimal constants less than 0.1BD

2015-04-15 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10270:
--
Attachment: HIVE-10270.5.patch

The TestBinarySortableFast failure was due to the fact that HIVE-9937 had 
duplicated some BinarySortableSerDe serialization logic, including for decimals.
For patch v5 I have refactored it so that BinarySortableSerDe and 
BinarySortableSerializeWrite both call into the same common logic, and updated 
the tests for TestBinarySortableFast, similar to TestBinarySortableSerDe.

 Cannot use Decimal constants less than 0.1BD
 

 Key: HIVE-10270
 URL: https://issues.apache.org/jira/browse/HIVE-10270
 Project: Hive
  Issue Type: Bug
  Components: Types
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10270.1.patch, HIVE-10270.2.patch, 
 HIVE-10270.3.patch, HIVE-10270.4.patch, HIVE-10270.5.patch


 {noformat}
 hive> select 0.09765625BD;
 FAILED: IllegalArgumentException Decimal scale must be less than or equal to 
 precision
 {noformat}
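The failure can be reproduced outside Hive: for constants below 0.1, `java.math.BigDecimal` reports a scale larger than its precision, and Hive's decimal type required scale to be at most precision, hence the error above:

```java
import java.math.BigDecimal;

class DecimalScaleDemo {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("0.09765625");
        // unscaled value is 9765625 (7 digits), but there are
        // 8 digits after the decimal point, so scale > precision
        System.out.println(d.precision()); // 7
        System.out.println(d.scale());     // 8
    }
}
```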





[jira] [Updated] (HIVE-9617) UDF from_utc_timestamp throws NPE if the second argument is null

2015-04-15 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-9617:
-
Fix Version/s: 1.2.0

 UDF from_utc_timestamp throws NPE if the second argument is null
 

 Key: HIVE-9617
 URL: https://issues.apache.org/jira/browse/HIVE-9617
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-9617.1.patch, HIVE-9617.2.patch


 UDF from_utc_timestamp throws NPE if the second argument is null
 {code}
 select from_utc_timestamp('2015-02-06 10:30:00', cast(null as string));
 FAILED: NullPointerException null
 {code}
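A minimal sketch of the usual fix for this class of bug, assuming standard SQL NULL semantics; this is a generic pattern, not the actual patch:

```java
// Null-safe evaluate sketch: SQL semantics say NULL in, NULL out,
// so guard both arguments before doing any conversion work.
class NullSafeUdfSketch {
    static Object evaluate(Object timestamp, Object timezone) {
        if (timestamp == null || timezone == null) {
            return null; // instead of dereferencing and throwing an NPE
        }
        // the real UDF would perform the timezone conversion here
        return timestamp;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("2015-02-06 10:30:00", null)); // null
    }
}
```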





[jira] [Commented] (HIVE-10344) CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function

2015-04-15 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496924#comment-14496924
 ] 

Laljo John Pullokkaran commented on HIVE-10344:
---

[~ashutoshc] Could you review and check this in to CBO branch?


 CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc 
 rather than construction function
 --

 Key: HIVE-10344
 URL: https://issues.apache.org/jira/browse/HIVE-10344
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.2.0

 Attachments: HIVE-10344.01.patch


 ExprNodeGenericFuncDesc is now created using a constructor, which skips the 
 initialization step genericUDF.initializeAndFoldConstants compared with 
 using the newInstance method. If the initialization step is skipped, some 
 configuration parameters are not included in the serialization, which 
 produces wrong results/errors.





[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496784#comment-14496784
 ] 

Hive QA commented on HIVE-9917:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725574/HIVE-9917.patch

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 8692 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_between_in
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_between_in
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3447/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3447/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3447/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725574 - PreCommit-HIVE-TRUNK-Build

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting 
 int to timestamp. Since customers have been relying on the incorrect behavior 
 for so long, it is better to make it configurable so that in one release it 
 defaults to the old/inconsistent way and the next release defaults to the 
 new/consistent way. After that we will deprecate the old behavior.
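A hedged sketch of what such a compatibility switch might look like; the flag and the exact conversion semantics below are illustrative only, since the issue does not specify them:

```java
// Illustrative only: gate two int->timestamp interpretations behind a
// boolean that a config property would supply, so one release can
// default to the old behavior and the next to the new one.
class IntToTimestampSketch {
    static long toEpochMillis(long value, boolean intIsSeconds) {
        // hypothetical: old path read the int as millis,
        // new path reads it as seconds since epoch
        return intIsSeconds ? value * 1000L : value;
    }

    public static void main(String[] args) {
        System.out.println(toEpochMillis(5, true));  // 5000
        System.out.println(toEpochMillis(5, false)); // 5
    }
}
```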





[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns

2015-04-15 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496785#comment-14496785
 ] 

Aihua Xu commented on HIVE-9580:


That's right. For the key comparison, it will call a UDF to do the key 
conversion if we are comparing different types; or, I think, we should pick 
the common type as the key type when type conversion is not needed for the 
data types, including char or varchar with different lengths.
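The common-type idea above can be sketched trivially: for two varchar join keys, the common type takes the maximum declared length, so equal strings compare equal regardless of declared length. This is illustrative, not Hive's actual type-resolution code:

```java
// Sketch: the common key type for VARCHAR(m) vs VARCHAR(n) is
// VARCHAR(max(m, n)), wide enough to hold either side's values.
class VarcharJoinKeySketch {
    static int commonVarcharLength(int leftLen, int rightLen) {
        return Math.max(leftLen, rightLen);
    }

    public static void main(String[] args) {
        // joining VARCHAR(10) with VARCHAR(11), as in the repro below
        System.out.println(commonVarcharLength(10, 11)); // 11
    }
}
```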

 Server returns incorrect result from JOIN ON VARCHAR columns
 

 Key: HIVE-9580
 URL: https://issues.apache.org/jira/browse/HIVE-9580
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Mike
Assignee: Aihua Xu
 Attachments: HIVE-9580.patch


 The database erroneously returns rows when joining two tables which each 
 contain a VARCHAR column and the join's ON condition uses the equality 
 operator on the VARCHAR columns.
 The following JDBC method exhibits the problem:
   static void joinIssue() throws SQLException {
       String sql;
       int rowsAffected;
       ResultSet rs;
       Statement stmt = con.createStatement();
       String table1_Name = "blahtab1";
       String table1A_Name = "blahtab1A";
       String table1B_Name = "blahtab1B";
       String table2_Name = "blahtab2";

       try {
           sql = "drop table " + table1_Name;
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("Drop table error:" + se.getMessage());
       }
       try {
           sql = "CREATE TABLE " + table1_Name + "(" +
                 "VCHARCOL VARCHAR(10) " +
                 ",INTEGERCOL INT " +
                 ") ";
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("create table error:" + se.getMessage());
       }

       sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
       System.out.println("\nsql=" + sql);
       stmt.executeUpdate(sql);

       System.out.println("===");

       try {
           sql = "drop table " + table1A_Name;
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("Drop table error:" + se.getMessage());
       }
       try {
           sql = "CREATE TABLE " + table1A_Name + "(" +
                 "VCHARCOL VARCHAR(10) " +
                 ") ";
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("create table error:" + se.getMessage());
       }

       sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
       System.out.println("\nsql=" + sql);
       stmt.executeUpdate(sql);

       System.out.println("===");

       try {
           sql = "drop table " + table1B_Name;
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("Drop table error:" + se.getMessage());
       }
       try {
           sql = "CREATE TABLE " + table1B_Name + "(" +
                 "VCHARCOL VARCHAR(11) " +
                 ",INTEGERCOL INT " +
                 ") ";
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
   

[jira] [Updated] (HIVE-10343) CBO (Calcite Return Path): Parameterize algorithm cost model

2015-04-15 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10343:
--
Attachment: HIVE-10343.patch

 CBO (Calcite Return Path): Parameterize algorithm cost model
 

 Key: HIVE-10343
 URL: https://issues.apache.org/jira/browse/HIVE-10343
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0

 Attachments: HIVE-10343.patch








[jira] [Commented] (HIVE-10284) enable container reuse for grace hash join

2015-04-15 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496750#comment-14496750
 ] 

Matt McCline commented on HIVE-10284:
-

I'm not sure what is going on here.

I think we are probably forming the vector expression writers incorrectly in 
the new code we added.  I need to go study the code and think.

 enable container reuse for grace hash join 
 ---

 Key: HIVE-10284
 URL: https://issues.apache.org/jira/browse/HIVE-10284
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Wei Zheng
 Attachments: HIVE-10284.1.patch, HIVE-10284.2.patch, 
 HIVE-10284.3.patch, HIVE-10284.4.patch, HIVE-10284.5.patch, 
 HIVE-10284.6.patch, HIVE-10284.7.patch








[jira] [Commented] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key

2015-04-15 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496694#comment-14496694
 ] 

Szehon Ho commented on HIVE-10324:
--

Thanks Ferdinand for taking care of this.  Can we keep the update of any 
property that matches a StorageDescriptor property, and just add another method 
for Table properties?  I am afraid that somebody might be using this, unless we 
can confirm that the StorageDescriptor property is never used.

 Hive metatool should take table_param_key to allow for changes to avro 
 serde's schema url key
 -

 Key: HIVE-10324
 URL: https://issues.apache.org/jira/browse/HIVE-10324
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.1.0
Reporter: Szehon Ho
Assignee: Ferdinand Xu
 Attachments: HIVE-10324.patch, HIVE-10324.patch.WIP


 HIVE-3443 added support to change the serdeParams from 'metatool 
 updateLocation' command.
 However, in avro it is possible to specify the schema via the tableParams:
 {noformat}
 CREATE  TABLE `testavro`(
   `test` string COMMENT 'from deserializer')
 ROW FORMAT SERDE 
   'org.apache.hadoop.hive.serde2.avro.AvroSerDe' 
 STORED AS INPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' 
 OUTPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
 TBLPROPERTIES (
   'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 
   'kite.compression.type'='snappy', 
   'transient_lastDdlTime'='1427996456')
 {noformat}
 Hence for those tables 'metatool updateLocation' will not help.
 This is necessary in cases like upgrading the namenode to HA, where the 
 absolute paths have changed.
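
The rewrite the metatool needs to perform on the table property can be sketched in a few lines. This is illustrative only: the real fix lives in Hive's metatool, and the new prefix `hdfs://nameservice1` is a hypothetical HA nameservice, not taken from the issue.

```python
def update_avro_schema_url(table_params, old_prefix, new_prefix):
    """Rewrite the avro.schema.url table property when a namenode
    location changes (e.g. after an HA upgrade). Illustrative sketch,
    not the metatool's actual implementation."""
    key = "avro.schema.url"
    url = table_params.get(key)
    if url is not None and url.startswith(old_prefix):
        table_params[key] = new_prefix + url[len(old_prefix):]
    return table_params

# Table properties as in the CREATE TABLE above:
params = {"avro.schema.url": "hdfs://namenode:8020/tmp/test.avsc"}
update_avro_schema_url(params, "hdfs://namenode:8020", "hdfs://nameservice1")
```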



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10273) Union with partition tables which have no data fails with NPE

2015-04-15 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10273:
--
Attachment: HIVE-10273.6.patch

Updated to latest trunk. Test failures unrelated.

 Union with partition tables which have no data fails with NPE
 -

 Key: HIVE-10273
 URL: https://issues.apache.org/jira/browse/HIVE-10273
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10273.1.patch, HIVE-10273.2.patch, 
 HIVE-10273.3.patch, HIVE-10273.4.patch, HIVE-10273.5.patch, HIVE-10273.6.patch


 As shown in the test case in the patch below, when we have partitioned tables 
 which have no data, we fail with an NPE with the following stack trace:
 {code}
 NullPointerException null
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateMapWork(Vectorizer.java:357)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:321)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:307)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:847)
   at 
 org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:468)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:223)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager

2015-04-15 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10233:
--
Attachment: HIVE-10233-WIP-2.patch

 Hive on LLAP: Memory manager
 

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10350) CBO: With hive.cbo.costmodel.extended enabled IO cost is negative

2015-04-15 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10350:
---
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-9132

 CBO: With hive.cbo.costmodel.extended enabled IO cost is negative
 -

 Key: HIVE-10350
 URL: https://issues.apache.org/jira/browse/HIVE-10350
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0


 Not an overflow, but parallelism ends up being -1 because it uses the number of buckets.
 {code}
  final int parallelism = RelMetadataQuery.splitCount(join) == null
   ? 1 : RelMetadataQuery.splitCount(join);
 {code}
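
The sign flip can be reproduced with a toy IO term (a hypothetical formula for illustration, not the actual HiveCostModel arithmetic): when splitCount returns a non-positive bucket count such as -1, any cost that scales with parallelism goes negative, and clamping non-positive counts to 1 avoids it.

```python
def mapjoin_io_cost(bytes_streamed, parallelism):
    # Simplified stand-in for the extended cost model's IO term:
    # cost scales with the data each parallel task must read.
    return bytes_streamed * parallelism

# A bucket count of -1 drives the IO cost negative,
# which makes MAP_JOIN look artificially cheap:
assert mapjoin_io_cost(1_000_000, -1) < 0

# Clamping non-positive split counts to 1 keeps the cost positive:
split_count = -1
parallelism = split_count if split_count > 0 else 1
assert mapjoin_io_cost(1_000_000, parallelism) > 0
```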
 {code}
 2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 
 rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 
 rows, 1601785.714285714 cpu, -1698787.48 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected
 2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:apply(862)) - Plan After Join Reordering:
 HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = 
 {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 
 io}, id = 3000
   HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative 
 cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, 
 -1.1757664816220238E9 io}, id = 2998
 HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): 
 rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136
   HiveJoin(condition=[=($1, $5)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 
 5.557840182539682E7 cpu, -4299694.122023809 io}]): rowcount = 
 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3132
 HiveJoin(condition=[=($0, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, 
 -1.15248E9 io}]): rowcount = 5.5578005E7, cumulative cost = {5.7498805E7 
 rows, 5.9419605E7 cpu, -1.15248E9 io}, id = 3100
   HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative 
 cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992
 HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): 
 rowcount = 5.5578005E7, cumulative cost = {0}, id = 2878
   HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost 
 = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2978
 HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): 
 rowcount = 1920800.0, cumulative cost = {0}, id = 2868
 HiveJoin(condition=[=($10, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 
 cpu, -8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = 
 {1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130
   HiveJoin(condition=[=($0, $4)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 
 cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = 
 {1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128
 HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 
 7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982
   
 HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 
 7200.0, cumulative cost = {0}, id = 2871
 HiveJoin(condition=[=($3, $6)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 
 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative 
 cost = {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 
 io}, id = 3105
   HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], 
 c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], 
 c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 
 cpu, 0.0 io}, id = 2970
 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount 
 = 160.0, cumulative cost = {0}, id = 2862
   HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount = 
 892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2974
 HiveFilter(condition=[=($6, 'Hopewell')]): rowcount = 
 892.8571428571428, 

[jira] [Updated] (HIVE-10350) CBO: With hive.cbo.costmodel.extended enabled IO cost is negative

2015-04-15 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10350:
---
Attachment: HIVE-10331.01.patch

 CBO: With hive.cbo.costmodel.extended enabled IO cost is negative
 -

 Key: HIVE-10350
 URL: https://issues.apache.org/jira/browse/HIVE-10350
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch


 Not an overflow, but parallelism ends up being -1 because it uses the number of buckets.
 {code}
  final int parallelism = RelMetadataQuery.splitCount(join) == null
   ? 1 : RelMetadataQuery.splitCount(join);
 {code}
 {code}
 2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 
 rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 
 rows, 1601785.714285714 cpu, -1698787.48 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected
 2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:apply(862)) - Plan After Join Reordering:
 HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = 
 {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 
 io}, id = 3000
   HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative 
 cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, 
 -1.1757664816220238E9 io}, id = 2998
 HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): 
 rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136
   HiveJoin(condition=[=($1, $5)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 
 5.557840182539682E7 cpu, -4299694.122023809 io}]): rowcount = 
 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3132
 HiveJoin(condition=[=($0, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, 
 -1.15248E9 io}]): rowcount = 5.5578005E7, cumulative cost = {5.7498805E7 
 rows, 5.9419605E7 cpu, -1.15248E9 io}, id = 3100
   HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative 
 cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992
 HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): 
 rowcount = 5.5578005E7, cumulative cost = {0}, id = 2878
   HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost 
 = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2978
 HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): 
 rowcount = 1920800.0, cumulative cost = {0}, id = 2868
 HiveJoin(condition=[=($10, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 
 cpu, -8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = 
 {1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130
   HiveJoin(condition=[=($0, $4)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 
 cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = 
 {1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128
 HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 
 7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982
   
 HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 
 7200.0, cumulative cost = {0}, id = 2871
 HiveJoin(condition=[=($3, $6)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 
 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative 
 cost = {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 
 io}, id = 3105
   HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], 
 c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], 
 c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 
 cpu, 0.0 io}, id = 2970
 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount 
 = 160.0, cumulative cost = {0}, id = 2862
   HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount = 
 892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2974
 HiveFilter(condition=[=($6, 'Hopewell')]): rowcount = 
 

[jira] [Updated] (HIVE-10346) Tez on HBase has problems with settings again

2015-04-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10346:

Attachment: HIVE-10346.patch

 Tez on HBase has problems with settings again
 -

 Key: HIVE-10346
 URL: https://issues.apache.org/jira/browse/HIVE-10346
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10346.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10346) Tez on HBase has problems with settings again

2015-04-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497089#comment-14497089
 ] 

Sergey Shelukhin commented on HIVE-10346:
-

[~hagleitn] can you please review?

 Tez on HBase has problems with settings again
 -

 Key: HIVE-10346
 URL: https://issues.apache.org/jira/browse/HIVE-10346
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10346.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10347) Merge spark to trunk 4/15/2015

2015-04-15 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-10347:
-
Attachment: HIVE-10347.patch

Attaching to run precommit tests.

 Merge spark to trunk 4/15/2015
 --

 Key: HIVE-10347
 URL: https://issues.apache.org/jira/browse/HIVE-10347
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-10347.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10028) LLAP: Create a fixed size execution queue for daemons

2015-04-15 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10028:
-
Attachment: HIVE-10028.2.patch

 LLAP: Create a fixed size execution queue for daemons
 -

 Key: HIVE-10028
 URL: https://issues.apache.org/jira/browse/HIVE-10028
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Prasanth Jayachandran
 Fix For: llap

 Attachments: HIVE-10028.1.patch, HIVE-10028.2.patch


 Currently, this is unbounded. This should be a configurable size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10307) Support to use number literals in partition column

2015-04-15 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497157#comment-14497157
 ] 

Chaoyu Tang commented on HIVE-10307:


The failed tests seem unrelated to this patch. Thanks

 Support to use number literals in partition column
 --

 Key: HIVE-10307
 URL: https://issues.apache.org/jira/browse/HIVE-10307
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 1.0.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
 Attachments: HIVE-10307.1.patch, HIVE-10307.patch


 Data types like TinyInt, SmallInt, BigInt or Decimal can be expressed as 
 literals with a postfix like Y, S, L, or BD appended to the number. These 
 literals work in most Hive queries, but do not when they are used as 
 partition column values. For a partitioned table like:
 create table partcoltypenum (key int, value string) partitioned by (tint 
 tinyint, sint smallint, bint bigint);
 insert into partcoltypenum partition (tint=100Y, sint=1S, 
 bint=1000L) select key, value from src limit 30;
 Queries like select, describe and drop partition do not work. For example,
 select * from partcoltypenum where tint=100Y and sint=1S and 
 bint=1000L;
 does not return any rows.
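
One way to see what the fix must do is to normalize the literal before comparing partition values: strip the type postfix so that '100Y' matches the stored value '100'. This is a hypothetical helper for illustration, not Hive's parser code.

```python
import re

def strip_numeric_postfix(literal):
    """Drop a trailing Y/S/L/BD type postfix from a Hive number
    literal, e.g. '100Y' -> '100'. Illustrative sketch only."""
    m = re.fullmatch(r"(-?\d+)(Y|S|L|BD)?", literal, re.IGNORECASE)
    if not m:
        raise ValueError("not a number literal: %r" % literal)
    return m.group(1)

# The partition values from the insert above:
assert strip_numeric_postfix("100Y") == "100"
assert strip_numeric_postfix("1S") == "1"
assert strip_numeric_postfix("1000L") == "1000"
```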



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10350) CBO: With hive.cbo.costmodel.extended enabled IO cost is negative

2015-04-15 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497228#comment-14497228
 ] 

Mostafa Mokhtar commented on HIVE-10350:


[~jcamachorodriguez] [~jpullokkaran]
Can you please take a look?

 CBO: With hive.cbo.costmodel.extended enabled IO cost is negative
 -

 Key: HIVE-10350
 URL: https://issues.apache.org/jira/browse/HIVE-10350
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch


 Not an overflow, but parallelism ends up being -1 because it uses the number of buckets.
 {code}
  final int parallelism = RelMetadataQuery.splitCount(join) == null
   ? 1 : RelMetadataQuery.splitCount(join);
 {code}
 {code}
 2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 
 rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 
 rows, 1601785.714285714 cpu, -1698787.48 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected
 2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:apply(862)) - Plan After Join Reordering:
 HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = 
 {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 
 io}, id = 3000
   HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative 
 cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, 
 -1.1757664816220238E9 io}, id = 2998
 HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): 
 rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136
   HiveJoin(condition=[=($1, $5)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 
 5.557840182539682E7 cpu, -4299694.122023809 io}]): rowcount = 
 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3132
 HiveJoin(condition=[=($0, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, 
 -1.15248E9 io}]): rowcount = 5.5578005E7, cumulative cost = {5.7498805E7 
 rows, 5.9419605E7 cpu, -1.15248E9 io}, id = 3100
   HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative 
 cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992
 HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): 
 rowcount = 5.5578005E7, cumulative cost = {0}, id = 2878
   HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost 
 = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2978
 HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): 
 rowcount = 1920800.0, cumulative cost = {0}, id = 2868
 HiveJoin(condition=[=($10, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 
 cpu, -8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = 
 {1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130
   HiveJoin(condition=[=($0, $4)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 
 cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = 
 {1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128
 HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 
 7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982
   
 HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 
 7200.0, cumulative cost = {0}, id = 2871
 HiveJoin(condition=[=($3, $6)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 
 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative 
 cost = {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 
 io}, id = 3105
   HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], 
 c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], 
 c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 
 cpu, 0.0 io}, id = 2970
 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount 
 = 160.0, cumulative cost = {0}, id = 2862
   HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount = 
 892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id 

[jira] [Commented] (HIVE-9923) No clear message when from is missing

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497327#comment-14497327
 ] 

Hive QA commented on HIVE-9923:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725615/HIVE-9923.1.patch

{color:red}ERROR:{color} -1 due to 57 failed/errored test(s), 8690 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_timestamp_literal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_add_months
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_bitwise_shiftleft
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_bitwise_shiftright
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_bitwise_shiftrightunsigned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_cbrt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_date_add
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_date_sub
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_decode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_factorial
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_format_number
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_from_utc_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_get_json_object
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_last_day
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_levenshtein
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_months_between
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_soundex
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_to_utc_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_trunc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_stack
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_select_dummy_source
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ptf_negative_DistributeByOrderBy
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ptf_negative_PartitionBySortBy
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_select_star_suffix
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_select_udtf_alias
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_subquery_missing_from
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_timestamp_literal
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_add_months_error_1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_add_months_error_2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_last_day_error_1

[jira] [Commented] (HIVE-10290) Add negative test case to modify a non-existent config value when hive security authorization is enabled.

2015-04-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497107#comment-14497107
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10290:
--

[~thejas] Is it possible to get this in?

Thanks
Hari

 Add negative test case to modify a non-existent config value when hive 
 security authorization is enabled.
 -

 Key: HIVE-10290
 URL: https://issues.apache.org/jira/browse/HIVE-10290
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10290.1.patch


 We need to have a test case to cover the following scenario when hive 
 security authorization is enabled:
 {code}
 set hive.exec.reduce.max=1;
 Query returned non-zero code: 1, cause: hive configuration 
 hive.exec.reduce.max does not exists.
 {code}
 This is important for ease of use, and we need to prevent a future code 
 change/regression from converting the above test case into a permission-denied 
 error. 
 I.e., the output below is not desirable:
 {code}
 set hive.exec.reduce.max=1;
 Error: Error while processing statement: Cannot modify hive.exec.reduce.max 
 at runtime. It is not in list of params that are allowed to be modified at 
 runtime (state=42000,code=1)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10348) LLAP: merge trunk to branch 2015-04-15

2015-04-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10348.
-
   Resolution: Fixed
Fix Version/s: llap

 LLAP: merge trunk to branch 2015-04-15
 --

 Key: HIVE-10348
 URL: https://issues.apache.org/jira/browse/HIVE-10348
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format

2015-04-15 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10331:
---
Issue Type: Bug  (was: Sub-task)
Parent: (was: HIVE-9132)

 ORC : Is null SARG filters out all row groups written in old ORC format
 ---

 Key: HIVE-10331
 URL: https://issues.apache.org/jira/browse/HIVE-10331
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch


 Queries are returning wrong results as all row groups get filtered out and 
 no rows get scanned.
 {code}
 SELECT 
   count(*)
 FROM
 store_sales
 WHERE
 ss_addr_sk IS NULL
 {code}
 With hive.optimize.index.filter disabled we get the correct results.
 In pickRowGroups, the stats show that hasNull_ is false, while the row group 
 actually has nulls.
 The same query runs fine for newly loaded ORC tables.
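
The pruning decision can be sketched as follows (a simplified illustration; the real logic is in ORC's row-group picking, not this helper). An IS NULL predicate keeps a row group only when its statistics report a null, so stale hasNull=false stats from old-format files wrongly discard every group.

```python
def keep_row_group_for_is_null(stats_has_null):
    # An IS NULL predicate can only match rows in a row group
    # whose column statistics report at least one null value.
    return stats_has_null

# Old-format ORC wrote hasNull=false for a row group that
# actually contains nulls, so the group is wrongly skipped:
written_has_null = False   # stale statistic in the file
actual_has_null = True     # what the data really contains
assert keep_row_group_for_is_null(written_has_null) is False
assert actual_has_null     # ...yet the group held matching rows
```

Disabling hive.optimize.index.filter bypasses this pruning entirely, which is why it restores correct results.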



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


 joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 
 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative 
 cost = {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 
 io}, id = 3105
   HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], 
 c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], 
 c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 
 cpu, 0.0 io}, id = 2970
 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount 
 = 160.0, cumulative cost = {0}, id = 2862
   HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount = 
 892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2974
 HiveFilter(condition=[=($6, 'Hopewell')]): rowcount = 
 

[jira] [Commented] (HIVE-10304) Add deprecation message to HiveCLI

2015-04-15 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497116#comment-14497116
 ] 

Szehon Ho commented on HIVE-10304:
--

Done editing these sections with new information and links on Beeline/HS2, as 
well as deprecation warnings for HiveCLI; feel free to check. Thanks, Lefty, for 
the links.

 Add deprecation message to HiveCLI
 --

 Key: HIVE-10304
 URL: https://issues.apache.org/jira/browse/HIVE-10304
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 1.1.0
Reporter: Szehon Ho
Assignee: Szehon Ho
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-10304.2.patch, HIVE-10304.3.patch, HIVE-10304.patch


 As Beeline is now the recommended command-line tool for Hive, we should add a 
 message to HiveCLI indicating that it is deprecated and redirecting users to 
 Beeline.  
 This is not a suggestion to remove HiveCLI for now, just a helpful pointer 
 so users know to focus their attention on Beeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10335) LLAP: IndexOutOfBound in MapJoinOperator

2015-04-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10335.
-
Resolution: Not A Problem

 LLAP: IndexOutOfBound in MapJoinOperator
 

 Key: HIVE-10335
 URL: https://issues.apache.org/jira/browse/HIVE-10335
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Sergey Shelukhin
 Fix For: llap


 {code}
 2015-04-14 13:57:55,889 
 [TezTaskRunner_attempt_1428572510173_0173_2_03_14_0(container_1_0173_01_66_sseth_20150414135750_7a7c2f4f-5f2d-4645-b833-677621f087bd:2_Map
  1_14_0)] ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unexpected 
 exception: Index: 0, Size: 0
 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
 at java.util.ArrayList.rangeCheck(ArrayList.java:653)
 at java.util.ArrayList.get(ArrayList.java:429)
 at 
 org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
 at 
 org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
 at 
 org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:283)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:232)
 at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:240)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:630)
 at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:348)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:332)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
 at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 {code}
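The trace shows `ArrayList.get(0)` called on an empty list inside `UnwrapRowContainer.unwrap`. A minimal guard of the kind below would avoid the exception; it is only an illustration (the issue was ultimately closed without a code change), not the actual Hive fix.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative bounds guard: return null for an empty container instead
// of letting ArrayList.rangeCheck throw IndexOutOfBoundsException.
public final class FirstRowSketch {
    public static Object firstOrNull(List<Object> rows) {
        return rows.isEmpty() ? null : rows.get(0);
    }

    public static void main(String[] args) {
        List<Object> empty = new ArrayList<>();
        System.out.println(firstOrNull(empty)); // null instead of an exception
    }
}
```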



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10335) LLAP: IndexOutOfBound in MapJoinOperator

2015-04-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10335.
-
Resolution: Done

 LLAP: IndexOutOfBound in MapJoinOperator
 

 Key: HIVE-10335
 URL: https://issues.apache.org/jira/browse/HIVE-10335
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Sergey Shelukhin
 Fix For: llap





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-10335) LLAP: IndexOutOfBound in MapJoinOperator

2015-04-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reopened HIVE-10335:
-

 LLAP: IndexOutOfBound in MapJoinOperator
 

 Key: HIVE-10335
 URL: https://issues.apache.org/jira/browse/HIVE-10335
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Sergey Shelukhin
 Fix For: llap





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.

2015-04-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10306:
-
Attachment: HIVE-10306.4.patch

 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 -

 Key: HIVE-10306
 URL: https://issues.apache.org/jira/browse/HIVE-10306
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, 
 HIVE-10306.3.patch, HIVE-10306.4.patch


 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 We introduced this parameter via HIVE-10119.
 The logging param for levels is only relevant to HS2, so for hive-cli users 
 hive.tez.exec.print.summary still makes sense. We can check the log-level 
 param as well in the places where we check the value of 
 hive.tez.exec.print.summary; i.e., consider hive.tez.exec.print.summary=true 
 if log.level = PERFORMANCE.
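The behavior described above reduces to a single predicate. The method below is a hedged sketch; the boolean flag and string stand in for the real HiveConf accessors, which are assumed rather than quoted.

```java
// Sketch: treat the summary as enabled when either the explicit config
// flag is set or the HS2 logging level is PERFORMANCE.
public final class SummarySketch {
    public static boolean shouldPrintSummary(boolean printSummaryFlag, String hs2LogLevel) {
        // equalsIgnoreCase on the constant is null-safe for hs2LogLevel.
        return printSummaryFlag || "PERFORMANCE".equalsIgnoreCase(hs2LogLevel);
    }

    public static void main(String[] args) {
        System.out.println(shouldPrintSummary(false, "PERFORMANCE")); // true
        System.out.println(shouldPrintSummary(false, "EXECUTION"));   // false
        System.out.println(shouldPrintSummary(true, null));           // true
    }
}
```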



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10344) CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc rather than construction function

2015-04-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497137#comment-14497137
 ] 

Ashutosh Chauhan commented on HIVE-10344:
-

+1

 CBO (Calcite Return Path): Use newInstance to create ExprNodeGenericFuncDesc 
 rather than construction function
 --

 Key: HIVE-10344
 URL: https://issues.apache.org/jira/browse/HIVE-10344
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.2.0

 Attachments: HIVE-10344.01.patch


 ExprNodeGenericFuncDesc is now created using a constructor, which skips the 
 initialization step genericUDF.initializeAndFoldConstants compared with 
 using the newInstance method. If the initialization step is skipped, some 
 configuration parameters are not included in the serialization, which 
 produces wrong results or errors.
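A toy illustration of the difference (not the real ExprNodeGenericFuncDesc API): a factory method can run an initialization step that plain construction skips, so directly constructed instances are missing state that later serialization depends on.

```java
// Minimal stand-in for the factory-vs-constructor distinction. The
// "initialized" flag plays the role of state set up by
// genericUDF.initializeAndFoldConstants in the real class.
public final class FuncDescSketch {
    private boolean initialized;

    public FuncDescSketch() { } // direct construction: initialization skipped

    public static FuncDescSketch newInstance() {
        FuncDescSketch d = new FuncDescSketch();
        d.initialized = true;   // factory runs the initialization step
        return d;
    }

    public boolean isInitialized() {
        return initialized;
    }

    public static void main(String[] args) {
        System.out.println(new FuncDescSketch().isInitialized()); // false
        System.out.println(FuncDescSketch.newInstance().isInitialized()); // true
    }
}
```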



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.

2015-04-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10306:
-
Attachment: (was: HIVE-10306.4.patch)

 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 -

 Key: HIVE-10306
 URL: https://issues.apache.org/jira/browse/HIVE-10306
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, 
 HIVE-10306.3.patch


 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 We introduced this parameter via HIVE-10119.
 The logging param for levels is only relevant to HS2, so for hive-cli users 
 the hive.tez.exec.print.summary still makes sense. We can check for log-level 
 param as well, in places we are checking value of 
 hive.tez.exec.print.summary. Ie, consider hive.tez.exec.print.summary=true if 
 log.level = PERFORMANCE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10029) LLAP: Scheduling of work from different queries within the daemon

2015-04-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497279#comment-14497279
 ] 

Prasanth Jayachandran commented on HIVE-10029:
--

[~seth.siddha...@gmail.com] This should be covered by HIVE-10028 patch right?

 LLAP: Scheduling of work from different queries within the daemon
 -

 Key: HIVE-10029
 URL: https://issues.apache.org/jira/browse/HIVE-10029
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
 Fix For: llap


 The current implementation is a simple queue - whichever query wins the race 
 to submit work to a daemon will execute first.
 A policy around this may be useful - potentially fair share, or a 
 first-query-in-gets-all-slots approach.
 Also, the priority associated with work within a query should be considered.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10349) overflow in stats

2015-04-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10349:

Description: 
Discovered while running q17 in LLAP.

{noformat}
Reducer 2 
Execution mode: llap
Reduce Operator Tree:
  Merge Join Operator
condition map:
 Inner Join 0 to 1
keys:
  0 _col28 (type: int), _col27 (type: int)
  1 cs_bill_customer_sk (type: int), cs_item_sk (type: int)
outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, 
_col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82
Statistics: Num rows: 1047651367827495040 Data size: 
9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
Map Join Operator
  condition map:
   Inner Join 0 to 1
  keys:
0 _col22 (type: int)
1 d_date_sk (type: int)
  outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, 
_col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86
  input vertices:
1 Map 7
  Statistics: Num rows: 1152416529588199552 Data size: 
9223372036854775807 Basic stats: COMPLETE Column stats: NONE

{noformat}

Data size overflows and the row count also looks wrong. I wonder if this is why 
it generates 1009 reducers for this stage on 6 containers.

  was:
Discovered while running q17 in LLAP.

{noformat}
Reducer 2 
Execution mode: llap
Reduce Operator Tree:
  Merge Join Operator
condition map:
 Inner Join 0 to 1
keys:
  0 _col28 (type: int), _col27 (type: int)
  1 cs_bill_customer_sk (type: int), cs_item_sk (type: int)
outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, 
_col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82
Statistics: Num rows: 1047651367827495040 Data size: 
9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
Map Join Operator
  condition map:
   Inner Join 0 to 1
  keys:
0 _col22 (type: int)
1 d_date_sk (type: int)
  outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, 
_col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86
  input vertices:
1 Map 7
  Statistics: Num rows: 1152416529588199552 Data size: 
9223372036854775807 Basic stats: COMPLETE Column stats: NONE

{noformat}

Data size overflows and row count also looks wrong. I wonder if this is why it 
generates 1009 reducers for this stage


 overflow in stats
 -

 Key: HIVE-10349
 URL: https://issues.apache.org/jira/browse/HIVE-10349
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Prasanth Jayachandran

 Discovered while running q17 in LLAP.
 {noformat}
 Reducer 2 
 Execution mode: llap
 Reduce Operator Tree:
   Merge Join Operator
 condition map:
  Inner Join 0 to 1
 keys:
   0 _col28 (type: int), _col27 (type: int)
   1 cs_bill_customer_sk (type: int), cs_item_sk (type: int)
 outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, 
 _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82
 Statistics: Num rows: 1047651367827495040 Data size: 
 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
 Map Join Operator
   condition map:
Inner Join 0 to 1
   keys:
 0 _col22 (type: int)
 1 d_date_sk (type: int)
   outputColumnNames: _col1, _col2, _col6, _col8, _col9, 
 _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, 
 _col82, _col86
   input vertices:
 1 Map 7
   Statistics: Num rows: 1152416529588199552 Data size: 
 9223372036854775807 Basic stats: COMPLETE Column stats: NONE
 {noformat}
 Data size overflows and row count also looks wrong. I wonder if this is why 
 it generates 1009 reducers for this stage on 6 containers
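The Data size of 9223372036854775807 is exactly Long.MAX_VALUE, which is consistent with unchecked long multiplication during stats annotation. Below is a hedged sketch of overflow-safe multiplication that saturates instead of wrapping; this is an assumption about the cause, not the committed fix.

```java
// Saturating multiply: on overflow, clamp to Long.MAX_VALUE so the
// estimate stays monotonic instead of wrapping to garbage.
public final class StatsMathSketch {
    public static long saturatingMultiply(long a, long b) {
        try {
            return Math.multiplyExact(a, b);
        } catch (ArithmeticException overflow) {
            return Long.MAX_VALUE;
        }
    }

    public static void main(String[] args) {
        // 4e9 * 4e9 = 1.6e19 overflows a signed 64-bit long (max ~9.22e18).
        System.out.println(saturatingMultiply(4_000_000_000L, 4_000_000_000L)); // 9223372036854775807
        System.out.println(saturatingMultiply(3, 4)); // 12
    }
}
```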



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10307) Support to use number literals in partition column

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497132#comment-14497132
 ] 

Hive QA commented on HIVE-10307:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725594/HIVE-10307.1.patch

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 8687 tests 
executed
*Failed tests:*
{noformat}
TestHBaseNegativeCliDriver - did not produce a TEST-*.xml file
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3449/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3449/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3449/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725594 - PreCommit-HIVE-TRUNK-Build

 Support to use number literals in partition column
 --

 Key: HIVE-10307
 URL: https://issues.apache.org/jira/browse/HIVE-10307
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 1.0.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
 Attachments: HIVE-10307.1.patch, HIVE-10307.patch


 Data types like TinyInt, SmallInt, BigInt or Decimal can be expressed as 
 literals with a postfix like Y, S, L, or BD appended to the number. These 
 literals work in most Hive queries, but not when they are used as a 
 partition column value. For a partitioned table like:
 create table partcoltypenum (key int, value string) partitioned by (tint 
 tinyint, sint smallint, bint bigint);
 insert into partcoltypenum partition (tint=100Y, sint=1S, 
 bint=1000L) select key, value from src limit 30;
 queries like select, describe, and drop partition do not work. For example,
 select * from partcoltypenum where tint=100Y and sint=1S and 
 bint=1000L;
 does not return any rows.
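One way to read the report is that the type-suffixed literal text is compared against the stored partition value verbatim. A hypothetical normalization step (the method name and regex are illustrative, not Hive's implementation) would strip the suffix before comparing:

```java
// Hypothetical sketch: drop a trailing numeric type suffix (Y, S, L, BD,
// case-insensitive) so "100Y" and the stored partition value "100"
// compare equal. BD must be tried before the single-letter suffixes.
public final class PartitionLiteralSketch {
    public static String stripNumericSuffix(String literal) {
        return literal.replaceAll("(?i)(BD|[YSL])$", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNumericSuffix("100Y"));  // 100
        System.out.println(stripNumericSuffix("1.5BD")); // 1.5
        System.out.println(stripNumericSuffix("42"));    // 42
    }
}
```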



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10350) CBO: With hive.cbo.costmodel.extended enabled IO cost is negative

2015-04-15 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10350:
---
Description: 
Not an overflow, but parallelism ends up being -1 because it uses the number of buckets:
{code}
 final int parallelism = RelMetadataQuery.splitCount(join) == null
  ? 1 : RelMetadataQuery.splitCount(join);
{code}


{code}
2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel 
(HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 
rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io}
2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
(HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 rows, 
1601785.714285714 cpu, -1698787.48 io}
2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
(HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected
2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner 
(CalcitePlanner.java:apply(862)) - Plan After Join Reordering:
HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = 
{1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, 
id = 3000
  HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative 
cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, 
-1.1757664816220238E9 io}, id = 2998
HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): 
rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136
  HiveJoin(condition=[=($1, $5)], joinType=[inner], 
joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 5.557840182539682E7 
cpu, -4299694.122023809 io}]): rowcount = 6006.726049749041, cumulative cost = 
{1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, 
id = 3132
HiveJoin(condition=[=($0, $1)], joinType=[inner], 
joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, -1.15248E9 
io}]): rowcount = 5.5578005E7, cumulative cost = {5.7498805E7 rows, 5.9419605E7 
cpu, -1.15248E9 io}, id = 3100
  HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative 
cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992
HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): rowcount 
= 5.5578005E7, cumulative cost = {0}, id = 2878
  HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost = 
{0.0 rows, 0.0 cpu, 0.0 io}, id = 2978
HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): 
rowcount = 1920800.0, cumulative cost = {0}, id = 2868
HiveJoin(condition=[=($10, $1)], joinType=[inner], 
joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 cpu, 
-8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = 
{1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130
  HiveJoin(condition=[=($0, $4)], joinType=[inner], 
joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 
cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = 
{1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128
HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 
7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982
  
HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 
7200.0, cumulative cost = {0}, id = 2871
HiveJoin(condition=[=($3, $6)], joinType=[inner], 
joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 cpu, 
-1698787.48 io}]): rowcount = 1785.7142857142856, cumulative cost = 
{1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 io}, id = 
3105
  HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], 
c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], 
c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 cpu, 
0.0 io}, id = 2970
HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount = 
160.0, cumulative cost = {0}, id = 2862
  HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount = 
892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2974
HiveFilter(condition=[=($6, 'Hopewell')]): rowcount = 
892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2972
  HiveTableScan(table=[[tpcds_bin_orc_200.customer_address]]): 
rowcount = 80.0, cumulative cost = {0}, id = 2864
  HiveProject(ib_income_band_sk=[$0], ib_lower_bound=[$1], 
ib_upper_bound=[$2]): rowcount = 2.2223, cumulative cost = {0.0 
rows, 0.0 cpu, 0.0 io}, id = 2988
HiveFilter(condition=[AND(=($1, 32287), =($2, +(32287, 
5)))]): rowcount = 2.2223, cumulative cost = {0.0 rows, 0.0 
cpu, 0.0 io}, id = 2986
  

[jira] [Commented] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.

2015-04-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497148#comment-14497148
 ] 

Thejas M Nair commented on HIVE-10306:
--

It might be ptest2 that is expecting the file to be present. Try changing the 
name to TestOperationLoggingAPITestBase.


 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 -

 Key: HIVE-10306
 URL: https://issues.apache.org/jira/browse/HIVE-10306
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, 
 HIVE-10306.3.patch, HIVE-10306.4.patch


 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 We introduced this parameter via HIVE-10119.
 The logging param for levels is only relevant to HS2, so for hive-cli users 
 hive.tez.exec.print.summary still makes sense. We can check the log-level 
 param as well in the places where we check the value of 
 hive.tez.exec.print.summary, i.e., consider hive.tez.exec.print.summary=true 
 if log.level = PERFORMANCE.
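The proposed check can be sketched as follows; the class and method names here are illustrative stand-ins, not actual Hive APIs:

```java
public class PrintSummaryCheck {
    // Treat hive.tez.exec.print.summary as effectively true when the
    // HS2 logging level is PERFORMANCE, as the description proposes.
    static boolean shouldPrintSummary(boolean printSummaryFlag, String hs2LogLevel) {
        // Either the explicit flag or the PERFORMANCE log level enables the summary.
        return printSummaryFlag || "PERFORMANCE".equalsIgnoreCase(hs2LogLevel);
    }

    public static void main(String[] args) {
        System.out.println(shouldPrintSummary(false, "PERFORMANCE")); // true
        System.out.println(shouldPrintSummary(false, "EXECUTION"));   // false
    }
}
```

This keeps hive.tez.exec.print.summary meaningful for hive-cli users while letting the HS2-only log level imply it.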



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format

2015-04-15 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10331:
---
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-9132

 ORC : Is null SARG filters out all row groups written in old ORC format
 ---

 Key: HIVE-10331
 URL: https://issues.apache.org/jira/browse/HIVE-10331
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch


 Queries are returning wrong results as all row groups get filtered out and 
 no rows get scanned.
 {code}
 SELECT 
   count(*)
 FROM
 store_sales
 WHERE
 ss_addr_sk IS NULL
 {code}
 With hive.optimize.index.filter disabled we get the correct results.
 In pickRowGroups, the stats show that hasNull_ is false, while the row group 
 actually contains nulls.
 The same query runs fine for newly loaded ORC tables.
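The pruning logic at the heart of this bug can be sketched as follows; this is a minimal stand-in, not the real ORC reader code. A row group may only be skipped for an IS NULL predicate when its statistics report no nulls, so a writer that wrongly records hasNull_ = false causes every group to be skipped:

```java
public class IsNullPruning {
    // Minimal stand-in for per-row-group column statistics.
    static final class ColStats {
        final boolean hasNull;
        ColStats(boolean hasNull) { this.hasNull = hasNull; }
    }

    // An IS NULL predicate can only match rows in groups whose stats
    // report at least one null; hasNull == false lets the reader skip the group.
    static boolean mayContainMatch(ColStats stats) {
        return stats.hasNull;
    }

    public static void main(String[] args) {
        // If an old writer recorded hasNull = false for a group that really
        // holds nulls, the group is wrongly skipped and rows are lost.
        System.out.println(mayContainMatch(new ColStats(false))); // group skipped
        System.out.println(mayContainMatch(new ColStats(true)));  // group scanned
    }
}
```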



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10028) LLAP: Create a fixed size execution queue for daemons

2015-04-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497266#comment-14497266
 ] 

Prasanth Jayachandran commented on HIVE-10028:
--

[~seth.siddha...@gmail.com] Very useful comments! Fixed them all in the new 
patch. Can you take a look again?

 LLAP: Create a fixed size execution queue for daemons
 -

 Key: HIVE-10028
 URL: https://issues.apache.org/jira/browse/HIVE-10028
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Prasanth Jayachandran
 Fix For: llap

 Attachments: HIVE-10028.1.patch, HIVE-10028.2.patch


 Currently, this is unbounded. This should be a configurable size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10284) enable container reuse for grace hash join

2015-04-15 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-10284:
-
Attachment: HIVE-10284.8.patch

Upload patch 8 for testing

 enable container reuse for grace hash join 
 ---

 Key: HIVE-10284
 URL: https://issues.apache.org/jira/browse/HIVE-10284
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Wei Zheng
 Attachments: HIVE-10284.1.patch, HIVE-10284.2.patch, 
 HIVE-10284.3.patch, HIVE-10284.4.patch, HIVE-10284.5.patch, 
 HIVE-10284.6.patch, HIVE-10284.7.patch, HIVE-10284.8.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8306) Map join sizing done by auto.convert.join.noconditionaltask.size doesn't take into account Hash table overhead and results in OOM

2015-04-15 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar resolved HIVE-8306.
---
Resolution: Fixed

Resolving since there is now the Hybrid Hybrid Grace hash table, which should 
handle underestimates gracefully.

 Map join sizing done by auto.convert.join.noconditionaltask.size doesn't take 
 into account Hash table overhead and results in OOM
 -

 Key: HIVE-8306
 URL: https://issues.apache.org/jira/browse/HIVE-8306
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Prasanth Jayachandran
Priority: Minor
 Attachments: query64_oom_trim.txt


 When hive.auto.convert.join.noconditionaltask = true, we check 
 noconditionaltask.size, and if the sum of the table sizes in the map join is 
 less than noconditionaltask.size, the plan generates a map join. The issue is 
 that this calculation doesn't take into account the overhead introduced by 
 the different HashTable implementations; as a result, if the sum of input 
 sizes is smaller than the noconditionaltask size by a small margin, queries 
 will hit OOM.
 TPC-DS query 64 is a good example of this issue, as the noconditionaltask 
 size is set to 1,280,000,000 while the sum of inputs is 1,012,379,321, which 
 is about 20% smaller than the expected size.
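The sizing decision described above can be sketched as follows; the 1.3 overhead factor is purely illustrative, not a value taken from Hive:

```java
public class MapJoinSizing {
    // Hypothetical overhead factor: in-memory hash table footprint per byte
    // of serialized input. The 1.3 value is illustrative only.
    static final double HASH_TABLE_OVERHEAD = 1.3;

    // Naive check from the report: compares raw input size to the threshold.
    static boolean fitsAsMapJoin(long sumOfInputSizes, long noConditionalTaskSize) {
        return sumOfInputSizes < noConditionalTaskSize;
    }

    // Accounts for hash-table overhead before comparing.
    static boolean fitsWithOverhead(long sumOfInputSizes, long noConditionalTaskSize) {
        return (long) (sumOfInputSizes * HASH_TABLE_OVERHEAD) < noConditionalTaskSize;
    }

    public static void main(String[] args) {
        long threshold = 1_280_000_000L; // noconditionaltask.size from the report
        long inputs = 1_012_379_321L;    // sum of input sizes from the report
        System.out.println(fitsAsMapJoin(inputs, threshold));    // true: map join chosen
        System.out.println(fitsWithOverhead(inputs, threshold)); // false: would fall back
    }
}
```

With the raw comparison the query 64 inputs fit under the threshold, but once any realistic overhead is applied they no longer do, which is exactly the OOM scenario described.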
 
 Vertex
 {code}
Map 28 - Map 11 (BROADCAST_EDGE), Map 12 (BROADCAST_EDGE), Map 14 
 (BROADCAST_EDGE), Map 15 (BROADCAST_EDGE), Map 16 (BROADCAST_EDGE), Map 24 
 (BROADCAST_EDGE), Map 26 (BROADCAST_EDGE), Map 30 (BROADCAST_EDGE), Map 31 
 (BROADCAST_EDGE), Map 32 (BROADCAST_EDGE), Map 39 (BROADCAST_EDGE), Map 40 
 (BROADCAST_EDGE), Map 43 (BROADCAST_EDGE), Map 45 (BROADCAST_EDGE), Map 5 
 (BROADCAST_EDGE)
 {code}
 Exception
 {code}
 , TaskAttempt 3 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:169)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 Caused by: java.lang.OutOfMemoryError: Java heap space
   at 
 org.apache.hadoop.hive.serde2.WriteBuffers.nextBufferToWrite(WriteBuffers.java:206)
   at 
 org.apache.hadoop.hive.serde2.WriteBuffers.write(WriteBuffers.java:182)
   at 
 org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$LazyBinaryKvWriter.writeKey(MapJoinBytesTableContainer.java:189)
   at 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.put(BytesBytesMultiHashMap.java:200)
   at 
 org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer.putRow(MapJoinBytesTableContainer.java:267)
   at 
 org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:114)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:184)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:210)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1036)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1040)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1040)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1040)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:37)
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.processRow(MapRecordProcessor.java:186)
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:164)
   at 
 

[jira] [Commented] (HIVE-10343) CBO (Calcite Return Path): Parameterize algorithm cost model

2015-04-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497467#comment-14497467
 ] 

Lefty Leverenz commented on HIVE-10343:
---

Doc note:  I added two labels (TODOC-CBO and TODOC1.2) because commit r1673948 
went to the cbo branch but this jira is marked with Fix Version 1.2.0.

The patch adds seven configuration parameters to HiveConf.java, so they need to 
be documented in the wiki for release 1.2.0 or whenever the cbo branch gets 
merged to trunk.  Another parameter is removed (*hive.cbo.costmodel.extended* 
which came from HIVE-10040).

* hive.cbo.costmodel.extended
* hive.cbo.costmodel.cpu
* hive.cbo.costmodel.network
* hive.cbo.costmodel.local.fs.write
* hive.cbo.costmodel.local.fs.read
* hive.cbo.costmodel.hdfs.write
* hive.cbo.costmodel.hdfs.read


 CBO (Calcite Return Path): Parameterize algorithm cost model
 

 Key: HIVE-10343
 URL: https://issues.apache.org/jira/browse/HIVE-10343
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
  Labels: TODOC-CBO, TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-10343.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10349) overflow in stats

2015-04-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497412#comment-14497412
 ] 

Sergey Shelukhin commented on HIVE-10349:
-

[~hagleitn] [~prasanth_j] [~mmokhtar] [~gopalv] Many queries in TPCDS suffer 
from the problem where there are 1000s of reducers (sometimes, 3 stages of 
600-700 reducers each); this is running on 6 nodes with ~6 slots each. Not sure 
if this is caused just by the stats problem or whether there are other problems 
with the physical optimizer, but it seems like a big perf issue that we should 
address soon.

 overflow in stats
 -

 Key: HIVE-10349
 URL: https://issues.apache.org/jira/browse/HIVE-10349
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Prasanth Jayachandran

 Discovered while running q17 in LLAP.
 {noformat}
 Reducer 2 
 Execution mode: llap
 Reduce Operator Tree:
   Merge Join Operator
 condition map:
  Inner Join 0 to 1
 keys:
   0 _col28 (type: int), _col27 (type: int)
   1 cs_bill_customer_sk (type: int), cs_item_sk (type: int)
 outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, 
 _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82
 Statistics: Num rows: 1047651367827495040 Data size: 
 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
 Map Join Operator
   condition map:
Inner Join 0 to 1
   keys:
 0 _col22 (type: int)
 1 d_date_sk (type: int)
   outputColumnNames: _col1, _col2, _col6, _col8, _col9, 
 _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, 
 _col82, _col86
   input vertices:
 1 Map 7
   Statistics: Num rows: 1152416529588199552 Data size: 
 9223372036854775807 Basic stats: COMPLETE Column stats: NONE
 {noformat}
 Data size overflows and the row count also looks wrong. I wonder if this is 
 why it generates 1009 reducers for this stage on 6 machines.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497423#comment-14497423
 ] 

Hive QA commented on HIVE-10319:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725625/HIVE-10319.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3452/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3452/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3452/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-3452/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'metastore/scripts/upgrade/derby/hive-schema-1.2.0.derby.sql'
Reverted 'metastore/scripts/upgrade/derby/upgrade-1.1.0-to-1.2.0.derby.sql'
Reverted 'metastore/scripts/upgrade/oracle/hive-schema-1.2.0.oracle.sql'
Reverted 'metastore/scripts/upgrade/oracle/upgrade-1.1.0-to-1.2.0.oracle.sql'
Reverted 
'metastore/scripts/upgrade/postgres/upgrade-1.1.0-to-1.2.0.postgres.sql'
Reverted 'metastore/scripts/upgrade/postgres/hive-schema-1.2.0.postgres.sql'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target 
shims/0.23/target shims/aggregator/target shims/common/target 
shims/scheduler/target packaging/target hbase-handler/target testutils/target 
testutils/metastore/dbs/derby testutils/metastore/dbs/oracle 
testutils/metastore/dbs/postgres jdbc/target metastore/target 
metastore/scripts/upgrade/derby/022-HIVE-10239.derby.sql 
metastore/scripts/upgrade/oracle/022-HIVE-10239.oracle.sql 
metastore/scripts/upgrade/postgres/022-HIVE-10239.postgres.sql itests/target 
itests/thirdparty itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target 
itests/hive-jmh/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target itests/qtest-spark/target hcatalog/target 
hcatalog/core/target hcatalog/streaming/target 
hcatalog/server-extensions/target hcatalog/webhcat/svr/target 
hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target 
accumulo-handler/target hwi/target common/target common/src/gen 
spark-client/target service/target contrib/target serde/target beeline/target 
odbc/target cli/target ql/dependency-reduced-pom.xml ql/target
+ svn update
U    ql/src/test/results/clientnegative/udf_next_day_error_1.q.out
U    ql/src/test/results/clientnegative/udf_add_months_error_1.q.out
U    ql/src/test/results/clientnegative/udf_next_day_error_2.q.out
U    ql/src/test/results/clientnegative/udf_last_day_error_1.q.out
U    ql/src/test/results/clientpositive/spark/vector_elt.q.out
U    ql/src/test/results/clientpositive/spark/load_dyn_part14.q.out
U    ql/src/test/results/clientpositive/spark/join8.q.out
U    ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out
U    ql/src/test/results/clientpositive/spark/auto_join8.q.out
U    ql/src/test/results/clientpositive/annotate_stats_select.q.out
U    ql/src/test/results/clientpositive/udf4.q.out
U    ql/src/test/results/clientpositive/udf_isnull_isnotnull.q.out
U    ql/src/test/results/clientpositive/decimal_udf.q.out
U    ql/src/test/results/clientpositive/udf_hour.q.out
U    ql/src/test/results/clientpositive/udf_if.q.out
U    ql/src/test/results/clientpositive/input8.q.out
U

[jira] [Commented] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497422#comment-14497422
 ] 

Hive QA commented on HIVE-10239:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725621/HIVE-10239.0.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8690 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3451/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3451/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3451/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725621 - PreCommit-HIVE-TRUNK-Build

 Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and 
 PostgreSQL
 

 Key: HIVE-10239
 URL: https://issues.apache.org/jira/browse/HIVE-10239
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, 
 HIVE-10239.0.patch, HIVE-10239.patch


 Need to create DB-implementation specific scripts to use the framework 
 introduced in HIVE-9800 to have any metastore schema changes tested across 
 all supported databases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9015) Constant Folding optimizer doesn't handle expressions involving null

2015-04-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9015:
---
Component/s: Logical Optimizer

 Constant Folding optimizer doesn't handle expressions involving null
 

 Key: HIVE-9015
 URL: https://issues.apache.org/jira/browse/HIVE-9015
 Project: Hive
  Issue Type: Task
  Components: Logical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 1.2.0


 Expressions which are guaranteed to evaluate to {{null}} aren't folded by 
 the optimizer yet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10302) Cache small tables in memory [Spark Branch]

2015-04-15 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-10302:
---
Attachment: HIVE-10302.spark-1.patch

 Cache small tables in memory [Spark Branch]
 ---

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.spark-1.patch


 If we can cache small tables in executor memory, we could save some time in 
 loading them from HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-15 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497393#comment-14497393
 ] 

Aihua Xu commented on HIVE-9917:


Somehow the vector_between_in test case baselines were not updated. Uploading a 
new patch to fix the test cases.

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting 
 int to timestamp. Since customers have been relying on the incorrect behavior 
 for so long, it is better to make it configurable so that one release 
 defaults to the old/inconsistent way and the next release defaults to the 
 new/consistent way. After that we can deprecate the old behavior.
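A minimal sketch of such a configurable conversion is below; the boolean flag and the exact old/new semantics (value-as-milliseconds vs. value-as-seconds) are assumptions for illustration, not the actual Hive behavior:

```java
public class IntToTimestampConversion {
    // Hypothetical flag standing in for the proposed config option.
    // Assumed semantics for illustration: the legacy behavior reads the
    // integer as epoch milliseconds, the fixed behavior as epoch seconds.
    static long toEpochMillis(long value, boolean legacyBehavior) {
        return legacyBehavior ? value : value * 1000L;
    }

    public static void main(String[] args) {
        System.out.println(toEpochMillis(1_000L, true));  // 1000
        System.out.println(toEpochMillis(1_000L, false)); // 1000000
    }
}
```

Flipping the default of such a flag between releases gives users one release of warning before the new semantics become the only behavior.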



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-9917:
---
Attachment: HIVE-9917.patch

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting 
 int to timestamp. Since customers have been relying on the incorrect behavior 
 for so long, it is better to make it configurable so that one release 
 defaults to the old/inconsistent way and the next release defaults to the 
 new/consistent way. After that we can deprecate the old behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-15 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-9917:
---
Attachment: (was: HIVE-9917.patch)

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting 
 int to timestamp. Since customers have been relying on the incorrect behavior 
 for so long, it is better to make it configurable so that one release 
 defaults to the old/inconsistent way and the next release defaults to the 
 new/consistent way. After that we can deprecate the old behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10350) CBO: Use total size instead of bucket count to determine number of splits parallelism

2015-04-15 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10350:
--
Attachment: HIVE-10350.2.patch

 CBO: Use total size instead of bucket count to determine number of splits  
 parallelism 
 

 Key: HIVE-10350
 URL: https://issues.apache.org/jira/browse/HIVE-10350
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch, HIVE-10350.2.patch


 Not an overflow, but parallelism ends up being -1 because it uses the number 
 of buckets:
 {code}
  final int parallelism = RelMetadataQuery.splitCount(join) == null
   ? 1 : RelMetadataQuery.splitCount(join);
 {code}
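The snippet above only guards against null, not against a negative split count; a sketch of the size-based alternative suggested by the title follows (the method names and the bytes-per-task figure are illustrative, not Hive's actual values):

```java
public class SplitParallelism {
    // Reported behavior: splitCount can come back -1 (derived from bucket
    // count), which the null-check alone does not guard against.
    static int parallelismFromSplitCount(Integer splitCount) {
        return splitCount == null ? 1 : splitCount;
    }

    // Hypothetical alternative: derive parallelism from the total data size
    // and a target bytes-per-task figure, with a floor of 1.
    static int parallelismFromSize(long totalBytes, long bytesPerTask) {
        return (int) Math.max(1L, (totalBytes + bytesPerTask - 1) / bytesPerTask);
    }

    public static void main(String[] args) {
        System.out.println(parallelismFromSplitCount(-1));                        // -1: the bug
        System.out.println(parallelismFromSize(1_000_000_000L, 256_000_000L));    // 4
    }
}
```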
 {code}
 2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 
 rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 
 rows, 1601785.714285714 cpu, -1698787.48 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected
 2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:apply(862)) - Plan After Join Reordering:
 HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = 
 {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 
 io}, id = 3000
   HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative 
 cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, 
 -1.1757664816220238E9 io}, id = 2998
 HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): 
 rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136
   HiveJoin(condition=[=($1, $5)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 
 5.557840182539682E7 cpu, -4299694.122023809 io}]): rowcount = 
 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3132
 HiveJoin(condition=[=($0, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, 
 -1.15248E9 io}]): rowcount = 5.5578005E7, cumulative cost = {5.7498805E7 
 rows, 5.9419605E7 cpu, -1.15248E9 io}, id = 3100
   HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative 
 cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992
 HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): 
 rowcount = 5.5578005E7, cumulative cost = {0}, id = 2878
   HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost 
 = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2978
 HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): 
 rowcount = 1920800.0, cumulative cost = {0}, id = 2868
 HiveJoin(condition=[=($10, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 
 cpu, -8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = 
 {1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130
   HiveJoin(condition=[=($0, $4)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 
 cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = 
 {1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128
 HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 
 7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982
   
 HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 
 7200.0, cumulative cost = {0}, id = 2871
 HiveJoin(condition=[=($3, $6)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 
 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative 
 cost = {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 
 io}, id = 3105
   HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], 
 c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], 
 c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 
 cpu, 0.0 io}, id = 2970
 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount 
 = 160.0, cumulative cost = {0}, id = 2862
   HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount = 
 892.8571428571428, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 

[jira] [Commented] (HIVE-10350) CBO: Use total size instead of bucket count to determine number of splits parallelism

2015-04-15 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497444#comment-14497444
 ] 

Laljo John Pullokkaran commented on HIVE-10350:
---

[~mmokhtar] I have uploaded a refined patch. Try it out.

 CBO: Use total size instead of bucket count to determine number of splits  
 parallelism 
 

 Key: HIVE-10350
 URL: https://issues.apache.org/jira/browse/HIVE-10350
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch, HIVE-10350.2.patch


 Not an overflow, but parallelism ends up being -1 because it uses the number 
 of buckets:
 {code}
  final int parallelism = RelMetadataQuery.splitCount(join) == null
   ? 1 : RelMetadataQuery.splitCount(join);
 {code}
 {code}
 2015-04-13 18:19:09,154 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - COMMON_JOIN cost: {1600892.857142857 
 rows, 2.4463782008994658E7 cpu, 8.54445445875E10 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(62)) - MAP_JOIN cost: {1600892.857142857 
 rows, 1601785.714285714 cpu, -1698787.48 io}
 2015-04-13 18:19:09,155 DEBUG [main]: cost.HiveCostModel 
 (HiveCostModel.java:getJoinCost(72)) - MAP_JOIN selected
 2015-04-13 18:19:09,157 DEBUG [main]: parse.CalcitePlanner 
 (CalcitePlanner.java:apply(862)) - Plan After Join Reordering:
 HiveSort(fetch=[100]): rowcount = 6006.726049749041, cumulative cost = 
 {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, -1.1757664816220238E9 
 io}, id = 3000
   HiveSort(sort0=[$0], dir0=[ASC]): rowcount = 6006.726049749041, cumulative 
 cost = {1.1468867492063493E8 rows, 1.166177684126984E8 cpu, 
 -1.1757664816220238E9 io}, id = 2998
 HiveProject(customer_id=[$4], customername=[concat($9, ', ', $8)]): 
 rowcount = 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3136
   HiveJoin(condition=[=($1, $5)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.557820341269841E7 rows, 
 5.557840182539682E7 cpu, -4299694.122023809 io}]): rowcount = 
 6006.726049749041, cumulative cost = {1.1468867492063493E8 rows, 
 1.166177684126984E8 cpu, -1.1757664816220238E9 io}, id = 3132
 HiveJoin(condition=[=($0, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{5.7498805E7 rows, 5.9419605E7 cpu, 
 -1.15248E9 io}]): rowcount = 5.5578005E7, cumulative cost = {5.7498805E7 
 rows, 5.9419605E7 cpu, -1.15248E9 io}, id = 3100
   HiveProject(sr_cdemo_sk=[$4]): rowcount = 5.5578005E7, cumulative 
 cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2992
 HiveTableScan(table=[[tpcds_bin_orc_200.store_returns]]): 
 rowcount = 5.5578005E7, cumulative cost = {0}, id = 2878
   HiveProject(cd_demo_sk=[$0]): rowcount = 1920800.0, cumulative cost 
 = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2978
 HiveTableScan(table=[[tpcds_bin_orc_200.customer_demographics]]): 
 rowcount = 1920800.0, cumulative cost = {0}, id = 2868
 HiveJoin(condition=[=($10, $1)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1787.9365079365077 rows, 1790.15873015873 
 cpu, -8000.0 io}]): rowcount = 198.4126984126984, cumulative cost = 
 {1611666.507936508 rows, 1619761.5873015872 cpu, -1.89867875E7 io}, id = 3130
   HiveJoin(condition=[=($0, $4)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{8985.714285714286 rows, 16185.714285714286 
 cpu, -1.728E7 io}]): rowcount = 1785.7142857142856, cumulative cost = 
 {1609878.5714285714 rows, 1617971.4285714284 cpu, -1.89787875E7 io}, id = 3128
 HiveProject(hd_demo_sk=[$0], hd_income_band_sk=[$1]): rowcount = 
 7200.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2982
   
 HiveTableScan(table=[[tpcds_bin_orc_200.household_demographics]]): rowcount = 
 7200.0, cumulative cost = {0}, id = 2871
 HiveJoin(condition=[=($3, $6)], joinType=[inner], 
 joinAlgorithm=[map_join], cost=[{1600892.857142857 rows, 1601785.714285714 
 cpu, -1698787.48 io}]): rowcount = 1785.7142857142856, cumulative 
 cost = {1600892.857142857 rows, 1601785.714285714 cpu, -1698787.48 
 io}, id = 3105
   HiveProject(c_customer_id=[$1], c_current_cdemo_sk=[$2], 
 c_current_hdemo_sk=[$3], c_current_addr_sk=[$4], c_first_name=[$8], 
 c_last_name=[$9]): rowcount = 160.0, cumulative cost = {0.0 rows, 0.0 
 cpu, 0.0 io}, id = 2970
 HiveTableScan(table=[[tpcds_bin_orc_200.customer]]): rowcount 
 = 160.0, cumulative cost = {0}, id = 2862
   HiveProject(ca_address_sk=[$0], ca_city=[$6]): rowcount 

[jira] [Commented] (HIVE-10356) LLAP: query80 fails with vectorization cast issue

2015-04-15 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497442#comment-14497442
 ] 

Matt McCline commented on HIVE-10356:
-

Looks like HIVE-10244.

 LLAP: query80 fails with vectorization cast issue 
 --

 Key: HIVE-10356
 URL: https://issues.apache.org/jira/browse/HIVE-10356
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Matt McCline

 Reducer 6 fails:
 {noformat}
 Error: Failure while running task:java.lang.RuntimeException: 
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing vector batch (tag=0) 
 \N\N09.285817653506076E84.639990363237801E7-1.1814318134524737E8
 \N\N01.2847032699693155E96.41569738480791E7-5.956161019898126E8
 \N\N04.682909323885761E82.288924051203157E7-5.995957665973593E7
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:332)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:422)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0) 
 \N\N09.285817653506076E84.639990363237801E7-1.1814318134524737E8
 \N\N01.2847032699693155E96.41569738480791E7-5.956161019898126E8
 \N\N04.682909323885761E82.288924051203157E7-5.995957665973593E7
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:267)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
   ... 14 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing vector batch (tag=0) 
 \N\N09.285817653506076E84.639990363237801E7-1.1814318134524737E8
 \N\N01.2847032699693155E96.41569738480791E7-5.956161019898126E8
 \N\N04.682909323885761E82.288924051203157E7-5.995957665973593E7
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:394)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:252)
   ... 16 more
 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
 org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupKeyHelper.copyGroupKey(VectorGroupKeyHelper.java:94)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeGroupBatches.processBatch(VectorGroupByOperator.java:729)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.process(VectorGroupByOperator.java:878)
   at 
 org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:378)
   ... 17 more
 ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex 
 vertex_1428572510173_0231_1_24 [Reducer 5] killed/failed due to:null]Vertex 
 killed, vertexName=Reducer 6, vertexId=vertex_1428572510173_0231_1_25, 
 diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as 
 other vertex failed. failedTasks:0, Vertex vertex_1428572510173_0231_1_25 
 [Reducer 6] killed/failed due to:null]DAG failed due to vertex failure. 
 failedVertices:1 killedVertices:1
 {noformat}
 How to repro: run query80 on scale factor 200. I might look tomorrow to see 
 if this is specific to LLAP or not
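The root cause in the trace is an unchecked downcast of a column vector in the group-key copy path. A generic sketch of the defensive pattern, using stand-in classes (Hive's real hierarchy lives in org.apache.hadoop.hive.ql.exec.vector and differs in detail), that replaces the bare ClassCastException with a descriptive error:

```java
public class VectorCastGuard {
    // Stand-ins for illustration only; not Hive's actual classes.
    static class ColumnVector {}
    static class DoubleColumnVector extends ColumnVector {}
    static class BytesColumnVector extends ColumnVector {}

    // Fail with a clear message when the batch's column type does not match
    // what the caller expects, instead of a bare ClassCastException.
    static BytesColumnVector expectBytes(ColumnVector col, int colIndex) {
        if (!(col instanceof BytesColumnVector)) {
            throw new IllegalStateException("Column " + colIndex
                + " expected BytesColumnVector but was "
                + col.getClass().getSimpleName());
        }
        return (BytesColumnVector) col;
    }

    public static void main(String[] args) {
        System.out.println(expectBytes(new BytesColumnVector(), 0).getClass().getSimpleName());
        try {
            expectBytes(new DoubleColumnVector(), 1);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```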



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key

2015-04-15 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10324:

Attachment: HIVE-10324.1.patch

Thanks [~szehon] for your review. Updated the patch to address backwards compatibility.

 Hive metatool should take table_param_key to allow for changes to avro 
 serde's schema url key
 -

 Key: HIVE-10324
 URL: https://issues.apache.org/jira/browse/HIVE-10324
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.1.0
Reporter: Szehon Ho
Assignee: Ferdinand Xu
 Attachments: HIVE-10324.1.patch, HIVE-10324.patch, 
 HIVE-10324.patch.WIP


 HIVE-3443 added support to change the serdeParams from 'metatool 
 updateLocation' command.
 However, in avro it is possible to specify the schema via the tableParams:
 {noformat}
 CREATE  TABLE `testavro`(
   `test` string COMMENT 'from deserializer')
 ROW FORMAT SERDE 
   'org.apache.hadoop.hive.serde2.avro.AvroSerDe' 
 STORED AS INPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' 
 OUTPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
 TBLPROPERTIES (
   'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 
   'kite.compression.type'='snappy', 
   'transient_lastDdlTime'='1427996456')
 {noformat}
 Hence for those tables the 'metatool updateLocation' will not help.
 This is necessary in case like upgrade the namenode to HA where the absolute 
 paths have changed.
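Until metatool covers tableParams, the per-table workaround is to rewrite the property with ALTER TABLE ... SET TBLPROPERTIES. A sketch under that assumption; it only builds the DDL string (executing it against HiveServer2, e.g. over JDBC, is omitted), and the new authority `nameservice1` is a placeholder for the HA nameservice:

```java
import java.util.Objects;

public class AvroSchemaUrlFix {
    // Rewrite the HDFS authority inside avro.schema.url, e.g.
    // hdfs://namenode:8020/tmp/test.avsc -> hdfs://nameservice1/tmp/test.avsc
    public static String alterSchemaUrl(String table, String oldAuthority,
                                        String newAuthority, String currentUrl) {
        String updated = currentUrl.replace(oldAuthority, newAuthority);
        return "ALTER TABLE " + Objects.requireNonNull(table)
            + " SET TBLPROPERTIES ('avro.schema.url'='" + updated + "')";
    }

    public static void main(String[] args) {
        System.out.println(alterSchemaUrl("testavro", "namenode:8020", "nameservice1",
            "hdfs://namenode:8020/tmp/test.avsc"));
    }
}
```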



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9252) Linking custom SerDe jar to table definition.

2015-04-15 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497585#comment-14497585
 ] 

Ferdinand Xu commented on HIVE-9252:


Sorry, I don't have cycles to work on this jira currently. It's on my TODO 
list. I will work on it soon. Thank you!

 Linking custom SerDe jar to table definition.
 -

 Key: HIVE-9252
 URL: https://issues.apache.org/jira/browse/HIVE-9252
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Niels Basjes
Assignee: Ferdinand Xu
 Attachments: HIVE-9252.1.patch


 In HIVE-6047 the option was created that a jar file can be hooked to the 
 definition of a function. (See: [Language Manual DDL: Permanent 
 Functions|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PermanentFunctions]
  )
 I propose to add something similar that can be used when defining an external 
 table that relies on a custom Serde (I expect to usually only have the 
 Deserializer).
 Something like this:
 {code}
 CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
 ...
 STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] 
 [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
 {code}
 Using this you can define (and share !!!) a Hive table on top of a custom 
 fileformat without the need to let the IT operations people deploy a custom 
 SerDe jar file on all nodes.
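Until a USING JAR clause like the one proposed exists for tables, the session-scoped workaround is to issue ADD JAR before the CREATE TABLE. A sketch of that two-statement sequence; the jar path and SerDe class name are placeholders, and the helper only assembles the statements:

```java
import java.util.Arrays;
import java.util.List;

public class SerdeJarWorkaround {
    // Return the statements a client must run, in order: register the SerDe
    // jar for this session, then create the table that references it.
    public static List<String> statements(String jarUri, String createTableDdl) {
        return Arrays.asList("ADD JAR " + jarUri, createTableDdl);
    }

    public static void main(String[] args) {
        for (String s : statements("hdfs:///libs/my-serde.jar",
                "CREATE EXTERNAL TABLE t (line STRING) ROW FORMAT SERDE 'com.example.MySerDe'")) {
            System.out.println(s);
        }
    }
}
```

The drawback, as the description notes, is exactly what the proposal fixes: every consumer of the table must repeat the ADD JAR step, rather than the jar being linked to the table definition itself.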



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10302) Cache small tables in memory [Spark Branch]

2015-04-15 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497390#comment-14497390
 ] 

Jimmy Xiang commented on HIVE-10302:


The patch is on RB: https://reviews.apache.org/r/33251/

 Cache small tables in memory [Spark Branch]
 ---

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.spark-1.patch


 If we can cache small tables in executor memory, we could save some time in 
 loading them from HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9015) Constant Folding optimizer doesn't handle expressions involving null

2015-04-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-9015.

   Resolution: Fixed
Fix Version/s: 1.2.0

Fixed via HIVE-9645

 Constant Folding optimizer doesn't handle expressions involving null
 

 Key: HIVE-9015
 URL: https://issues.apache.org/jira/browse/HIVE-9015
 Project: Hive
  Issue Type: Task
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 1.2.0


 Expressions which are guaranteed to evaluate to {{null}} aren't folded by 
 optimizer yet.
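A toy illustration of the missing fold, not Hive's actual optimizer code: under SQL semantics, arithmetic with a NULL operand is guaranteed to be NULL, so the optimizer may replace the whole expression with a NULL literal. Here a constant operand is modeled as a nullable Integer, with Java null standing in for SQL NULL:

```java
public class NullFold {
    // If either constant operand is SQL NULL, the addition folds to NULL
    // without evaluating anything else.
    static Integer foldPlus(Integer left, Integer right) {
        if (left == null || right == null) {
            return null; // NULL + x -> NULL
        }
        return left + right;
    }

    public static void main(String[] args) {
        System.out.println(foldPlus(null, 1)); // null
        System.out.println(foldPlus(2, 3));    // 5
    }
}
```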



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10270) Cannot use Decimal constants less than 0.1BD

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497481#comment-14497481
 ] 

Hive QA commented on HIVE-10270:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725676/HIVE-10270.5.patch

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 8691 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3454/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3454/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3454/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725676 - PreCommit-HIVE-TRUNK-Build

 Cannot use Decimal constants less than 0.1BD
 

 Key: HIVE-10270
 URL: https://issues.apache.org/jira/browse/HIVE-10270
 Project: Hive
  Issue Type: Bug
  Components: Types
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10270.1.patch, HIVE-10270.2.patch, 
 HIVE-10270.3.patch, HIVE-10270.4.patch, HIVE-10270.5.patch


 {noformat}
 hive> select 0.09765625BD;
 FAILED: IllegalArgumentException Decimal scale must be less than or equal to 
 precision
 {noformat}
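The arithmetic behind the error: the leading zero of 0.09765625 is not a significant digit, so the unscaled value 9765625 has precision 7 while the scale is 8, and a "scale must be less than or equal to precision" check rejects it. `java.math.BigDecimal` shows the same numbers:

```java
import java.math.BigDecimal;

public class DecimalPrecision {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("0.09765625");
        // precision counts significant digits of the unscaled value 9765625
        System.out.println(d.precision()); // 7
        // scale counts digits after the decimal point
        System.out.println(d.scale());     // 8
    }
}
```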



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key

2015-04-15 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497557#comment-14497557
 ] 

Szehon Ho commented on HIVE-10324:
--

Thanks! +1

 Hive metatool should take table_param_key to allow for changes to avro 
 serde's schema url key
 -

 Key: HIVE-10324
 URL: https://issues.apache.org/jira/browse/HIVE-10324
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.1.0
Reporter: Szehon Ho
Assignee: Ferdinand Xu
 Attachments: HIVE-10324.1.patch, HIVE-10324.patch, 
 HIVE-10324.patch.WIP


 HIVE-3443 added support to change the serdeParams from 'metatool 
 updateLocation' command.
 However, in avro it is possible to specify the schema via the tableParams:
 {noformat}
 CREATE  TABLE `testavro`(
   `test` string COMMENT 'from deserializer')
 ROW FORMAT SERDE 
   'org.apache.hadoop.hive.serde2.avro.AvroSerDe' 
 STORED AS INPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' 
 OUTPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
 TBLPROPERTIES (
   'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 
   'kite.compression.type'='snappy', 
   'transient_lastDdlTime'='1427996456')
 {noformat}
 Hence for those tables the 'metatool updateLocation' will not help.
 This is necessary in case like upgrade the namenode to HA where the absolute 
 paths have changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.

2015-04-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10306:
-
Attachment: HIVE-10306.4.patch

 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 -

 Key: HIVE-10306
 URL: https://issues.apache.org/jira/browse/HIVE-10306
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, 
 HIVE-10306.3.patch, HIVE-10306.4.patch


 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 We introduced this parameter via HIVE-10119.
 The logging param for levels is only relevant to HS2, so for hive-cli users 
 the hive.tez.exec.print.summary still makes sense. We can check for log-level 
 param as well, in places we are checking value of 
 hive.tez.exec.print.summary. Ie, consider hive.tez.exec.print.summary=true if 
 log.level = PERFORMANCE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10306) We need to print tez summary when hive.server2.logging.level = PERFORMANCE.

2015-04-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10306:
-
Attachment: (was: HIVE-10306.4.patch)

 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 -

 Key: HIVE-10306
 URL: https://issues.apache.org/jira/browse/HIVE-10306
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10306.1.patch, HIVE-10306.2.patch, 
 HIVE-10306.3.patch


 We need to print tez summary when hive.server2.logging.level = PERFORMANCE. 
 We introduced this parameter via HIVE-10119.
 The logging param for levels is only relevant to HS2, so for hive-cli users 
 the hive.tez.exec.print.summary still makes sense. We can check for log-level 
 param as well, in places we are checking value of 
 hive.tez.exec.print.summary. Ie, consider hive.tez.exec.print.summary=true if 
 log.level = PERFORMANCE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10040) CBO (Calcite Return Path): Pluggable cost modules [CBO branch]

2015-04-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10040:
--
Labels:   (was: TODOC-CBO)

 CBO (Calcite Return Path): Pluggable cost modules [CBO branch]
 --

 Key: HIVE-10040
 URL: https://issues.apache.org/jira/browse/HIVE-10040
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: cbo-branch

 Attachments: HIVE-10040.01.cbo.patch, HIVE-10040.02.cbo.patch, 
 HIVE-10040.03.cbo.patch, HIVE-10040.cbo.patch


 We should be able to deal with cost models in a modular way. Thus, the cost 
 model should be integrated within a Calcite MD provider that is pluggable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10040) CBO (Calcite Return Path): Pluggable cost modules [CBO branch]

2015-04-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497469#comment-14497469
 ] 

Lefty Leverenz commented on HIVE-10040:
---

No doc needed:  HIVE-10343 removed *hive.cbo.costmodel.extended* so I'm 
removing the TODOC-CBO label.

 CBO (Calcite Return Path): Pluggable cost modules [CBO branch]
 --

 Key: HIVE-10040
 URL: https://issues.apache.org/jira/browse/HIVE-10040
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: cbo-branch

 Attachments: HIVE-10040.01.cbo.patch, HIVE-10040.02.cbo.patch, 
 HIVE-10040.03.cbo.patch, HIVE-10040.cbo.patch


 We should be able to deal with cost models in a modular way. Thus, the cost 
 model should be integrated within a Calcite MD provider that is pluggable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10346) Tez on HBase has problems with settings again

2015-04-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497533#comment-14497533
 ] 

Hive QA commented on HIVE-10346:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725700/HIVE-10346.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8690 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3455/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3455/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3455/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725700 - PreCommit-HIVE-TRUNK-Build

 Tez on HBase has problems with settings again
 -

 Key: HIVE-10346
 URL: https://issues.apache.org/jira/browse/HIVE-10346
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10346.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.

2015-04-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496983#comment-14496983
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-9710:
-

[~vgumashta] Thanks for reviewing the change. I am adding the follow-up jira 
HIVE-10345 to cover the test case you raised.

Thanks
Hari

 HiveServer2 should support cookie based authentication, when using HTTP 
 transport.
 --

 Key: HIVE-9710
 URL: https://issues.apache.org/jira/browse/HIVE-9710
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 1.2.0
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, 
 HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch, HIVE-9710.7.patch, 
 HIVE-9710.8.patch


 HiveServer2 should generate cookies and validate the client cookie send to it 
 so that it need not perform User/Password or a Kerberos based authentication 
 on each HTTP request. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10345) Add test case to ensure client sends credentials in non-ssl mode when HS2 sends a secure cookie

2015-04-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10345:
-
Description: 
We need to add test cases to cover these scenarios.
_
Client |  HS2 Cookie   |  Expected Behavior
SSL   |  Secured |  Client replays, server validates the cookie.
SSL   |   Unsecured|   Client replays, server validates the cookie.
No SSL |   UnSecured|  Client replays, server validates the cookie.
No SSL |  Secured  |  Client should send back credentials since 
cookie
 |  |  replay will not be transmitted 
back to the server.

  was:
We need to add test cases to cover these scenarios.
_
Client |  HS2 Cookie   |  Expected Behavior
___| _ |___
SSL   |  Secured |  Client replays, server validates the cookie.
SSL   |   Unsecured|   Client replays, server validates the cookie.
No SSL |   UnSecured|  Client replays, server validates the cookie.
No SSL |  Secured  |  Client should send back credentials since 
cookie
 |  |  replay will not be transmitted 
back to the server.


 Add test case to ensure client sends credentials in non-ssl mode when HS2 
 sends a secure cookie
 ---

 Key: HIVE-10345
 URL: https://issues.apache.org/jira/browse/HIVE-10345
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

 We need to add test cases to cover these scenarios.
 _
 Client |  HS2 Cookie   |  Expected Behavior
 SSL   |  Secured |  Client replays, server validates the 
 cookie.
 SSL   |   Unsecured|   Client replays, server validates the 
 cookie.
 No SSL |   UnSecured|  Client replays, server validates the cookie.
 No SSL |  Secured  |  Client should send back credentials since 
 cookie
  |  |  replay will not be transmitted 
 back to the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10345) Add test case to ensure client sends credentials in non-ssl mode when HS2 sends a secure cookie

2015-04-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10345:
-
Description: 
We need to add test cases to cover these scenarios.
Client |  HS2 Cookie |  Expected Behavior
SSL    |  Secured    |  Client replays, server validates the cookie.
SSL    |  Unsecured  |  Client replays, server validates the cookie.
No SSL |  Unsecured  |  Client replays, server validates the cookie.
No SSL |  Secured    |  Client should send back credentials since cookie replay will not be transmitted back to the server.


  was:
We need to add test cases to cover these scenarios.
_
Client |  HS2 Cookie   |  Expected Behavior
SSL   |  Secured |  Client replays, server validates the cookie.
SSL   |   Unsecured|   Client replays, server validates the cookie.
No SSL |   UnSecured|  Client replays, server validates the cookie.
No SSL |  Secured  |  Client should send back credentials since 
cookie
 |  |  replay will not be transmitted 
back to the server.


 Add test case to ensure client sends credentials in non-ssl mode when HS2 
 sends a secure cookie
 ---

 Key: HIVE-10345
 URL: https://issues.apache.org/jira/browse/HIVE-10345
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

 We need to add test cases to cover these scenarios.
 _
 Client |  HS2 Cookie  |  Expected Behavior
 SSL    |  Secured     |  Client replays, server validates the cookie.
 SSL    |  Unsecured   |  Client replays, server validates the cookie.
 No SSL |  Unsecured   |  Client replays, server validates the cookie.
 No SSL |  Secured     |  Client should send back credentials since cookie replay will not be transmitted back to the server.
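The decision table above reduces to a single predicate. As a minimal sketch (class and method names here are illustrative, not part of the actual HiveServer2 client API):

```java
// Illustrative sketch of the cookie decision table; not actual Hive client code.
public class CookieAuthDecision {

    // A cookie marked "Secure" is only transmitted over SSL/TLS. So the one
    // case where the client cannot replay the cookie, and must fall back to
    // sending full credentials, is a non-SSL client with a secure cookie.
    public static boolean mustSendCredentials(boolean clientUsesSsl, boolean cookieIsSecure) {
        return !clientUsesSsl && cookieIsSecure;
    }

    public static void main(String[] args) {
        System.out.println(mustSendCredentials(true, true));   // false: cookie replayed
        System.out.println(mustSendCredentials(true, false));  // false: cookie replayed
        System.out.println(mustSendCredentials(false, false)); // false: cookie replayed
        System.out.println(mustSendCredentials(false, true));  // true: credentials sent
    }
}
```

The proposed test cases would exercise exactly these four combinations.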





[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns

2015-04-15 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497008#comment-14497008
 ] 

Jason Dere commented on HIVE-9580:
--

I think this looks fine. I would just say to make sure there are tests to cover 
the types that would get affected by this change (char/varchar/decimal joins), 
which it looks like there already are.

 Server returns incorrect result from JOIN ON VARCHAR columns
 

 Key: HIVE-9580
 URL: https://issues.apache.org/jira/browse/HIVE-9580
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Mike
Assignee: Aihua Xu
 Attachments: HIVE-9580.patch


 The database erroneously returns rows when joining two tables which each 
 contain a VARCHAR column and the join's ON condition uses the equality 
 operator on the VARCHAR columns.
 The following JDBC method exhibits the problem:
   static void joinIssue() throws SQLException {

       String sql;
       int rowsAffected;
       ResultSet rs;
       Statement stmt = con.createStatement();
       String table1_Name = "blahtab1";
       String table1A_Name = "blahtab1A";
       String table1B_Name = "blahtab1B";
       String table2_Name = "blahtab2";

       try {
           sql = "drop table " + table1_Name;
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("Drop table error:" + se.getMessage());
       }
       try {
           sql = "CREATE TABLE " + table1_Name + "(" +
                 "VCHARCOL VARCHAR(10) " +
                 ",INTEGERCOL INT " +
                 ")";
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("create table error:" + se.getMessage());
       }

       sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
       System.out.println("\nsql=" + sql);
       stmt.executeUpdate(sql);

       System.out.println("===");

       try {
           sql = "drop table " + table1A_Name;
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("Drop table error:" + se.getMessage());
       }
       try {
           sql = "CREATE TABLE " + table1A_Name + "(" +
                 "VCHARCOL VARCHAR(10) " +
                 ")";
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("create table error:" + se.getMessage());
       }

       sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
       System.out.println("\nsql=" + sql);
       stmt.executeUpdate(sql);

       System.out.println("===");

       try {
           sql = "drop table " + table1B_Name;
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("Drop table error:" + se.getMessage());
       }
       try {
           sql = "CREATE TABLE " + table1B_Name + "(" +
                 "VCHARCOL VARCHAR(11) " +
                 ",INTEGERCOL INT " +
                 ")";
           System.out.println("\nsql=" + sql);
           rowsAffected = stmt.executeUpdate(sql);
       }
       catch (SQLException se) {
           println("create table error:" + se.getMessage());
       }
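 The archived message is truncated before the join itself. A hypothetical query of the kind the report describes, using the tables created above (the exact join statement is an assumption, since it does not survive in the archive), might look like:

```sql
-- Hypothetical repro query (not from the original report): join the
-- VARCHAR(10) and VARCHAR(11) columns that both hold 'jklmnopqrs'.
SELECT t1.VCHARCOL, t1b.INTEGERCOL
FROM blahtab1 t1
JOIN blahtab1B t1b ON t1.VCHARCOL = t1b.VCHARCOL;
```

Per the report, a join of this shape returns incorrect rows on the affected versions.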
   
   

[jira] [Assigned] (HIVE-10335) LLAP: IndexOutOfBound in MapJoinOperator

2015-04-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-10335:
---

Assignee: Sergey Shelukhin

 LLAP: IndexOutOfBound in MapJoinOperator
 

 Key: HIVE-10335
 URL: https://issues.apache.org/jira/browse/HIVE-10335
 Project: Hive
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Sergey Shelukhin
 Fix For: llap


 {code}
 2015-04-14 13:57:55,889 [TezTaskRunner_attempt_1428572510173_0173_2_03_14_0(container_1_0173_01_66_sseth_20150414135750_7a7c2f4f-5f2d-4645-b833-677621f087bd:2_Map 1_14_0)] ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unexpected exception: Index: 0, Size: 0
 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
 at java.util.ArrayList.rangeCheck(ArrayList.java:653)
 at java.util.ArrayList.get(ArrayList.java:429)
 at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
 at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
 at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:386)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:283)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:232)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:240)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:630)
 at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:348)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
 at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:332)
 at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
 at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
 at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
 at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
 at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 {code}




