[jira] [Commented] (HIVE-3159) Update AvroSerde to determine schema of new tables

2013-12-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852724#comment-13852724
 ] 

Hive QA commented on HIVE-3159:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619455/HIVE-3159.6.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4805 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/697/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/697/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619455

 Update AvroSerde to determine schema of new tables
 --

 Key: HIVE-3159
 URL: https://issues.apache.org/jira/browse/HIVE-3159
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Jakob Homan
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-3159.4.patch, HIVE-3159.5.patch, HIVE-3159.6.patch, 
 HIVE-3159v1.patch


 Currently when writing tables to Avro one must manually provide an Avro 
 schema that matches what is being delivered by Hive. It'd be better to have 
 the serde infer this schema by converting the table's TypeInfo into an 
 appropriate AvroSchema.
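The conversion the ticket asks for can be sketched for the primitive types as follows. This is a Python illustration of the idea only: the names (`HIVE_TO_AVRO`, `table_to_avro_schema`) are made up here, and the real serde must also handle complex types, unions, and nullability.

```python
# Illustrative sketch: map Hive primitive type names to Avro types.
# The actual converter in Hive also handles structs, maps, arrays and
# unions; this shows only the primitive core of the idea.
HIVE_TO_AVRO = {
    "boolean": "boolean",
    "tinyint": "int",      # Avro has no 8/16-bit integer types
    "smallint": "int",
    "int": "int",
    "bigint": "long",
    "float": "float",
    "double": "double",
    "string": "string",
    "binary": "bytes",
}

def column_to_avro_field(name, hive_type):
    """Build an Avro field dict for one Hive column (nullable by default,
    since Hive columns can hold NULL)."""
    return {"name": name, "type": ["null", HIVE_TO_AVRO[hive_type]]}

def table_to_avro_schema(table_name, columns):
    """columns: list of (name, hive_type) pairs -> Avro record schema dict."""
    return {
        "type": "record",
        "name": table_name,
        "fields": [column_to_avro_field(n, t) for n, t in columns],
    }
```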



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table

2013-12-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852755#comment-13852755
 ] 

Hive QA commented on HIVE-5795:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619457/HIVE-5795.3.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 4796 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_file_with_header_footer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_file_with_header_footer_negative
org.apache.hive.service.auth.TestCustomAuthentication.testCustomAuthentication
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/698/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/698/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619457

 Hive should be able to skip header and footer rows when reading data file for 
 a table
 -

 Key: HIVE-5795
 URL: https://issues.apache.org/jira/browse/HIVE-5795
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch


 Hive should be able to skip header and footer lines when reading a data file 
 from a table. That way, users don't need to preprocess data generated by 
 other applications with a header or footer, and can use the file directly 
 for table operations.
 To implement this, the idea is to add new properties to the table 
 description that define the number of header and footer lines, and to skip 
 those lines when reading records from the record reader. A DDL example for 
 creating a table with a header and footer:
 {code}
 Create external table testtable (name string, message string) row format 
 delimited fields terminated by '\t' lines terminated by '\n' location 
 '/testtable' tblproperties ("skip.header.number"="1", 
 "skip.footer.number"="2");
 {code}
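A record reader can skip the footer without knowing the file length by buffering `footer_count` rows as it streams. A minimal Python sketch of that skipping logic (illustrative only, not Hive's implementation):

```python
from collections import deque

def rows_without_header_footer(rows, header_count, footer_count):
    """Yield rows, dropping the first header_count and last footer_count.

    Footer rows are dropped without knowing the total row count up front
    by holding footer_count rows in a small queue -- roughly the trick a
    record reader must use, since a split is consumed as a stream.
    """
    it = iter(rows)
    for _ in range(header_count):
        next(it, None)           # discard header lines
    buf = deque()
    for row in it:
        buf.append(row)
        if len(buf) > footer_count:
            yield buf.popleft()  # safe: footer_count rows stay buffered
```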



--


[jira] [Commented] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF

2013-12-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852777#comment-13852777
 ] 

Hive QA commented on HIVE-5829:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619469/HIVE-5829.3.patch

{color:green}SUCCESS:{color} +1 4799 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/699/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/699/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619469

 Rewrite Trim and Pad UDFs based on GenericUDF
 -

 Key: HIVE-5829
 URL: https://issues.apache.org/jira/browse/HIVE-5829
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-5829.1.patch, HIVE-5829.2.patch, HIVE-5829.3.patch, 
 tmp.HIVE-5829.patch


 This JIRA includes following UDFs:
 1. trim()
 2. ltrim()
 3. rtrim()
 4. lpad()
 5. rpad()
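As a reference for the expected behavior, the usual semantics of these five functions can be sketched in Python. This is a simplified model: the trim variants strip space characters only, and lpad/rpad truncate when the input is already longer than the target length. Treat it as an approximation of Hive's semantics, not a specification.

```python
def hive_trim(s):  return s.strip(" ")    # trim(): spaces off both ends
def hive_ltrim(s): return s.lstrip(" ")   # ltrim(): leading spaces only
def hive_rtrim(s): return s.rstrip(" ")   # rtrim(): trailing spaces only

def hive_lpad(s, length, pad):
    """lpad('hi', 5, '?') -> '???hi'; truncates s if already longer."""
    if len(s) >= length:
        return s[:length]
    need = length - len(s)
    return (pad * need)[:need] + s

def hive_rpad(s, length, pad):
    """rpad('hi', 4, 'ab') -> 'hiab'; truncates s if already longer."""
    if len(s) >= length:
        return s[:length]
    need = length - len(s)
    return s + (pad * need)[:need]
```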



--


[jira] [Created] (HIVE-6057) Enable bucketed sorted merge joins of arbitrary subqueries

2013-12-19 Thread Jan-Erik Hedbom (JIRA)
Jan-Erik Hedbom created HIVE-6057:
-

 Summary: Enable bucketed sorted merge joins of arbitrary subqueries
 Key: HIVE-6057
 URL: https://issues.apache.org/jira/browse/HIVE-6057
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Jan-Erik Hedbom
Priority: Minor


Currently, you cannot use bucketed SMJ when joining subquery results. It would 
make sense to be able to explicitly specify bucketing of the intermediate 
output from a subquery to enable bucketed SMJ.

For example, the following query will NOT use bucketed SMJ:
(gameends and dummymapping are clustered and sorted by hashid into 128 buckets)

select * from (select hashid,count(*) as c from gameends group by hashid 
distribute by hashid sort by hashid) e join dummymapping m on e.hashid=m.hashid

Suggestion: Implement an INTO n BUCKETS syntax for subqueries to enable 
bucketed SMJ:
select * from (select hashid,count(*) as c from gameends group by hashid 
distribute by hashid sort by hashid INTO 128 BUCKETS) e join dummymapping m on 
e.hashid=m.hashid
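For context, the per-bucket work a sorted merge join performs can be sketched as follows; being able to bucket and sort the subquery output is what would make this merge applicable. A simplified illustration of the merge step, not Hive's implementation:

```python
def merge_join(left, right):
    """Merge-join two lists of (key, value) pairs already sorted by key --
    the per-bucket work an SMJ does. Emits matched rows by advancing two
    cursors, without building an in-memory hash table."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # emit this left row against every right row with the same key
            j2 = j
            while j2 < len(right) and right[j2][0] == lk:
                out.append((lk, left[i][1], right[j2][1]))
                j2 += 1
            i += 1
    return out
```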



--


[jira] [Commented] (HIVE-6048) Hive load data command rejects file with '+' in the name

2013-12-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852809#comment-13852809
 ] 

Hive QA commented on HIVE-6048:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619491/HIVE-6048.1.patch

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 4794 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_case_sensitivity
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input7
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input9
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testsequencefile
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample7
org.apache.hadoop.hive.ql.parse.TestParse.testParse_subq
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/700/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/700/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619491

 Hive load data command rejects file with '+' in the name
 

 Key: HIVE-6048
 URL: https://issues.apache.org/jira/browse/HIVE-6048
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-6048.1.patch, HIVE-6048.patch


 '+' is a valid character in a file name on Linux and HDFS. However, loading 
 data from such a file into a table results in the following error:
 {code}
 hive> load data local inpath '/home/xzhang/temp/t+est.txt' into table test;
 Copying data from file:/home/xzhang/temp/t est.txt
 No files matching path: file:/home/xzhang/temp/t est.txt
 FAILED: Execution Error, return code 3 from 
 org.apache.hadoop.hive.ql.exec.CopyTask
 {code}
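The transformation visible in the error (`t+est.txt` becoming `t est.txt`) matches URL/query-string decoding, where '+' is an escape for a space. A Python demonstration of that decoding rule (the Python behavior is real; that Hive's copy path hits exactly such a decoder is an inference from the symptom, not confirmed here):

```python
from urllib.parse import unquote_plus

# In query-string decoding, '+' means space. If a file path is
# (incorrectly) run through such a decoder, 't+est.txt' comes out as
# 't est.txt' -- matching the "No files matching path" error above.
decoded = unquote_plus("file:/home/xzhang/temp/t+est.txt")
# decoded == "file:/home/xzhang/temp/t est.txt"
```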



--


[jira] [Commented] (HIVE-3384) HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification

2013-12-19 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852813#comment-13852813
 ] 

Mikhail Antonov commented on HIVE-3384:
---

It looks like 0.11 doesn't compile on OpenJDK 7. I'm trying to compile it and 
getting the same errors in the jdbc module:

Am I doing something wrong? I'm just running {{ant clean package test}}.



 Compiling 28 source files to /projects/apache/hive/build/jdbc/classes
[javac] 
/projects/apache/hive/jdbc/src/java/org/apache/hive/jdbc/HiveCallableStatement.java:48:
 error: HiveCallableStatement is not abstract and does not override abstract 
method <T> T getObject(String, Class<T>) in CallableStatement
[javac] public class HiveCallableStatement implements 
java.sql.CallableStatement {
[javac]        ^
[javac]   where T is a type-variable:
[javac]     T extends Object declared in method 
<T> T getObject(String, Class<T>)
[javac] 
/projects/apache/hive/jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java:65:
 error: HiveConnection is not abstract and does not override abstract method 
getNetworkTimeout() in Connection
[javac] public class HiveConnection implements java.sql.Connection {




 HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC 
 specification
 --

 Key: HIVE-3384
 URL: https://issues.apache.org/jira/browse/HIVE-3384
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.10.0
Reporter: Weidong Bian
Assignee: Chris Drome
Priority: Minor
 Fix For: 0.11.0

 Attachments: D6873-0.9.1.patch, D6873.1.patch, D6873.2.patch, 
 D6873.3.patch, D6873.4.patch, D6873.5.patch, D6873.6.patch, D6873.7.patch, 
 HIVE-3384-0.10.patch, HIVE-3384-2012-12-02.patch, HIVE-3384-2012-12-04.patch, 
 HIVE-3384-branch-0.9.patch, HIVE-3384.2.patch, HIVE-3384.patch, 
 HIVE-JDK7-JDBC.patch


 The jdbc module couldn't be compiled with JDK 7, as the JDBC specification 
 adds some abstract methods. 
 Some error info:
  error: HiveCallableStatement is not abstract and does not override abstract
  method <T> T getObject(String, Class<T>) in CallableStatement
 .
 .
 .



--


[jira] [Commented] (HIVE-3384) HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification

2013-12-19 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852816#comment-13852816
 ] 

Mikhail Antonov commented on HIVE-3384:
---

Oh, I guess it's now HIVE-4496. Sorry.

 HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC 
 specification
 --

 Key: HIVE-3384
 URL: https://issues.apache.org/jira/browse/HIVE-3384
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.10.0
Reporter: Weidong Bian
Assignee: Chris Drome
Priority: Minor
 Fix For: 0.11.0

 Attachments: D6873-0.9.1.patch, D6873.1.patch, D6873.2.patch, 
 D6873.3.patch, D6873.4.patch, D6873.5.patch, D6873.6.patch, D6873.7.patch, 
 HIVE-3384-0.10.patch, HIVE-3384-2012-12-02.patch, HIVE-3384-2012-12-04.patch, 
 HIVE-3384-branch-0.9.patch, HIVE-3384.2.patch, HIVE-3384.patch, 
 HIVE-JDK7-JDBC.patch


 The jdbc module couldn't be compiled with JDK 7, as the JDBC specification 
 adds some abstract methods. 
 Some error info:
  error: HiveCallableStatement is not abstract and does not override abstract
  method <T> T getObject(String, Class<T>) in CallableStatement
 .
 .
 .



--


[jira] [Commented] (HIVE-4496) JDBC2 won't compile with JDK7

2013-12-19 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852817#comment-13852817
 ] 

Mikhail Antonov commented on HIVE-4496:
---

If it affects 0.11 (compilation for JDK 7), are there any plans/possibilities 
to backport it to 0.11? It seems 0.11 doesn't compile under JDK 7 now.

 JDBC2 won't compile with JDK7
 -

 Key: HIVE-4496
 URL: https://issues.apache.org/jira/browse/HIVE-4496
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.12.0
Reporter: Chris Drome
Assignee: Chris Drome
 Fix For: 0.12.0

 Attachments: HIVE-4496-1.patch, HIVE-4496-2.patch, HIVE-4496.patch


 HiveServer2 related JDBC does not compile with JDK7. Related to HIVE-3384.



--


Hive-trunk-hadoop2 - Build # 613 - Still Failing

2013-12-19 Thread Apache Jenkins Server
Changes for Build #573
[navis] HIVE-5827 : Incorrect location of logs for failed tests (Vikram Dixit K 
and Szehon Ho via Navis)

[thejas] HIVE-4485 : beeline prints null as empty strings (Thejas Nair reviewed 
by Ashutosh Chauhan)

[brock] HIVE-5704 - A couple of generic UDFs are not in the right 
folder/package (Xuefu Zhang via Brock Noland)

[brock] HIVE-5706 - Move a few numeric UDFs to generic implementations (Xuefu 
Zhang via Brock Noland)

[hashutosh] HIVE-5817 : column name to index mapping in VectorizationContext is 
broken (Remus Rusanu, Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5876 : Split elimination in ORC breaks for partitioned tables 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5886 : [Refactor] Remove unused class JobCloseFeedback 
(Ashutosh Chauhan via Thejas Nair)

[brock] HIVE-5894 - Fix minor PTest2 issues (Brock Noland)


Changes for Build #574
[brock] HIVE-5755 - Fix hadoop2 execution environment Milestone 1 (Vikram Dixit 
K via Brock Noland)


Changes for Build #575
[xuefu] HIVE-5893: hive-schema-0.13.0.mysql.sql contains reference to 
nonexistent column (Carl via Xuefu)

[xuefu] HIVE-5684: Serde support for char (Jason via Xuefu)


Changes for Build #576

Changes for Build #577

Changes for Build #578

Changes for Build #579
[brock] HIVE-5441 - Async query execution doesn't return resultset status 
(Prasad Mujumdar via Thejas M Nair)

[brock] HIVE-5880 - Rename HCatalog HBase Storage Handler artifact id (Brock 
Noland reviewed by Prasad Mujumdar)


Changes for Build #580
[ehans] HIVE-5581: Implement vectorized year/month/day... etc. for string 
arguments (Teddy Choi via Eric Hanson)


Changes for Build #581
[rhbutani] HIVE-5898 Make fetching of column statistics configurable (Prasanth 
Jayachandran via Harish Butani)


Changes for Build #582
[brock] HIVE-5880 - (Rename HCatalog HBase Storage Handler artifact id) breaks 
packaging (Xuefu Zhang via Brock Noland)


Changes for Build #583
[xuefu] HIVE-5866: Hive divide operator generates wrong results in certain 
cases (reviewed by Prasad)

[ehans] HIVE-5877: Implement vectorized support for IN as boolean-valued 
expression (Eric Hanson)


Changes for Build #584
[thejas] HIVE-5550 : Import fails for tables created with default text, 
sequence and orc file formats using HCatalog API (Sushanth Sowmyan via Thejas 
Nair)

[ehans] HIVE-5895: vectorization handles division by zero differently from 
normal execution (Sergey Shelukhin via Eric Hanson)

[hashutosh] HIVE-5938 : Remove apache.mina dependency for test (Navis via 
Ashutosh Chauhan)

[xuefu] HIVE-5912: Show partition command doesn't support db.table (Yu Zhao via 
Xuefu)

[brock] HIVE-5906 - TestGenericUDFPower should use delta to compare doubles 
(Szehon Ho via Brock Noland)

[brock] HIVE-5855 - Add deprecated methods back to ColumnProjectionUtils (Brock 
Noland reviewed by Navis)

[brock] HIVE-5915 - Shade Kryo dependency (Brock Noland reviewed by Ashutosh 
Chauhan)


Changes for Build #585
[hashutosh] HIVE-5916 : No need to aggregate statistics collected via counter 
mechanism (Ashutosh Chauhan via Navis)

[xuefu] HIVE-5947: Fix test failure in decimal_udf.q (reviewed by Brock)


Changes for Build #586
[hashutosh] HIVE-5935 : hive.query.string is not provided to FetchTask (Navis 
via Ashutosh Chauhan)

[navis] HIVE-3455 : ANSI CORR(X,Y) is incorrect (Maxim Bolotin via Navis)

[hashutosh] HIVE-5921 : Better heuristics for worst case statistics estimates 
for join, limit and filter operator (Prasanth J via Harish Butani)

[rhbutani] HIVE-5899 NPE during explain extended with char/varchar columns 
(Jason Dere via Harish Butani)


Changes for Build #587
[xuefu] HIVE-3181: getDatabaseMajor/Minor version does not return values 
(Szehon via Xuefu, reviewed by Navis)

[brock] HIVE-5641 - BeeLineOpts ignores Throwable (Brock Noland reviewed by 
Prasad and Thejas)

[hashutosh] HIVE-5909 : locate and instr throw 
java.nio.BufferUnderflowException when empty string as substring (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-5686 : partition column type validation doesn't quite work for 
dates (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5887 : metastore direct sql doesn't work with oracle (Sergey 
Shelukhin via Ashutosh Chauhan)


Changes for Build #588

Changes for Build #589

Changes for Build #590
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #591
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes for Build #592
[hashutosh] HIVE-5982 : 

Re: Review Request 16339: HIVE-6052 metastore JDO filter pushdown for integers may produce unexpected results with non-normalized integer columns

2013-12-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16339/#review30694
---



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/16339/#comment58798

This should be on by default. After HIVE-5297 storing column values in 
non-canonical forms should not happen.



ql/src/test/queries/clientpositive/alter_partition_coltype.q
https://reviews.apache.org/r/16339/#comment58799

This should work even with config set to true. No?


- Ashutosh Chauhan


On Dec. 19, 2013, 2:17 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/16339/
 ---
 
 (Updated Dec. 19, 2013, 2:17 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java fa3e048 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
 a98d9d1 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 04d399f 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
  93e9942 
   ql/src/test/queries/clientpositive/alter_partition_coltype.q 5479afb 
   ql/src/test/queries/clientpositive/annotate_stats_part.q 83510e3 
   ql/src/test/queries/clientpositive/dynamic_partition_skip_default.q 397a220 
   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 27b1fbc 
   ql/src/test/results/clientpositive/annotate_stats_part.q.out 87fb980 
   ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 
 baee525 
 
 Diff: https://reviews.apache.org/r/16339/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Updated] (HIVE-6052) metastore JDO filter pushdown for integers may produce unexpected results with non-normalized integer columns

2013-12-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6052:
---

Status: Open  (was: Patch Available)

Few comments on RB.

 metastore JDO filter pushdown for integers may produce unexpected results 
 with non-normalized integer columns
 -

 Key: HIVE-6052
 URL: https://issues.apache.org/jira/browse/HIVE-6052
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6052.01.patch, HIVE-6052.patch


 If integer partition columns have values stored in non-canonical form, for 
 example with leading zeroes, the integer filter doesn't work. That is because 
 JDO pushdown uses substrings to compare for equality, and SQL pushdown is 
 intentionally crippled to do the same so that both produce the same results.
 Probably, since both SQL pushdown and integer pushdown are just perf 
 optimizations, we can remove it for JDO (or make it configurable and disabled 
 by default), and uncripple SQL.
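The effect of comparing non-canonical integer values as strings is easy to demonstrate (plain Python, purely illustrative):

```python
# Why substring/string-based pushdown breaks for integers stored
# non-canonically: partition values live as strings in the metastore,
# so '012' != '12' even though the integers are equal.
stored = "012"   # non-canonical partition value, with a leading zero
literal = "12"   # filter constant in canonical form

string_equal = (stored == literal)            # False: what string pushdown sees
integer_equal = (int(stored) == int(literal)) # True: what the user expects
```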



--


[jira] [Commented] (HIVE-5951) improve performance of adding partitions from client

2013-12-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852835#comment-13852835
 ] 

Ashutosh Chauhan commented on HIVE-5951:


Can you comment on why we need to add a new API to the metastore thrift 
interface? This is a public API, and there should be strong justification for 
why the existing API (add_partitions()) doesn't fit the requirement and a new 
one is required.

 improve performance of adding partitions from client
 

 Key: HIVE-5951
 URL: https://issues.apache.org/jira/browse/HIVE-5951
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5951.01.patch, HIVE-5951.02.patch, 
 HIVE-5951.03.patch, HIVE-5951.nogen.patch, HIVE-5951.nogen.patch, 
 HIVE-5951.nogen.patch, HIVE-5951.nogen.patch, HIVE-5951.patch


 Adding partitions to the metastore is currently very inefficient. There are 
 small things like, for the !ifNotExists case, DDLSemanticAnalyzer gets the 
 full partition object for every spec (which is a network call to the 
 metastore) and then discards it instantly; there's also the general problem 
 that too much processing is done on the client side. DDLSA should analyze the 
 query and make one call to the metastore (or maybe a set of batched calls if 
 there are too many partitions in the command); the metastore should then 
 figure out the details and insert in batch.
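The batching suggested above amounts to chunking the partition specs on the client side; a minimal sketch (illustrative only; the batch size and API shape here are assumptions, not Hive's actual interface):

```python
def batched(specs, batch_size):
    """Split a list of partition specs into metastore-call-sized chunks,
    so the client issues one add_partitions()-style call per chunk instead
    of one round trip per partition (batch_size is an illustrative knob)."""
    for i in range(0, len(specs), batch_size):
        yield specs[i:i + batch_size]
```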



--


[jira] [Updated] (HIVE-5891) Alias conflict when merging multiple mapjoin tasks into their common child mapred task

2013-12-19 Thread Sun Rui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sun Rui updated HIVE-5891:
--

Description: 
Use the following test case with HIVE 0.12:

{code:sql}
create table src(key int, value string);
load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
select * from (
  select c.key from
(select a.key from src a join src b on a.key=b.key group by a.key) tmp
join src c on tmp.key=c.key
  union all
  select c.key from
(select a.key from src a join src b on a.key=b.key group by a.key) tmp
join src c on tmp.key=c.key
) x;
{code}

We will get a NullPointerException from Union Operator:

{noformat}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing row {_col0:0}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row {_col0:0}
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:544)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
... 4 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:120)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:652)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:655)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:220)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
... 5 more
{noformat}
  
The root cause is in 
CommonJoinTaskDispatcher.mergeMapJoinTaskIntoItsChildMapRedTask().
{noformat}
  +--+  +--+
  | MapJoin task |  | MapJoin task |
  +--+  +--+
 \ /
  \   /
 +--+
 |  Union task  |
 +--+
{noformat} 
CommonJoinTaskDispatcher merges the two MapJoin tasks into their common child: 
Union task. The two MapJoin tasks have the same alias name for their big 
tables: $INTNAME, which is the name of the temporary table of a join stream. 
The aliasToWork map uses alias as key, so eventually only the MapJoin operator 
tree of one MapJoin task is saved into the aliasToWork map of the Union task, 
while the MapJoin operator tree of the other MapJoin task is lost. As a result, 
the Union operator won't be initialized, because not all of its parents get 
initialized (the Union operator itself indicates it has two parents, but it 
actually has only one, because the other parent is lost).
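The overwrite is easy to see as a plain map operation (a Python sketch of the collision, not Hive code):

```python
# Sketch of the aliasToWork collision described above: both merged MapJoin
# tasks register their big table under the same alias, so the second put
# silently overwrites the first and one operator tree is dropped.
alias_to_work = {}
alias_to_work["$INTNAME"] = "MapJoin operator tree #1"
alias_to_work["$INTNAME"] = "MapJoin operator tree #2"  # overwrites tree #1

assert len(alias_to_work) == 1  # the Union task now sees only one parent
```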

This issue does not exist in HIVE 0.11 and thus is a regression bug in HIVE 
0.12.

The proposed solution is to use the query ID as a prefix for the join stream 
name to avoid the conflict, and to add sanity-check code in 
CommonJoinTaskDispatcher so that the merge of a MapJoin task into its child 
MapRed task is skipped if there is any alias conflict. Please review the 
patch. I am not sure if the patch properly handles the case of DemuxOperator.

BTW, does anyone know the origin of $INTNAME? It is confusing; maybe we can 
replace it with a meaningful name.

  was:
Use the following test case with HIVE 0.12:

{sql}
create table src(key int, value string);
load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
select * from (
  select c.key from
(select a.key from src a join src b on a.key=b.key group by a.key) tmp
join src c on tmp.key=c.key
  union all
  select c.key 

[jira] [Updated] (HIVE-5891) Alias conflict when merging multiple mapjoin tasks into their common child mapred task

2013-12-19 Thread Sun Rui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sun Rui updated HIVE-5891:
--

Description: 
Use the following test case with HIVE 0.12:

{sql}
create table src(key int, value string);
load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
select * from (
  select c.key from
(select a.key from src a join src b on a.key=b.key group by a.key) tmp
join src c on tmp.key=c.key
  union all
  select c.key from
(select a.key from src a join src b on a.key=b.key group by a.key) tmp
join src c on tmp.key=c.key
) x;
{sql}

We will get a NullPointerException from Union Operator:

{noformat}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing row {_col0:0}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row {_col0:0}
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:544)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
... 4 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:120)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:652)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:655)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:220)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
... 5 more
{noformat}
  
The root cause is in 
CommonJoinTaskDispatcher.mergeMapJoinTaskIntoItsChildMapRedTask().
{noformat}
  +--+  +--+
  | MapJoin task |  | MapJoin task |
  +--+  +--+
 \ /
  \   /
 +--+
 |  Union task  |
 +--+
{noformat} 
CommonJoinTaskDispatcher merges the two MapJoin tasks into their common child: 
Union task. The two MapJoin tasks have the same alias name for their big 
tables: $INTNAME, which is the name of the temporary table of a join stream. 
The aliasToWork map uses alias as key, so eventually only the MapJoin operator 
tree of one MapJoin task is saved into the aliasToWork map of the Union task, 
while the MapJoin operator tree of the other MapJoin task is lost. As a result, 
the Union operator won't be initialized, because not all of its parents get 
initialized (the Union operator indicates it has two parents, but it actually 
has only one, because the other parent is lost).

This issue does not exist in HIVE 0.11 and thus is a regression bug in HIVE 
0.12.

The proposed solution is to use the query ID as a prefix for the join stream 
name to avoid the conflict, and to add sanity-check code in 
CommonJoinTaskDispatcher so that the merge of a MapJoin task into its child 
MapRed task is skipped if there is any alias conflict. Please review the patch. 
I am not sure whether the patch properly handles the case of DemuxOperator.
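The failure mode and the fix can be sketched with a plain map (a minimal illustration, not Hive's actual CommonJoinTaskDispatcher code; only the alias $INTNAME and the idea of an id prefix come from this description — the class, method names, and the "subq1"/"subq2" prefixes are hypothetical stand-ins for per-stream ids):

```java
import java.util.HashMap;
import java.util.Map;

public class AliasConflictSketch {
    // aliasToWork is keyed by alias, so a second put() with the same
    // alias silently replaces the first operator tree -- the bug.
    static Map<String, String> merge(String[] aliases, String[] opTrees) {
        Map<String, String> aliasToWork = new HashMap<>();
        for (int i = 0; i < aliases.length; i++) {
            aliasToWork.put(aliases[i], opTrees[i]);
        }
        return aliasToWork;
    }

    public static void main(String[] args) {
        // Both MapJoin tasks use the bare join-stream alias: one tree is lost.
        Map<String, String> broken = merge(
                new String[] {"$INTNAME", "$INTNAME"},
                new String[] {"mapJoinTree1", "mapJoinTree2"});
        System.out.println(broken.size()); // 1

        // With a per-stream id prefix the keys no longer collide.
        Map<String, String> fixed = merge(
                new String[] {"subq1:$INTNAME", "subq2:$INTNAME"},
                new String[] {"mapJoinTree1", "mapJoinTree2"});
        System.out.println(fixed.size()); // 2
    }
}
```

The sanity check in the patch follows the same idea in reverse: before merging, verify that no alias key would be overwritten, and skip the merge if one would.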

BTW, does anyone know the origin of $INTNAME? It is confusing; maybe we can 
replace it with a meaningful name.

  was:
Use the following test case with HIVE 0.12:

{quote}
create table src(key int, value string);
load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
select * from (
  select c.key from
(select a.key from src a join src b on a.key=b.key group by a.key) tmp
join src c on tmp.key=c.key
  union all
  select c.key from

[jira] [Updated] (HIVE-5891) Alias conflict when merging multiple mapjoin tasks into their common child mapred task

2013-12-19 Thread Sun Rui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sun Rui updated HIVE-5891:
--

Description: 
Use the following test case with HIVE 0.12:

{code:sql}
create table src(key int, value string);
load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
select * from (
  select c.key from
(select a.key from src a join src b on a.key=b.key group by a.key) tmp
join src c on tmp.key=c.key
  union all
  select c.key from
(select a.key from src a join src b on a.key=b.key group by a.key) tmp
join src c on tmp.key=c.key
) x;
{code}

We will get a NullPointerException from Union Operator:

{noformat}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing row {_col0:0}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row {_col0:0}
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:544)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
... 4 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:120)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:652)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:655)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:220)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
... 5 more
{noformat}
  
The root cause is in 
CommonJoinTaskDispatcher.mergeMapJoinTaskIntoItsChildMapRedTask().
{noformat}
  +--+  +--+
  | MapJoin task |  | MapJoin task |
  +--+  +--+
 \ /
  \   /
 +--+
 |  Union task  |
 +--+
{noformat} 
CommonJoinTaskDispatcher merges the two MapJoin tasks into their common child: 
Union task. The two MapJoin tasks have the same alias name for their big 
tables: $INTNAME, which is the name of the temporary table of a join stream. 
The aliasToWork map uses alias as key, so eventually only the MapJoin operator 
tree of one MapJoin task is saved into the aliasToWork map of the Union task, 
while the MapJoin operator tree of the other MapJoin task is lost. As a result, 
the Union operator won't be initialized, because not all of its parents get 
initialized (the Union operator indicates it has two parents, but it actually 
has only one, because the other parent is lost).

This issue does not exist in HIVE 0.11 and thus is a regression bug in HIVE 
0.12.

The proposed solution is to use the query ID as a prefix for the join stream 
name to avoid the conflict, and to add sanity-check code in 
CommonJoinTaskDispatcher so that the merge of a MapJoin task into its child 
MapRed task is skipped if there is any alias conflict. Please review the patch. 
I am not sure whether the patch properly handles the case of DemuxOperator.

BTW, does anyone know the origin of $INTNAME? It is confusing; maybe we can 
replace it with a meaningful name.

  was:
Use the following test case with HIVE 0.12:

{code:sql}
create table src(key int, value string);
load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
select * from (
  select c.key from
(select a.key from src a join src b on a.key=b.key group by a.key) tmp
join src c on tmp.key=c.key
  union all
  select c.key 

[jira] [Commented] (HIVE-6041) Incorrect task dependency graph for skewed join optimization

2013-12-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852861#comment-13852861
 ] 

Ashutosh Chauhan commented on HIVE-6041:


+1

 Incorrect task dependency graph for skewed join optimization
 

 Key: HIVE-6041
 URL: https://issues.apache.org/jira/browse/HIVE-6041
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0
 Environment: Hadoop 1.0.3
Reporter: Adrian Popescu
Assignee: Navis
Priority: Critical
 Attachments: HIVE-6041.1.patch.txt


 The dependency graph among task stages is incorrect for the skewed join 
 optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. 
 For the case that skewed keys do not exist, all the tasks following the 
 common join are filtered out at runtime.
 In particular, the conditional task in the optimized plan maintains no 
 dependency with the child tasks of the common join task in the original plan. 
 The conditional task is composed of the map join task which maintains all 
 these dependencies, but for the case the map join task is filtered out (i.e., 
 no skewed keys exist), all these dependencies are lost. Hence, all the other 
 task stages of the query (e.g., move stage which writes down the results into 
 the result table) are skipped.
 The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
 processSkewJoin() function, immediately after the ConditionalTask is created 
 and its dependencies are set.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HIVE-6058) Creating Table with column name exchange gives exception

2013-12-19 Thread Anilkumar Kalshetti (JIRA)
Anilkumar Kalshetti created HIVE-6058:
-

 Summary: Creating Table with column name "exchange" gives exception
 Key: HIVE-6058
 URL: https://issues.apache.org/jira/browse/HIVE-6058
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
 Environment: Hadoop 2.0.1-beta, Hive 0.12 , running hiveserver , 
Installed on CentOS6.4 64-bit OS in Multinode configuration
Reporter: Anilkumar Kalshetti


1. Create Table using below script
CREATE TABLE test_column_name2  ( c1   string) ROW FORMAT DELIMITED STORED AS 
SEQUENCEFILE;

2. Table is created successfully.
Now Create another Table with column name exchange
by using below script

CREATE TABLE test_column_name  ( exchange   string) ROW FORMAT DELIMITED 
STORED AS SEQUENCEFILE;

3. This gives exception for column name exchange, for Hive 0.12
4. The script properly works for Hive 0.11





--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6058) Creating Table with column name exchange gives exception

2013-12-19 Thread Anilkumar Kalshetti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anilkumar Kalshetti updated HIVE-6058:
--

Attachment: table_columnname_exchange_exception.png

Hive 0.12

 Creating Table with column name "exchange" gives exception
 

 Key: HIVE-6058
 URL: https://issues.apache.org/jira/browse/HIVE-6058
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
 Environment: Hadoop 2.0.1-beta, Hive 0.12 , running hiveserver , 
 Installed on CentOS6.4 64-bit OS in Multinode configuration
Reporter: Anilkumar Kalshetti
 Attachments: table_columnname_exchange_exception.png, 
 table_columnname_exchange_exception.png


 1. Create Table using below script
 CREATE TABLE test_column_name2  ( c1   string) ROW FORMAT DELIMITED STORED AS 
 SEQUENCEFILE;
 2. Table is created successfully.
 Now Create another Table with column name exchange
 by using below script
 CREATE TABLE test_column_name  ( exchange string) ROW FORMAT DELIMITED 
 STORED AS SEQUENCEFILE;
 3. This gives exception for column name exchange, for Hive 0.12
 4. The script properly works for Hive 0.11



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6058) Creating Table with column name exchange gives exception

2013-12-19 Thread Anilkumar Kalshetti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anilkumar Kalshetti updated HIVE-6058:
--

Attachment: table_columnname_exchange_exception.png

Hive 0.12, connected to hiveserver

 Creating Table with column name "exchange" gives exception
 

 Key: HIVE-6058
 URL: https://issues.apache.org/jira/browse/HIVE-6058
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
 Environment: Hadoop 2.0.1-beta, Hive 0.12 , running hiveserver , 
 Installed on CentOS6.4 64-bit OS in Multinode configuration
Reporter: Anilkumar Kalshetti
 Attachments: table_columnname_exchange_exception.png, 
 table_columnname_exchange_exception.png


 1. Create Table using below script
 CREATE TABLE test_column_name2  ( c1   string) ROW FORMAT DELIMITED STORED AS 
 SEQUENCEFILE;
 2. Table is created successfully.
 Now Create another Table with column name exchange
 by using below script
 CREATE TABLE test_column_name  ( exchange string) ROW FORMAT DELIMITED 
 STORED AS SEQUENCEFILE;
 3. This gives exception for column name exchange, for Hive 0.12
 4. The script properly works for Hive 0.11



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6058) Creating Table with column name exchange gives exception

2013-12-19 Thread Anilkumar Kalshetti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anilkumar Kalshetti updated HIVE-6058:
--

Attachment: (was: table_columnname_exchange_exception.png)

 Creating Table with column name "exchange" gives exception
 

 Key: HIVE-6058
 URL: https://issues.apache.org/jira/browse/HIVE-6058
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
 Environment: Hadoop 2.0.1-beta, Hive 0.12 , running hiveserver , 
 Installed on CentOS6.4 64-bit OS in Multinode configuration
Reporter: Anilkumar Kalshetti
 Attachments: table_columnname_exchange_exception.png


 1. Create Table using below script
 CREATE TABLE test_column_name2  ( c1   string) ROW FORMAT DELIMITED STORED AS 
 SEQUENCEFILE;
 2. Table is created successfully.
 Now Create another Table with column name exchange
 by using below script
 CREATE TABLE test_column_name  ( exchange string) ROW FORMAT DELIMITED 
 STORED AS SEQUENCEFILE;
 3. This gives exception for column name exchange, for Hive 0.12
 4. The script properly works for Hive 0.11



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


Hive-trunk-h0.21 - Build # 2514 - Still Failing

2013-12-19 Thread Apache Jenkins Server
Changes for Build #2474
[navis] HIVE-5827 : Incorrect location of logs for failed tests (Vikram Dixit K 
and Szehon Ho via Navis)

[thejas] HIVE-4485 : beeline prints null as empty strings (Thejas Nair reviewed 
by Ashutosh Chauhan)

[brock] HIVE-5704 - A couple of generic UDFs are not in the right 
folder/package (Xuefu Zhang via Brock Noland)

[brock] HIVE-5706 - Move a few numeric UDFs to generic implementations (Xuefu 
Zhang via Brock Noland)

[hashutosh] HIVE-5817 : column name to index mapping in VectorizationContext is 
broken (Remus Rusanu, Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5876 : Split elimination in ORC breaks for partitioned tables 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5886 : [Refactor] Remove unused class JobCloseFeedback 
(Ashutosh Chauhan via Thejas Nair)

[brock] HIVE-5894 - Fix minor PTest2 issues (Brock Noland)


Changes for Build #2475
[brock] HIVE-5755 - Fix hadoop2 execution environment Milestone 1 (Vikram Dixit 
K via Brock Noland)


Changes for Build #2476
[xuefu] HIVE-5893: hive-schema-0.13.0.mysql.sql contains reference to 
nonexistent column (Carl via Xuefu)

[xuefu] HIVE-5684: Serde support for char (Jason via Xuefu)


Changes for Build #2477

Changes for Build #2478

Changes for Build #2479

Changes for Build #2480
[brock] HIVE-5441 - Async query execution doesn't return resultset status 
(Prasad Mujumdar via Thejas M Nair)

[brock] HIVE-5880 - Rename HCatalog HBase Storage Handler artifact id (Brock 
Noland reviewed by Prasad Mujumdar)


Changes for Build #2481

Changes for Build #2482
[ehans] HIVE-5581: Implement vectorized year/month/day... etc. for string 
arguments (Teddy Choi via Eric Hanson)


Changes for Build #2483
[rhbutani] HIVE-5898 Make fetching of column statistics configurable (Prasanth 
Jayachandran via Harish Butani)


Changes for Build #2484
[brock] HIVE-5880 - (Rename HCatalog HBase Storage Handler artifact id) breaks 
packaging (Xuefu Zhang via Brock Noland)


Changes for Build #2485
[xuefu] HIVE-5866: Hive divide operator generates wrong results in certain 
cases (reviewed by Prasad)

[ehans] HIVE-5877: Implement vectorized support for IN as boolean-valued 
expression (Eric Hanson)


Changes for Build #2486
[ehans] HIVE-5895: vectorization handles division by zero differently from 
normal execution (Sergey Shelukhin via Eric Hanson)

[hashutosh] HIVE-5938 : Remove apache.mina dependency for test (Navis via 
Ashutosh Chauhan)

[xuefu] HIVE-5912: Show partition command doesn't support db.table (Yu Zhao via 
Xuefu)

[brock] HIVE-5906 - TestGenericUDFPower should use delta to compare doubles 
(Szehon Ho via Brock Noland)

[brock] HIVE-5855 - Add deprecated methods back to ColumnProjectionUtils (Brock 
Noland reviewed by Navis)

[brock] HIVE-5915 - Shade Kryo dependency (Brock Noland reviewed by Ashutosh 
Chauhan)


Changes for Build #2487
[hashutosh] HIVE-5916 : No need to aggregate statistics collected via counter 
mechanism (Ashutosh Chauhan via Navis)

[xuefu] HIVE-5947: Fix test failure in decimal_udf.q (reviewed by Brock)

[thejas] HIVE-5550 : Import fails for tables created with default text, 
sequence and orc file formats using HCatalog API (Sushanth Sowmyan via Thejas 
Nair)


Changes for Build #2488
[hashutosh] HIVE-5935 : hive.query.string is not provided to FetchTask (Navis 
via Ashutosh Chauhan)

[navis] HIVE-3455 : ANSI CORR(X,Y) is incorrect (Maxim Bolotin via Navis)

[hashutosh] HIVE-5921 : Better heuristics for worst case statistics estimates 
for join, limit and filter operator (Prasanth J via Harish Butani)

[rhbutani] HIVE-5899 NPE during explain extended with char/varchar columns 
(Jason Dere via Harish Butani)


Changes for Build #2489
[xuefu] HIVE-3181: getDatabaseMajor/Minor version does not return values 
(Szehon via Xuefu, reviewed by Navis)

[brock] HIVE-5641 - BeeLineOpts ignores Throwable (Brock Noland reviewed by 
Prasad and Thejas)

[hashutosh] HIVE-5909 : locate and instr throw 
java.nio.BufferUnderflowException when empty string as substring (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-5686 : partition column type validation doesn't quite work for 
dates (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5887 : metastore direct sql doesn't work with oracle (Sergey 
Shelukhin via Ashutosh Chauhan)


Changes for Build #2490

Changes for Build #2491

Changes for Build #2492
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #2493
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes 

[jira] [Commented] (HIVE-5891) Alias conflict when merging multiple mapjoin tasks into their common child mapred task

2013-12-19 Thread Sun Rui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852894#comment-13852894
 ] 

Sun Rui commented on HIVE-5891:
---

[~yhuai] Thanks for your comments:)
It seems there is no quick way to get the correct QB for a join operator. 
However, since a QBJoinTree's id is the same as its QB's id, we can just use the 
QBJoinTree's id as the QB's id. I found similar usages in the Hive code base.

I noticed something wrong in my first patch. parseCtx.getJoinContext() is 
used to map the join operator to its QBJoinTree, but this is not enough. A join 
operator may be a MapJoin or SMBJoin operator instead of a common join 
operator, so parseCtx.getMapJoinContext() and parseCtx.getSmbMapJoinContext() 
should also be checked.

I have another thought. Maybe we can define a method in GenMapRedUtils that 
returns a unique intermediate name on each invocation. Then we wouldn't have to 
use the QB's id.

What's your opinion?
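The unique-name idea mentioned above could look like the following (a hypothetical sketch only — neither the class, the method, nor the naming scheme is the actual GenMapRedUtils code; just the alias $INTNAME is taken from the discussion):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class IntermediateNames {
    private static final AtomicInteger COUNTER = new AtomicInteger();

    // Every invocation yields a distinct alias, so two merged MapJoin
    // tasks can never collide on the same aliasToWork key.
    public static String next() {
        return "$INTNAME-" + COUNTER.getAndIncrement();
    }

    public static void main(String[] args) {
        System.out.println(next());
        System.out.println(next());
    }
}
```

A process-wide counter sidesteps the question of finding the correct QB for a join operator entirely, at the cost of intermediate names that are no longer stable across compilations of the same query.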

 Alias conflict when merging multiple mapjoin tasks into their common child 
 mapred task
 --

 Key: HIVE-5891
 URL: https://issues.apache.org/jira/browse/HIVE-5891
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Sun Rui
Assignee: Sun Rui
 Attachments: HIVE-5891.1.patch


 Use the following test case with HIVE 0.12:
 {code:sql}
 create table src(key int, value string);
 load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
 select * from (
   select c.key from
 (select a.key from src a join src b on a.key=b.key group by a.key) tmp
 join src c on tmp.key=c.key
   union all
   select c.key from
 (select a.key from src a join src b on a.key=b.key group by a.key) tmp
 join src c on tmp.key=c.key
 ) x;
 {code}
 We will get a NullPointerException from Union Operator:
 {noformat}
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing row {_col0:0}
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row {_col0:0}
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:544)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
   ... 4 more
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:120)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:652)
   at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:655)
   at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:220)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
   ... 5 more
 {noformat}
   
 The root cause is in 
 CommonJoinTaskDispatcher.mergeMapJoinTaskIntoItsChildMapRedTask().
 {noformat}
   +--+  +--+
   | MapJoin task |  | MapJoin task |
   +--+  +--+
  \ /
   \   /
  +--+
  |  Union task  |
  +--+
 {noformat} 
 CommonJoinTaskDispatcher merges the two MapJoin tasks into their common 
 child: Union task. The two MapJoin tasks have the same alias name for their 
 big tables: $INTNAME, which is the name of the temporary table of a join 

[jira] [Commented] (HIVE-4216) TestHBaseMinimrCliDriver throws weird error with HBase 0.94.5 and Hadoop 23 and test is stuck infinitely

2013-12-19 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852912#comment-13852912
 ] 

Brock Noland commented on HIVE-4216:


+1

In 0.11 and 0.12 we use ant to build Hive, and the path of the file is: 
shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java

 TestHBaseMinimrCliDriver throws weird error with HBase 0.94.5 and Hadoop 23 
 and test is stuck infinitely
 

 Key: HIVE-4216
 URL: https://issues.apache.org/jira/browse/HIVE-4216
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Affects Versions: 0.9.0, 0.11.0, 0.12.0
 Environment: Hadoop 23.X
Reporter: Viraj Bhat
 Fix For: 0.13.0

 Attachments: HIVE-4216.1.patch


 After upgrading to Hadoop 23 and HBase 0.94.5 compiled for Hadoop 23. The 
 TestHBaseMinimrCliDriver, fails after performing the following steps
 Update hbase_bulk.m with the following properties
 set mapreduce.totalorderpartitioner.naturalorder=false;
 set mapreduce.totalorderpartitioner.path=/tmp/hbpartition.lst;
 Otherwise I keep seeing: _partition.lst not found exception in the mappers, 
 even though set total.order.partitioner.path=/tmp/hbpartition.lst is set.
 When the test runs, the 3 reducer phase of the second query fails with the 
 following error, but the MiniMRCluster keeps spinning up new reducer and the 
 test is stuck infinitely.
 {code}
 insert overwrite table hbsort
  select distinct value,
   case when key=103 then cast(null as string) else key end,
   case when key=103 then ''
else cast(key+1 as string) end
  from src
  cluster by value;
 {code}
 The stack trace I see in the syslog for the Node Manager is the following:
 ==
 13-03-20 16:26:48,942 FATAL [IPC Server handler 17 on 55996] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
 attempt_1363821864968_0003_r_02_0 - exited : java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row (tag=0) 
 {key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0}
 at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268)
 at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:448)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row (tag=0) 
 {key:{reducesinkkey0:val_200},value:{_col0:val_200,_col1:200,_col2:201.0},alias:0}
 at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256)
 ... 7 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:525)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
 at 
 org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
 at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
 at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
 ... 7 more
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.mapreduce.TaskID$CharTaskTypeMaps.getRepresentingCharacter(TaskID.java:265)
 at org.apache.hadoop.mapreduce.TaskID.appendTo(TaskID.java:153)
 at 
 org.apache.hadoop.mapreduce.TaskAttemptID.appendTo(TaskAttemptID.java:119)
 at 
 org.apache.hadoop.mapreduce.TaskAttemptID.toString(TaskAttemptID.java:151)
 at java.lang.String.valueOf(String.java:2826)
 at 
 org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.getTaskAttemptPath(FileOutputCommitter.java:209)
 at 
 org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.init(FileOutputCommitter.java:69)
 at 
 org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.getRecordWriter(HFileOutputFormat.java:90)
 at 
 

[jira] [Created] (HIVE-6059) Add union type support in LazyBinarySerDe

2013-12-19 Thread Chaoyu Tang (JIRA)
Chaoyu Tang created HIVE-6059:
-

 Summary: Add union type support in LazyBinarySerDe
 Key: HIVE-6059
 URL: https://issues.apache.org/jira/browse/HIVE-6059
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Affects Versions: 0.12.0
Reporter: Chaoyu Tang


We need support for the union type in LazyBinarySerDe, which is required by any 
join query with union types among its select values. The reduce values in a Join 
operation are serialized/deserialized using LazyBinarySerDe; otherwise we will 
see errors like:
{code}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardObjectInspector(ObjectInspectorUtils.java:106)
at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardObjectInspector(ObjectInspectorUtils.java:156)
at 
org.apache.hadoop.hive.ql.exec.JoinUtil.getStandardObjectInspectors(JoinUtil.java:98)
at 
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:261)
at 
org.apache.hadoop.hive.ql.exec.JoinOperator.initializeOp(JoinOperator.java:61)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:150)
{code}
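What is missing is essentially tagged-union serialization: write the union's tag first, then the value of the active branch. A minimal standalone sketch of that idea for a union of string and long (hypothetical illustration — this is not the LazyBinarySerDe API or its wire format):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class UnionSerdeSketch {
    // Serialize a union<string, long>: one tag byte, then the active branch.
    static byte[] write(byte tag, Object value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeByte(tag);
        if (tag == 0) {
            out.writeUTF((String) value);
        } else {
            out.writeLong((Long) value);
        }
        out.flush();
        return bos.toByteArray();
    }

    // Deserialize: read the tag byte, then decode the matching branch.
    static Object read(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        byte tag = in.readByte();
        return tag == 0 ? in.readUTF() : in.readLong();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(read(write((byte) 0, "val_200"))); // val_200
        System.out.println(read(write((byte) 1, 200L)));      // 200
    }
}
```

The NPE in the stack trace comes one level earlier: the object-inspector utilities have no standard inspector to return for the union category, so the join operator cannot even be initialized before any bytes are read or written.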




--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-2508) Join on union type fails

2013-12-19 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852944#comment-13852944
 ] 

Chaoyu Tang commented on HIVE-2508:
---

I believe the observed error was due to the lack of union type support in 
LazyBinarySerDe, which is used to deserialize the reduce values in a Join, 
rather than to the join being on a key of union type. So any join query with 
select values of a union type (e.g. SELECT * FROM DEST1 JOIN DEST2 on 
(DEST1.value = DEST2.value)) should fail with the same NPE.


 Join on union type fails
 

 Key: HIVE-2508
 URL: https://issues.apache.org/jira/browse/HIVE-2508
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Ashutosh Chauhan
  Labels: uniontype

 {code}
 hive> CREATE TABLE DEST1(key UNIONTYPE<STRING, STRING>, value BIGINT) STORED 
 AS TEXTFILE;
 OK
 Time taken: 0.076 seconds
 hive> CREATE TABLE DEST2(key UNIONTYPE<STRING, STRING>, value BIGINT) STORED 
 AS TEXTFILE;
 OK
 Time taken: 0.034 seconds
 hive> SELECT * FROM DEST1 JOIN DEST2 on (DEST1.key = DEST2.key);
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-5966) Fix eclipse:eclipse post shim aggregation changes

2013-12-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5966:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Great fix!! Thank you very much for the contribution!

 Fix eclipse:eclipse post shim aggregation changes
 -

 Key: HIVE-5966
 URL: https://issues.apache.org/jira/browse/HIVE-5966
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
 Fix For: 0.13.0

 Attachments: HIVE-5966.1.patch, HIVE-5966.patch


 The shim bundle module marks its deps as provided so users of the bundle won't 
 pull in the child dependencies. This causes the eclipse workspace generated 
 by eclipse:eclipse to fail because it only includes the source from the 
 bundle source directory, which is empty.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2013-12-19 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6013:


Status: Open  (was: Patch Available)

 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
 QuotedIdentifier.html


 Hive's current behavior on Quoted Identifiers is different from the normal 
 interpretation. Quoted Identifier (using backticks) has a special 
 interpretation for Select expressions(as Regular Expressions). Have 
 documented current behavior and proposed a solution in attached doc.
 Summary of solution is:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2013-12-19 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6013:


Attachment: HIVE-6013.7.patch

 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
 QuotedIdentifier.html


 Hive's current behavior on Quoted Identifiers is different from the normal 
 interpretation. Quoted Identifier (using backticks) has a special 
 interpretation for Select expressions(as Regular Expressions). Have 
 documented current behavior and proposed a solution in attached doc.
 Summary of solution is:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2013-12-19 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6013:


Status: Patch Available  (was: Open)

 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
 QuotedIdentifier.html


 Hive's current behavior on Quoted Identifiers is different from the normal 
 interpretation. Quoted Identifier (using backticks) has a special 
 interpretation for Select expressions(as Regular Expressions). Have 
 documented current behavior and proposed a solution in attached doc.
 Summary of solution is:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


Re: Review Request 16299: HIVE-6013: Supporting Quoted Identifiers in Column Names

2013-12-19 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16299/
---

(Updated Dec. 19, 2013, 4:37 p.m.)


Review request for hive, Ashutosh Chauhan and Alan Gates.


Changes
---

Fix test diffs introduced by turning on this feature.


Bugs: HIVE-6013
https://issues.apache.org/jira/browse/HIVE-6013


Repository: hive-git


Description
---

Hive's current behavior on Quoted Identifiers is different from the normal 
interpretation. Quoted Identifier (using backticks) has a special 
interpretation for Select expressions(as Regular Expressions). Have documented 
current behavior and proposed a solution in attached doc.
Summary of solution is:
Introduce 'standard' quoted identifiers for columns only.
At the language level this is turned on by a flag.
At the metadata level we relax the constraint on column names.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java fa3e048 
  itests/qtest/pom.xml 971c5d3 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
5b75ef3 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveUtils.java eb26e7f 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 321759b 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java dbf3f91 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g ed9917d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java 1e6826f 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b9cd65c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/UnparseTranslator.java 8fe2262 
  ql/src/test/queries/clientnegative/ambiguous_col1.q fdf20f8 
  ql/src/test/queries/clientnegative/ambiguous_col2.q de59bc5 
  ql/src/test/queries/clientnegative/invalid_columns.q f8be8c8 
  ql/src/test/queries/clientnegative/regex_col_1.q 8333ddc 
  ql/src/test/queries/clientnegative/regex_col_2.q d1aa1f1 
  ql/src/test/queries/clientnegative/regex_col_groupby.q 5397191 
  ql/src/test/queries/clientpositive/ambiguous_col.q e7053c1 
  ql/src/test/queries/clientpositive/quotedid_alter.q PRE-CREATION 
  ql/src/test/queries/clientpositive/quotedid_basic.q PRE-CREATION 
  ql/src/test/queries/clientpositive/quotedid_partition.q PRE-CREATION 
  ql/src/test/queries/clientpositive/quotedid_skew.q PRE-CREATION 
  ql/src/test/queries/clientpositive/quotedid_smb.q PRE-CREATION 
  ql/src/test/queries/clientpositive/quotedid_tblproperty.q PRE-CREATION 
  ql/src/test/queries/clientpositive/regex_col.q 9cfcee5 
  ql/src/test/queries/clientpositive/show_tablestatus.q 9184d6d 
  ql/src/test/queries/clientpositive/udf_index.q 9079d0e 
  ql/src/test/results/clientnegative/invalid_columns.q.out 3311b0a 
  ql/src/test/results/clientnegative/invalidate_view1.q.out 9f3870e 
  ql/src/test/results/clientnegative/regex_col_1.q.out 2025aee 
  ql/src/test/results/clientnegative/regex_col_2.q.out 171a66f 
  ql/src/test/results/clientnegative/regex_col_groupby.q.out 0730f14 
  ql/src/test/results/clientpositive/auto_join_reordering_values.q.out d61f5d2 
  ql/src/test/results/clientpositive/escape_clusterby1.q.out f1878dc 
  ql/src/test/results/clientpositive/escape_distributeby1.q.out c71a0d8 
  ql/src/test/results/clientpositive/escape_orderby1.q.out 70a0bf8 
  ql/src/test/results/clientpositive/escape_sortby1.q.out 3530622 
  ql/src/test/results/clientpositive/index_bitmap3.q.out 3bda9bd 
  ql/src/test/results/clientpositive/index_bitmap_auto.q.out 2cbbf00 
  ql/src/test/results/clientpositive/quote1.q.out 9ac271d 
  ql/src/test/results/clientpositive/quotedid_alter.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/quotedid_basic.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/quotedid_partition.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/quotedid_skew.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/quotedid_smb.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/quotedid_tblproperty.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/16299/diff/


Testing
---

Added new tests for create, alter, delete, and query with columns containing 
special characters.
The tests start with quotedid.


Thanks,

Harish Butani



Hive-trunk-hadoop2 - Build # 614 - Still Failing

2013-12-19 Thread Apache Jenkins Server
Changes for Build #574
[brock] HIVE-5755 - Fix hadoop2 execution environment Milestone 1 (Vikram Dixit 
K via Brock Noland)


Changes for Build #575
[xuefu] HIVE-5893: hive-schema-0.13.0.mysql.sql contains reference to 
nonexistent column (Carl via Xuefu)

[xuefu] HIVE-5684: Serde support for char (Jason via Xuefu)


Changes for Build #576

Changes for Build #577

Changes for Build #578

Changes for Build #579
[brock] HIVE-5441 - Async query execution doesn't return resultset status 
(Prasad Mujumdar via Thejas M Nair)

[brock] HIVE-5880 - Rename HCatalog HBase Storage Handler artifact id (Brock 
Noland reviewed by Prasad Mujumdar)


Changes for Build #580
[ehans] HIVE-5581: Implement vectorized year/month/day... etc. for string 
arguments (Teddy Choi via Eric Hanson)


Changes for Build #581
[rhbutani] HIVE-5898 Make fetching of column statistics configurable (Prasanth 
Jayachandran via Harish Butani)


Changes for Build #582
[brock] HIVE-5880 - (Rename HCatalog HBase Storage Handler artifact id) breaks 
packaging (Xuefu Zhang via Brock Noland)


Changes for Build #583
[xuefu] HIVE-5866: Hive divide operator generates wrong results in certain 
cases (reviewed by Prasad)

[ehans] HIVE-5877: Implement vectorized support for IN as boolean-valued 
expression (Eric Hanson)


Changes for Build #584
[thejas] HIVE-5550 : Import fails for tables created with default text, 
sequence and orc file formats using HCatalog API (Sushanth Sowmyan via Thejas 
Nair)

[ehans] HIVE-5895: vectorization handles division by zero differently from 
normal execution (Sergey Shelukhin via Eric Hanson)

[hashutosh] HIVE-5938 : Remove apache.mina dependency for test (Navis via 
Ashutosh Chauhan)

[xuefu] HIVE-5912: Show partition command doesn't support db.table (Yu Zhao via 
Xuefu)

[brock] HIVE-5906 - TestGenericUDFPower should use delta to compare doubles 
(Szehon Ho via Brock Noland)

[brock] HIVE-5855 - Add deprecated methods back to ColumnProjectionUtils (Brock 
Noland reviewed by Navis)

[brock] HIVE-5915 - Shade Kryo dependency (Brock Noland reviewed by Ashutosh 
Chauhan)


Changes for Build #585
[hashutosh] HIVE-5916 : No need to aggregate statistics collected via counter 
mechanism (Ashutosh Chauhan via Navis)

[xuefu] HIVE-5947: Fix test failure in decimal_udf.q (reviewed by Brock)


Changes for Build #586
[hashutosh] HIVE-5935 : hive.query.string is not provided to FetchTask (Navis 
via Ashutosh Chauhan)

[navis] HIVE-3455 : ANSI CORR(X,Y) is incorrect (Maxim Bolotin via Navis)

[hashutosh] HIVE-5921 : Better heuristics for worst case statistics estimates 
for join, limit and filter operator (Prasanth J via Harish Butani)

[rhbutani] HIVE-5899 NPE during explain extended with char/varchar columns 
(Jason Dere via Harish Butani)


Changes for Build #587
[xuefu] HIVE-3181: getDatabaseMajor/Minor version does not return values 
(Szehon via Xuefu, reviewed by Navis)

[brock] HIVE-5641 - BeeLineOpts ignores Throwable (Brock Noland reviewed by 
Prasad and Thejas)

[hashutosh] HIVE-5909 : locate and instr throw 
java.nio.BufferUnderflowException when empty string as substring (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-5686 : partition column type validation doesn't quite work for 
dates (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5887 : metastore direct sql doesn't work with oracle (Sergey 
Shelukhin via Ashutosh Chauhan)


Changes for Build #588

Changes for Build #589

Changes for Build #590
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #591
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes for Build #592
[hashutosh] HIVE-5982 : Remove redundant filesystem operations and methods in 
FileSink (Ashutosh Chauhan via Thejas Nair)

[navis] HIVE-5955 : decimal_precision.q test case fails in trunk (Prasanth J 
via Navis)

[brock] HIVE-5983 - Fix name of ColumnProjectionUtils.appendReadColumnIDs 
(Brock Noland reviewed by Navis)


Changes for Build #593
[omalley] HIVE-5580. Predicate pushdown predicates with an and-operator between 
non-SARGable predicates cause a NPE. (omalley)


Changes for Build #594
[gunther] HIVE-6000: Hive build broken on hadoop2 (Vikram Dixit K via Gunther 
Hagleitner)

[gunther] HIVE-2093: UPDATE - add two missing files from previous commit 
(Gunther Hagleitner)

[thejas] HIVE-2093 : create/drop database should populate inputs/outputs and 
check concurrency and user permission (Navis via Thejas Nair)

[hashutosh] HIVE-6016 : Hadoop23Shims has a bug in listLocatedStatus impl. 

Hive-trunk-h0.21 - Build # 2515 - Still Failing

2013-12-19 Thread Apache Jenkins Server
Changes for Build #2475
[brock] HIVE-5755 - Fix hadoop2 execution environment Milestone 1 (Vikram Dixit 
K via Brock Noland)


Changes for Build #2476
[xuefu] HIVE-5893: hive-schema-0.13.0.mysql.sql contains reference to 
nonexistent column (Carl via Xuefu)

[xuefu] HIVE-5684: Serde support for char (Jason via Xuefu)


Changes for Build #2477

Changes for Build #2478

Changes for Build #2479

Changes for Build #2480
[brock] HIVE-5441 - Async query execution doesn't return resultset status 
(Prasad Mujumdar via Thejas M Nair)

[brock] HIVE-5880 - Rename HCatalog HBase Storage Handler artifact id (Brock 
Noland reviewed by Prasad Mujumdar)


Changes for Build #2481

Changes for Build #2482
[ehans] HIVE-5581: Implement vectorized year/month/day... etc. for string 
arguments (Teddy Choi via Eric Hanson)


Changes for Build #2483
[rhbutani] HIVE-5898 Make fetching of column statistics configurable (Prasanth 
Jayachandran via Harish Butani)


Changes for Build #2484
[brock] HIVE-5880 - (Rename HCatalog HBase Storage Handler artifact id) breaks 
packaging (Xuefu Zhang via Brock Noland)


Changes for Build #2485
[xuefu] HIVE-5866: Hive divide operator generates wrong results in certain 
cases (reviewed by Prasad)

[ehans] HIVE-5877: Implement vectorized support for IN as boolean-valued 
expression (Eric Hanson)


Changes for Build #2486
[ehans] HIVE-5895: vectorization handles division by zero differently from 
normal execution (Sergey Shelukhin via Eric Hanson)

[hashutosh] HIVE-5938 : Remove apache.mina dependency for test (Navis via 
Ashutosh Chauhan)

[xuefu] HIVE-5912: Show partition command doesn't support db.table (Yu Zhao via 
Xuefu)

[brock] HIVE-5906 - TestGenericUDFPower should use delta to compare doubles 
(Szehon Ho via Brock Noland)

[brock] HIVE-5855 - Add deprecated methods back to ColumnProjectionUtils (Brock 
Noland reviewed by Navis)

[brock] HIVE-5915 - Shade Kryo dependency (Brock Noland reviewed by Ashutosh 
Chauhan)


Changes for Build #2487
[hashutosh] HIVE-5916 : No need to aggregate statistics collected via counter 
mechanism (Ashutosh Chauhan via Navis)

[xuefu] HIVE-5947: Fix test failure in decimal_udf.q (reviewed by Brock)

[thejas] HIVE-5550 : Import fails for tables created with default text, 
sequence and orc file formats using HCatalog API (Sushanth Sowmyan via Thejas 
Nair)


Changes for Build #2488
[hashutosh] HIVE-5935 : hive.query.string is not provided to FetchTask (Navis 
via Ashutosh Chauhan)

[navis] HIVE-3455 : ANSI CORR(X,Y) is incorrect (Maxim Bolotin via Navis)

[hashutosh] HIVE-5921 : Better heuristics for worst case statistics estimates 
for join, limit and filter operator (Prasanth J via Harish Butani)

[rhbutani] HIVE-5899 NPE during explain extended with char/varchar columns 
(Jason Dere via Harish Butani)


Changes for Build #2489
[xuefu] HIVE-3181: getDatabaseMajor/Minor version does not return values 
(Szehon via Xuefu, reviewed by Navis)

[brock] HIVE-5641 - BeeLineOpts ignores Throwable (Brock Noland reviewed by 
Prasad and Thejas)

[hashutosh] HIVE-5909 : locate and instr throw 
java.nio.BufferUnderflowException when empty string as substring (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-5686 : partition column type validation doesn't quite work for 
dates (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5887 : metastore direct sql doesn't work with oracle (Sergey 
Shelukhin via Ashutosh Chauhan)


Changes for Build #2490

Changes for Build #2491

Changes for Build #2492
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #2493
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : Remove dummy new line at the end of non-sql commands 
(Navis via Ashutosh Chauhan)


Changes for Build #2494
[hashutosh] HIVE-5982 : Remove redundant filesystem operations and methods in 
FileSink (Ashutosh Chauhan via Thejas Nair)

[navis] HIVE-5955 : decimal_precision.q test case fails in trunk (Prasanth J 
via Navis)

[brock] HIVE-5983 - Fix name of ColumnProjectionUtils.appendReadColumnIDs 
(Brock Noland reviewed by Navis)


Changes for Build #2495
[omalley] HIVE-5580. Predicate pushdown predicates with an and-operator between 
non-SARGable predicates cause a NPE. (omalley)


Changes for Build #2496
[gunther] HIVE-6000: Hive build broken on hadoop2 (Vikram Dixit K via Gunther 
Hagleitner)

[gunther] HIVE-2093: UPDATE - add two missing files from previous commit 
(Gunther Hagleitner)

[thejas] HIVE-2093 : create/drop database should populate inputs/outputs and 
check concurrency and user permission (Navis via Thejas Nair)

[hashutosh] HIVE-6016 : 

[jira] [Created] (HIVE-6060) Define API for RecordUpdater and UpdateReader

2013-12-19 Thread Owen O'Malley (JIRA)
Owen O'Malley created HIVE-6060:
---

 Summary: Define API for RecordUpdater and UpdateReader
 Key: HIVE-6060
 URL: https://issues.apache.org/jira/browse/HIVE-6060
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley


We need to define some new APIs for how Hive interacts with the file formats 
since it needs to be much richer than the current RecordReader and RecordWriter.
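The issue only names the RecordUpdater; as a rough, hypothetical sketch of what a writer-side API richer than RecordWriter could look like (the method set, signatures, and the toy in-memory implementation below are illustrative assumptions, not the committed design):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: a writer API that supports row-level mutations,
// not just appends. All names beyond "RecordUpdater" are assumptions.
interface RecordUpdater {
    void insert(long txn, Object row);          // append a new row
    void update(long txn, long rowId, Object row); // replace an existing row
    void delete(long txn, long rowId);          // mark a row deleted
    void flush();                               // make buffered mutations visible
    void close(boolean abort);                  // abort discards pending work
}

// Toy in-memory implementation used purely to illustrate the contract.
class InMemoryUpdater implements RecordUpdater {
    final Map<Long, Object> rows = new HashMap<>();
    private long nextId = 0;

    public void insert(long txn, Object row) { rows.put(nextId++, row); }
    public void update(long txn, long rowId, Object row) { rows.put(rowId, row); }
    public void delete(long txn, long rowId) { rows.remove(rowId); }
    public void flush() { /* no buffering in this toy version */ }
    public void close(boolean abort) { if (abort) rows.clear(); }
}
```

The point of the sketch is that update and delete need a stable row identity plus a transaction id, which plain RecordWriter has no place for.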



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6060) Define API for RecordUpdater and UpdateReader

2013-12-19 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6060:


Attachment: h-5317.patch

Here's a first cut of what the RecordUpdater looks like.

 Define API for RecordUpdater and UpdateReader
 -

 Key: HIVE-6060
 URL: https://issues.apache.org/jira/browse/HIVE-6060
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: h-5317.patch


 We need to define some new APIs for how Hive interacts with the file formats 
 since it needs to be much richer than the current RecordReader and 
 RecordWriter.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2013-12-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853060#comment-13853060
 ] 

Ashutosh Chauhan commented on HIVE-6013:


+1

 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
 QuotedIdentifier.html


 Hive's current behavior on Quoted Identifiers is different from the normal 
 interpretation. Quoted Identifier (using backticks) has a special 
 interpretation for Select expressions(as Regular Expressions). Have 
 documented current behavior and proposed a solution in attached doc.
 Summary of solution is:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6041) Incorrect task dependency graph for skewed join optimization

2013-12-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6041:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Incorrect task dependency graph for skewed join optimization
 

 Key: HIVE-6041
 URL: https://issues.apache.org/jira/browse/HIVE-6041
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0
 Environment: Hadoop 1.0.3
Reporter: Adrian Popescu
Assignee: Navis
Priority: Critical
 Fix For: 0.13.0

 Attachments: HIVE-6041.1.patch.txt


 The dependency graph among task stages is incorrect for the skewed join 
 optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. 
 For the case that skewed keys do not exist, all the tasks following the 
 common join are filtered out at runtime.
 In particular, the conditional task in the optimized plan maintains no 
 dependency with the child tasks of the common join task in the original plan. 
 The conditional task is composed of the map join task which maintains all 
 these dependencies, but for the case the map join task is filtered out (i.e., 
 no skewed keys exist), all these dependencies are lost. Hence, all the other 
 task stages of the query (e.g., move stage which writes down the results into 
 the result table) are skipped.
 The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
 processSkewJoin() function, immediately after the ConditionalTask is created 
 and its dependencies are set.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5951) improve performance of adding partitions from client

2013-12-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853070#comment-13853070
 ] 

Sergey Shelukhin commented on HIVE-5951:


New parameters are needed for the API, and they cannot be added in a 
backward-compatible manner.
Overall, it's a good idea to have APIs follow a request-response pattern rather 
than take separate parameters, because of this issue... let me file a 
brainstorming jira for that.

 improve performance of adding partitions from client
 

 Key: HIVE-5951
 URL: https://issues.apache.org/jira/browse/HIVE-5951
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5951.01.patch, HIVE-5951.02.patch, 
 HIVE-5951.03.patch, HIVE-5951.nogen.patch, HIVE-5951.nogen.patch, 
 HIVE-5951.nogen.patch, HIVE-5951.nogen.patch, HIVE-5951.patch


 Adding partitions to the metastore is currently very inefficient. There are 
 small things like, for the !ifNotExists case, DDLSemanticAnalyzer gets the full 
 partition object for every spec (which is a network call to the metastore) and 
 then discards it instantly; there's also a general problem that too much 
 processing is done on the client side. DDLSA should analyze the query and make 
 one call to the metastore (or maybe a set of batched calls if there are too 
 many partitions in the command); the metastore should then figure out the rest 
 and insert in batch.
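The batching idea can be sketched generically. The client and method names below are hypothetical placeholders, not the actual metastore API; the sketch only shows bounding the number of round-trips:

```java
import java.util.List;

// Hypothetical sketch: instead of one metastore round-trip per partition,
// collect the specs client-side and send them in bounded batches.
class BatchingClient {
    static final int MAX_BATCH = 1000; // assumed cap on a single RPC

    int rpcCount = 0; // counts simulated metastore calls, for illustration

    // Simulated bulk call; a real client would issue one Thrift RPC here.
    void addPartitions(List<String> specs) {
        rpcCount++;
    }

    // Splits a large partition list into at most ceil(n / MAX_BATCH) calls.
    void addAllPartitions(List<String> specs) {
        for (int i = 0; i < specs.size(); i += MAX_BATCH) {
            addPartitions(specs.subList(i, Math.min(i + MAX_BATCH, specs.size())));
        }
    }
}
```

With 2,500 partition specs this issues three bulk calls instead of 2,500 per-partition round-trips.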



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HIVE-6061) Metastore (and other) Thrift APIs should use request-response pattern

2013-12-19 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-6061:
--

 Summary: Metastore (and other) Thrift APIs should use 
request-response pattern
 Key: HIVE-6061
 URL: https://issues.apache.org/jira/browse/HIVE-6061
 Project: Hive
  Issue Type: Wish
  Components: Metastore, Thrift API
Reporter: Sergey Shelukhin


Wish in lieu of brainstorming JIRA.

Metastore Thrift APIs currently use normal method signatures (e.g. int 
foo(string bar, double baz)); this is problematic in Thrift because the APIs 
cannot be evolved without breaking compat; and weird names have to be invented 
because overloading is not supported either.
An easy solution to this is to have methods in the form of FooResponse 
foo(FooRequest req); the structures can then be evolved easily.
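The evolution benefit can be illustrated outside of Thrift too. A minimal sketch, with all names hypothetical: adding an optional field to the request struct leaves the method signature, and therefore every existing caller, untouched.

```java
// Hypothetical sketch of the request-response pattern.
class FooRequest {
    String bar;
    double baz;
    // Field added in a later release: optional, so older callers simply
    // leave it null and the server applies a default.
    Integer newOption;
}

class FooResponse {
    int code;
}

class FooService {
    // The signature never changes as FooRequest evolves.
    FooResponse foo(FooRequest req) {
        FooResponse r = new FooResponse();
        r.code = (req.newOption == null) ? 0 : req.newOption;
        return r;
    }
}
```

With separate parameters, the same change would have required a new overload — which Thrift does not support — or a new method name.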

This may apply also to other Thrift APIs, I have not checked.

This is a brainstorming JIRA for the transformations. Obviously this will 
either double the API size, or cause massive backward incompatibility. 
Maybe we can do 1-2 releases with both APIs, marking the old ones deprecated in 
some form, and then remove them?



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5951) improve performance of adding partitions from client

2013-12-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853074#comment-13853074
 ] 

Sergey Shelukhin commented on HIVE-5951:


Filed HIVE-6061

 improve performance of adding partitions from client
 

 Key: HIVE-5951
 URL: https://issues.apache.org/jira/browse/HIVE-5951
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-5951.01.patch, HIVE-5951.02.patch, 
 HIVE-5951.03.patch, HIVE-5951.nogen.patch, HIVE-5951.nogen.patch, 
 HIVE-5951.nogen.patch, HIVE-5951.nogen.patch, HIVE-5951.patch


 Adding partitions to the metastore is currently very inefficient. There are 
 small things like, for the !ifNotExists case, DDLSemanticAnalyzer gets the full 
 partition object for every spec (which is a network call to the metastore) and 
 then discards it instantly; there's also a general problem that too much 
 processing is done on the client side. DDLSA should analyze the query and make 
 one call to the metastore (or maybe a set of batched calls if there are too 
 many partitions in the command); the metastore should then figure out the rest 
 and insert in batch.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


Re: Review Request 16339: HIVE-6052 metastore JDO filter pushdown for integers may produce unexpected results with non-normalized integer columns

2013-12-19 Thread Sergey Shelukhin


 On Dec. 19, 2013, 11:43 a.m., Ashutosh Chauhan wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 345
  https://reviews.apache.org/r/16339/diff/2/?file=399690#file399690line345
 
  This should be on by default. After HIVE-5297 storing column values in 
  non-canonical forms should not happen.

As far as I can see, the jira doesn't do normalization, just verification. On 
trunk the issue still happens.


 On Dec. 19, 2013, 11:43 a.m., Ashutosh Chauhan wrote:
  ql/src/test/queries/clientpositive/alter_partition_coltype.q, line 1
  https://reviews.apache.org/r/16339/diff/2/?file=399694#file399694line1
 
  This should work even with config set to true. No?

yeah, forgot to remove


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16339/#review30694
---


On Dec. 19, 2013, 2:17 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/16339/
 ---
 
 (Updated Dec. 19, 2013, 2:17 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java fa3e048 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
 a98d9d1 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 04d399f 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
  93e9942 
   ql/src/test/queries/clientpositive/alter_partition_coltype.q 5479afb 
   ql/src/test/queries/clientpositive/annotate_stats_part.q 83510e3 
   ql/src/test/queries/clientpositive/dynamic_partition_skip_default.q 397a220 
   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 27b1fbc 
   ql/src/test/results/clientpositive/annotate_stats_part.q.out 87fb980 
   ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 
 baee525 
 
 Diff: https://reviews.apache.org/r/16339/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Updated] (HIVE-6006) Add UDF to calculate distance between geographic coordinates

2013-12-19 Thread Kostiantyn Kudriavtsev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostiantyn Kudriavtsev updated HIVE-6006:
-

Attachment: hive-6006.patch

 Add UDF to calculate distance between geographic coordinates
 

 Key: HIVE-6006
 URL: https://issues.apache.org/jira/browse/HIVE-6006
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.13.0
Reporter: Kostiantyn Kudriavtsev
Priority: Minor
 Fix For: 0.13.0

 Attachments: hive-6006.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 It would be nice to have a Hive UDF to calculate the distance between two 
 points on Earth. The haversine formula seems to be good enough for this purpose.
 The following function is proposed:
 HaversineDistance(lat1, lon1, lat2, lon2) - calculates the haversine distance 
 between 2 points with coordinates (lat1, lon1) and (lat2, lon2)
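The haversine computation itself is short. Below is a standalone Java sketch of the core calculation only — not the attached patch's code, and the Hive UDF wiring is omitted; a mean Earth radius of 6371 km is assumed:

```java
// Haversine great-circle distance between two (lat, lon) points in degrees.
// Sketch of the math only; not taken from the attached patch.
public class Haversine {
    static final double EARTH_RADIUS_KM = 6371.0; // assumed mean radius

    public static double distanceKm(double lat1, double lon1,
                                    double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        // haversine of the central angle between the two points
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }
}
```

For example, Paris (48.8566, 2.3522) to London (51.5074, -0.1278) comes out to roughly 344 km.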



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6006) Add UDF to calculate distance between geographic coordinates

2013-12-19 Thread Kostiantyn Kudriavtsev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostiantyn Kudriavtsev updated HIVE-6006:
-

Status: Patch Available  (was: Open)

 Add UDF to calculate distance between geographic coordinates
 

 Key: HIVE-6006
 URL: https://issues.apache.org/jira/browse/HIVE-6006
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.13.0
Reporter: Kostiantyn Kudriavtsev
Priority: Minor
 Fix For: 0.13.0

 Attachments: hive-6006.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 It would be nice to have a Hive UDF to calculate the distance between two 
 points on Earth. The haversine formula seems to be good enough for this purpose.
 The following function is proposed:
 HaversineDistance(lat1, lon1, lat2, lon2) - calculates the haversine distance 
 between 2 points with coordinates (lat1, lon1) and (lat2, lon2)



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-6058) Creating Table with column name exchange gives exception

2013-12-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13853082#comment-13853082
 ] 

Xuefu Zhang commented on HIVE-6058:
---

exchange became a reserved keyword in Hive, so the error is expected.

 Creating Table with column name exchange gives exception
 

 Key: HIVE-6058
 URL: https://issues.apache.org/jira/browse/HIVE-6058
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
 Environment: Hadoop 2.0.1-beta, Hive 0.12 , running hiveserver , 
 Installed on CentOS6.4 64-bit OS in Multinode configuration
Reporter: Anilkumar Kalshetti
 Attachments: table_columnname_exchange_exception.png


 1. Create a table using the script below:
 CREATE TABLE test_column_name2  ( c1   string) ROW FORMAT DELIMITED STORED AS 
 SEQUENCEFILE;
 2. The table is created successfully.
 Now create another table with the column name exchange
 using the script below:
 CREATE TABLE test_column_name  ( exchange string) ROW FORMAT DELIMITED 
 STORED AS SEQUENCEFILE;
 3. This throws an exception for the column name exchange on Hive 0.12.
 4. The script works properly on Hive 0.11.





[jira] [Commented] (HIVE-3299) UDF DAYNAME(date) to HIVE

2013-12-19 Thread Kostiantyn Kudriavtsev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13853087#comment-13853087
 ] 

Kostiantyn Kudriavtsev commented on HIVE-3299:
--

Please take into account that this is a very specific use case. It makes sense to 
create as general a function as possible to cover a wide range of use cases. A few 
days ago I created ticket HIVE-6046 and am currently working on it.
I'm pretty sure HIVE-6046 makes it possible to get the day name from a date as 
described in the current ticket 

 UDF  DAYNAME(date) to HIVE 
 ---

 Key: HIVE-3299
 URL: https://issues.apache.org/jira/browse/HIVE-3299
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.9.0
Reporter: Namitha Babychan
Assignee: Carl Steinbach
  Labels: patch
 Attachments: HIVE-3299.1.patch.txt, HIVE-3299.patch.txt, 
 Hive-3299_Testcase.doc, udf_dayname.q, udf_dayname.q.out


 Current releases of Hive lack a function that would return the day name 
 corresponding to a date / timestamp value, which might be part of a column.
 The function DAYNAME(date) would return the day name from a date / 
 timestamp or column, which would be useful in HiveQL. This would find 
 its use in various business sectors like retail, where it would help in 
 identifying trends and sales details for a particular weekday across an entire 
 year, month, or week.
 Functionality :-
 Function Name: DAYNAME (date)

 Returns the name of the weekday for date. 
 Example: hive> SELECT DAYNAME('2012-07-25');
- 'Wednesday'
 Usage :-
 Case 1 : To find DAY NAME corresponding to a particular date 
 hive> SELECT DAYNAME('2012-07-25');
- 'Wednesday'
 Case 2 : To query a table to find details based on a particular day name
 Table :-
 date  |item id|store id|value|unit|price
 01/07/2012|110001|00003|0.99|1.00|0.99
 02/07/2012|110001|00008|0.99|0.00|0.00
 03/07/2012|110001|00009|0.99|0.00|0.00
 04/07/2012|110001|001112002|0.99|0.00|0.00
 05/07/2012|110001|001112003|0.99|0.00|0.00
 06/07/2012|110001|001112006|0.99|1.00|0.99
 07/07/2012|110001|001112007|0.99|0.00|0.00
 08/07/2012|110001|001112008|0.99|0.00|0.00
 09/07/2012|110001|001112009|0.99|0.00|0.00
 10/07/2012|110001|001112010|0.99|0.00|0.00
 11/07/2012|110001|001113003|0.99|0.00|0.00
 12/07/2012|110001|001113006|0.99|0.00|0.00
 13/07/2012|110001|001113008|0.99|0.00|0.00
 14/07/2012|110001|001113010|0.99|0.00|0.00
 15/07/2012|110001|001114002|0.99|0.00|0.00
 16/07/2012|110001|001114004|0.99|1.00|0.99
 17/07/2012|110001|001114005|0.99|0.00|0.00
 18/07/2012|110001|001121004|0.99|0.00|0.00
 Query : select * from sales where dayname(date)='wednesday';
 Result :-
 04/07/2012|110001|001112002|0.99|0.00|0.00
 11/07/2012|110001|001113003|0.99|0.00|0.00
 18/07/2012|110001|001121004|0.99|0.00|0.00
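The proposed behavior is easy to sketch outside Hive (a plain-Python illustration, not the patch's Java implementation; the helper name and the format parameter are hypothetical):

```python
from datetime import datetime

def dayname(date_str, fmt="%Y-%m-%d"):
    """Return the weekday name for a date string, e.g. 'Wednesday'."""
    return datetime.strptime(date_str, fmt).strftime("%A")
```

With a dd/mm/yyyy column like the sample table above, the same helper applies with fmt="%d/%m/%Y", which is how a filter such as dayname(date)='Wednesday' would select the 04/07, 11/07, and 18/07 rows.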





Re: Review Request 16339: HIVE-6052 metastore JDO filter pushdown for integers may produce unexpected results with non-normalized integer columns

2013-12-19 Thread Ashutosh Chauhan


 On Dec. 19, 2013, 11:43 a.m., Ashutosh Chauhan wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 345
  https://reviews.apache.org/r/16339/diff/2/?file=399690#file399690line345
 
  This should be on by default. After HIVE-5297 storing column values in 
  non-canonical forms should not happen.
 
 Sergey Shelukhin wrote:
 As far as I see, the jira doesn't do normalization, just verification. On 
 trunk the issue still happens

So will it result in an incorrect result or an exception? If it is an exception, I 
would suggest having the config on by default. If it silently results in incorrect 
results, then off by default is better.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16339/#review30694
---


On Dec. 19, 2013, 2:17 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/16339/
 ---
 
 (Updated Dec. 19, 2013, 2:17 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java fa3e048 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
 a98d9d1 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 04d399f 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
  93e9942 
   ql/src/test/queries/clientpositive/alter_partition_coltype.q 5479afb 
   ql/src/test/queries/clientpositive/annotate_stats_part.q 83510e3 
   ql/src/test/queries/clientpositive/dynamic_partition_skip_default.q 397a220 
   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 27b1fbc 
   ql/src/test/results/clientpositive/annotate_stats_part.q.out 87fb980 
   ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 
 baee525 
 
 Diff: https://reviews.apache.org/r/16339/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Commented] (HIVE-2361) Add some UDFs which help to migrate Oracle to Hive

2013-12-19 Thread Kostiantyn Kudriavtsev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13853089#comment-13853089
 ] 

Kostiantyn Kudriavtsev commented on HIVE-2361:
--

It seems useful, especially the scalar functions... is this code available 
somewhere?

  Add some UDFs which help to migrate Oracle to Hive
 ---

 Key: HIVE-2361
 URL: https://issues.apache.org/jira/browse/HIVE-2361
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.8.0
Reporter: JunHo Cho
Assignee: JunHo Cho
Priority: Minor
  Labels: features
 Attachments: nexr-udf.tar


 Here are some UDFs that can be matched to Oracle functions:
 There are two kinds of Oracle functions: scalar functions and analytic 
 functions.
 Most scalar functions in Oracle can be converted to Hive UDFs directly.
 Oracle Scalar Function
 GenericUDFDecode : Compares first argument to each other value one by one. 
 e.g., DECODE(x,0,'zero',1,'one') will return 'zero' if x is 0
 GenericUDFGreatest : Return the greatest of the list of one or more 
 expressions. e.g., GREATEST(2,5,12,3) will return 12
 GenericUDFInstr : Return the location of a substring in a string. e.g., 
 INSTR('next', 'e') will return 2
 GenericUDFLnnvl : Evaluate a condition when one or both operands of the 
 condition may be null. e.g., LNNVL(2  4) will return true
 GenericUDFNVL : Replace null with a string in the results of a query. e.g., 
 NVL(null,'hive') will return hive
 GenericUDFNVL2 : Determine the value returned by a query based on whether a 
 specified expression is null or not null. e.g., NVL2(null,'not null','null 
 value') will return 'null value'
 GenericUDFToNumber : Convert a string to a number. e.g., 
 TO_NUMBER('112','999') will return 112
 GenericUDFTrunc : Returns a date truncated to a specific unit of measure. 
 e.g., TRUNC('2002-11-02 01:01:01','') will return '2002-01-01 00:00:00'
 Oracle Analytic Function
 Most analytic functions in Oracle can't be converted to Hive queries and UDFs 
 directly.
 The following UDFs should be used with Hive's DISTRIBUTE BY, SORT BY, and hash 
 to support analytic functions, 
 e.g., SELECT _FUNC_(hash(col1), col2, ...) FROM SELECT ~ FROM table 
 DISTRIBUTE BY hash(col1) SORT BY col1, col2 ...
 GenericUDFSum : Calculate a cumulative sum.
 GenericUDFRank : Assign a sequential order, or rank within some group based 
 on key.
 GenericUDFDenseRank : Act like RANK function except that it assigns 
 consecutive ranks.
 GenericUDFRowNumber : Return sequence integer value within some group based 
 on key.
 GenericUDFMax : Determine the highest value within some group based on key.
 GenericUDFMin : Determine the lowest value within some group based on key.
 GenericUDFLag : Access data from a previous row.
 These UDFs were developed with the hive-pdk
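For a feel of the scalar semantics, here is a plain-Python sketch of three of the functions above (illustrative only; the attachment implements these as Hive GenericUDFs in Java, and these helper signatures are assumptions):

```python
def decode(expr, *pairs_and_default):
    """Oracle-style DECODE: compare expr to search values pairwise;
    return the matching result, else the trailing default (or None)."""
    args = list(pairs_and_default)
    default = args.pop() if len(args) % 2 == 1 else None
    for search, result in zip(args[0::2], args[1::2]):
        if expr == search:
            return result
    return default

def nvl(value, replacement):
    """Oracle-style NVL: replace a null value with a fallback."""
    return replacement if value is None else value

def instr(string, substring):
    """Oracle-style INSTR: 1-based position of substring, 0 if absent."""
    return string.find(substring) + 1
```

For example, decode(0, 0, 'zero', 1, 'one') returns 'zero', nvl(None, 'hive') returns 'hive', and instr('next', 'e') returns 2, matching the descriptions above.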





Re: Review Request 16339: HIVE-6052 metastore JDO filter pushdown for integers may produce unexpected results with non-normalized integer columns

2013-12-19 Thread Sergey Shelukhin


 On Dec. 19, 2013, 11:43 a.m., Ashutosh Chauhan wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 345
  https://reviews.apache.org/r/16339/diff/2/?file=399690#file399690line345
 
  This should be on by default. After HIVE-5297 storing column values in 
  non-canonical forms should not happen.
 
 Sergey Shelukhin wrote:
 As far as I see, the jira doesn't do normalization, just verification. On 
 trunk the issue still happens
 
 Ashutosh Chauhan wrote:
 So will it result in an incorrect result or an exception? If it is an exception, 
 I would suggest having the config on by default. If it silently results in 
 incorrect results, then off by default is better.

Incorrect result, as in the original bugs. Also, even the verification doesn't 
cover all cases of adding partitions.
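The failure mode under discussion can be sketched with a toy example (an assumption about the behavior based on this thread, not code from the patch): partition values live in the metastore as strings, so an integer predicate pushed down as a string comparison misbehaves on non-normalized values such as "01".

```python
# Partition values are stored as strings in the metastore.  Pushing an
# integer predicate down as a string comparison gives wrong answers for
# non-normalized values like "01", or for values of differing lengths.
values = ["1", "01", "10"]

# Lexicographic order disagrees with numeric order:
assert sorted(values) == ["01", "1", "10"]
assert sorted(values, key=int) == ["1", "01", "10"]

# A pushed-down string comparison ("10" > "2") contradicts 10 > 2:
assert ("10" > "2") is False
assert (10 > 2) is True
```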


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16339/#review30694
---


On Dec. 19, 2013, 2:17 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/16339/
 ---
 
 (Updated Dec. 19, 2013, 2:17 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java fa3e048 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
 a98d9d1 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 04d399f 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
  93e9942 
   ql/src/test/queries/clientpositive/alter_partition_coltype.q 5479afb 
   ql/src/test/queries/clientpositive/annotate_stats_part.q 83510e3 
   ql/src/test/queries/clientpositive/dynamic_partition_skip_default.q 397a220 
   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 27b1fbc 
   ql/src/test/results/clientpositive/annotate_stats_part.q.out 87fb980 
   ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 
 baee525 
 
 Diff: https://reviews.apache.org/r/16339/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Commented] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2013-12-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13853252#comment-13853252
 ] 

Hive QA commented on HIVE-6013:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619598/HIVE-6013.7.patch

{color:green}SUCCESS:{color} +1 4799 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/703/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/703/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619598

 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
 QuotedIdentifier.html


 Hive's current behavior on quoted identifiers is different from the normal 
 interpretation. A quoted identifier (using backticks) has a special 
 interpretation in select expressions (as a regular expression). Current 
 behavior is documented and a solution proposed in the attached doc.
 Summary of the solution:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.





[jira] [Created] (HIVE-6064) Wincompat: windows path substitutions overridden by MiniMrShim.getConfiguration() on hadoop-2

2013-12-19 Thread Jason Dere (JIRA)
Jason Dere created HIVE-6064:


 Summary: Wincompat: windows path substitutions overridden by 
MiniMrShim.getConfiguration() on hadoop-2
 Key: HIVE-6064
 URL: https://issues.apache.org/jira/browse/HIVE-6064
 Project: Hive
  Issue Type: Sub-task
  Components: Tests, Windows
Reporter: Jason Dere


On Windows, HiveConf setting hive.exec.scratchdir is changed to remove the 
drive letter (i.e. c:) from the start of its path. However, in HadoopShims23, 
MiniMrShim.setupConfiguration() subsequently overwrites the HiveConf settings 
and the drive letter is added back to hive.exec.scratchdir, causing path issues 
in the MiniMR tests.






Review Request 16403: HIVE-5176: Wincompat : Changes for allowing various path compatibilities with Windows

2013-12-19 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16403/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-5176
https://issues.apache.org/jira/browse/HIVE-5176


Repository: hive-git


Description
---

We need to make certain changes across the board to allow us to read/parse 
Windows paths. Some are escaping changes; some make us stricter about how we 
read paths (through URL encode/decode, etc.).


Diffs
-

  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java f08a8b6 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java fa3e048 
  common/src/test/org/apache/hadoop/hive/conf/TestHiveConf.java a31238b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java 38d97e3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 5cb492f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 9afc80b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java d0be73e 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b9cd65c 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 0684aac 
  ql/src/test/org/apache/hadoop/hive/ql/WindowsPathUtil.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java d4ad931 
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java 
69d1896 

Diff: https://reviews.apache.org/r/16403/diff/


Testing
---


Thanks,

Jason Dere



[jira] [Updated] (HIVE-6070) document HIVE-6052

2013-12-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6070:
---

Status: Patch Available  (was: Open)

 document HIVE-6052
 --

 Key: HIVE-6070
 URL: https://issues.apache.org/jira/browse/HIVE-6070
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Trivial
 Attachments: HIVE-6070.patch


 See comments in HIVE-6052 - this is the followup jira





[jira] [Created] (HIVE-6077) Fixing a couple of orc unit tests on tez

2013-12-19 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-6077:


 Summary: Fixing a couple of orc unit tests on tez
 Key: HIVE-6077
 URL: https://issues.apache.org/jira/browse/HIVE-6077
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: tez-branch








[jira] [Updated] (HIVE-6047) Permanent UDFs in Hive

2013-12-19 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6047:
-

Attachment: PermanentFunctionsinHive.pdf

Attaching initial proposal

 Permanent UDFs in Hive
 --

 Key: HIVE-6047
 URL: https://issues.apache.org/jira/browse/HIVE-6047
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: PermanentFunctionsinHive.pdf


 Currently Hive only supports temporary UDFs which must be re-registered when 
 starting up a Hive session. Provide some support to register permanent UDFs 
 with Hive. 





Review Request 16412: HIVE-6048: Hive load data command rejects file with '+' in the name

2013-12-19 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16412/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-6048
https://issues.apache.org/jira/browse/HIVE-6048


Repository: hive-git


Description
---

The issue, like its older cousin, is caused by encoding/decoding of 
URI.toString(), Path.toString(), file paths, etc. We found that the best approach 
is to use a URI to represent a file, since URI.getPath() gives the correct decoded 
file path. The fix in this patch is mostly about passing the URI around so that 
accurate file info isn't lost along the way. 
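The '+' pitfall itself can be reproduced with a tiny sketch (Python here for brevity; the fix above is in Java): percent-encoding round-trips a literal '+' safely, but a form-style decoder (Java's URLDecoder, or Python's unquote_plus) turns '+' into a space and corrupts the file name.

```python
from urllib.parse import quote, unquote, unquote_plus

name = "person+age.txt"

# Percent-encoding round-trips a literal '+' safely:
assert unquote(quote(name)) == "person+age.txt"

# A form-style decoder treats '+' as an encoded space, corrupting the path:
assert unquote_plus(name) == "person age.txt"
```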


Diffs
-

  data/files/person c902284 
  data/files/person+age.txt PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java 38d97e3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 5cb492f 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java fd811f3 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java dbf3f91 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java 
33111e5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
e97d948 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java c2981e8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b9cd65c 
  ql/src/java/org/apache/hadoop/hive/ql/plan/CopyWork.java de31b21 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadDesc.java bada915 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadFileDesc.java 40adca7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java 7d555e4 
  ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q 672d5d2 
  ql/src/test/queries/clientpositive/load_hdfs_file_with_space_in_the_name.q 
d4520e2 
  ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out 
af6fd10 
  
ql/src/test/results/clientpositive/load_hdfs_file_with_space_in_the_name.q.out 
1e7fa33 
  ql/src/test/results/compiler/plan/case_sensitivity.q.xml 27d064f 
  ql/src/test/results/compiler/plan/groupby1.q.xml 00500bb 
  ql/src/test/results/compiler/plan/input1.q.xml 28a2237 
  ql/src/test/results/compiler/plan/input2.q.xml d96bfab 
  ql/src/test/results/compiler/plan/input3.q.xml 46fe7f9 
  ql/src/test/results/compiler/plan/input4.q.xml 98e28d4 
  ql/src/test/results/compiler/plan/input5.q.xml 806c3bf 
  ql/src/test/results/compiler/plan/input6.q.xml 8b2e348 
  ql/src/test/results/compiler/plan/input7.q.xml 8ae403b 
  ql/src/test/results/compiler/plan/input9.q.xml f8a2f76 
  ql/src/test/results/compiler/plan/input_testsequencefile.q.xml d8697ff 
  ql/src/test/results/compiler/plan/join1.q.xml 9e4b609 
  ql/src/test/results/compiler/plan/join2.q.xml efcb865 
  ql/src/test/results/compiler/plan/join3.q.xml 9bbe64f 
  ql/src/test/results/compiler/plan/sample2.q.xml 568cea8 
  ql/src/test/results/compiler/plan/sample3.q.xml c23313b 
  ql/src/test/results/compiler/plan/sample4.q.xml 568cea8 
  ql/src/test/results/compiler/plan/sample5.q.xml f60cb96 
  ql/src/test/results/compiler/plan/sample6.q.xml 5bb3dbc 
  ql/src/test/results/compiler/plan/sample7.q.xml 152cc08 
  ql/src/test/results/compiler/plan/subq.q.xml 8990b76 
  ql/src/test/results/compiler/plan/union.q.xml 6cab061 

Diff: https://reviews.apache.org/r/16412/diff/


Testing
---

New unit tests are added. Some old tests have newly generated output.


Thanks,

Xuefu Zhang



Review Request 16413: HIVE-5558: Allow from clause to join table sources with the `comma' token.

2013-12-19 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16413/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-5558
https://issues.apache.org/jira/browse/HIVE-5558


Repository: hive-git


Description
---

Allow from clause to join table sources with the `comma' token.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g 205604c 
  ql/src/test/queries/clientnegative/join_alt_syntax_comma_on.q PRE-CREATION 
  ql/src/test/queries/clientpositive/join_alt_syntax.q PRE-CREATION 
  ql/src/test/results/clientnegative/join_alt_syntax_comma_on.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/join_alt_syntax.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/16413/diff/


Testing
---

added new tests


Thanks,

Harish Butani



[jira] [Updated] (HIVE-5558) Support alternate join syntax

2013-12-19 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-5558:


Attachment: HIVE-5558.1.patch

 Support alternate join syntax
 -

 Key: HIVE-5558
 URL: https://issues.apache.org/jira/browse/HIVE-5558
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-5558.1.patch


 See details in HIVE-
 Allow from clause to join table sources with the `comma' token.


