Re: Review Request 16184: Hive should be able to skip header and footer rows when reading data file for a table (HIVE-5795)

2013-12-17 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16184/#review30523
---



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
https://reviews.apache.org/r/16184/#comment58465

skipHeader and initializeFooterBuf can be moved to a common util class and 
shared. We just need to pass the member variables as additional params.
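
A minimal sketch of the kind of shared helper this suggests, assuming the 
mapred RecordReader API; the class and method names are illustrative, not 
taken from the patch:

{code}
// Hypothetical shared helper; assumes the org.apache.hadoop.mapred API.
import java.io.IOException;
import org.apache.hadoop.mapred.RecordReader;

public final class HeaderSkipUtils {
  // Consume headerCount records from the start of the reader; returns how
  // many header rows could not be skipped (0 when the file had enough rows).
  public static <K, V> int skipHeader(RecordReader<K, V> reader,
      int headerCount, K key, V value) throws IOException {
    while (headerCount > 0 && reader.next(key, value)) {
      headerCount--;
    }
    return headerCount;
  }
}
{code}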



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
https://reviews.apache.org/r/16184/#comment58463

Code such as this block for parsing the header count can be moved to a util 
class and shared between the two places.
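
A sketch of what such a shared parsing helper could look like; it assumes the 
header count is carried as the table property skip.header.line.count (per this 
patch's new serde constants), and the class name is invented for illustration:

{code}
// Hypothetical shared helper for reading the header count from table
// properties; assumes the "skip.header.line.count" property name.
import java.util.Properties;

public final class HeaderCountUtils {
  public static int getHeaderCount(Properties tblProps) {
    try {
      // Missing or malformed values fall back to 0 (no header rows).
      return Integer.parseInt(
          tblProps.getProperty("skip.header.line.count", "0"));
    } catch (NumberFormatException e) {
      return 0;
    }
  }
}
{code}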



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
https://reviews.apache.org/r/16184/#comment58464

Code such as this block for parsing the header count can be moved to a util 
class and shared between the two places.



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
https://reviews.apache.org/r/16184/#comment58466

The logic of this block also looks the same in two places; can we move it to a 
common util function?


- Thejas Nair


On Dec. 11, 2013, 9:19 p.m., Shuaishuai Nie wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/16184/
 ---
 
 (Updated Dec. 11, 2013, 9:19 p.m.)
 
 
 Review request for hive, Eric Hanson and Thejas Nair.
 
 
 Bugs: hive-5795
 https://issues.apache.org/jira/browse/hive-5795
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Hive should be able to skip header and footer rows when reading data file for 
 a table
 (follow up with review https://reviews.apache.org/r/15663/diff/#index_header)
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java fa3e048 
   conf/hive-default.xml.template c61a0bb 
   data/files/header_footer_table_1/0001.txt PRE-CREATION 
   data/files/header_footer_table_1/0002.txt PRE-CREATION 
   data/files/header_footer_table_1/0003.txt PRE-CREATION 
   data/files/header_footer_table_2/2012/01/01/0001.txt PRE-CREATION 
   data/files/header_footer_table_2/2012/01/02/0002.txt PRE-CREATION 
   data/files/header_footer_table_2/2012/01/03/0003.txt PRE-CREATION 
   itests/qtest/pom.xml c3cbb89 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java d2b2526 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java 
 dd5cb6b 
   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 974a5d6 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/TestHiveBinarySearchRecordReader.java
  85dd975 
   ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 
 0686d9b 
   ql/src/test/queries/clientnegative/file_with_header_footer_negative.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/file_with_header_footer.q PRE-CREATION 
   ql/src/test/results/clientnegative/file_with_header_footer_negative.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/file_with_header_footer.q.out 
 PRE-CREATION 
   serde/if/serde.thrift 2ceb572 
   
 serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java
  22a6168 
 
 Diff: https://reviews.apache.org/r/16184/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Shuaishuai Nie
 




[jira] [Commented] (HIVE-5891) Alias conflict when merging multiple mapjoin tasks into their common child mapred task

2013-12-17 Thread Sun Rui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850246#comment-13850246
 ] 

Sun Rui commented on HIVE-5891:
---

[~yhuai] I think we can leave $INTNAME as is for this issue. Do you have any 
further comments? If not, I can prepare a new patch for review.

 Alias conflict when merging multiple mapjoin tasks into their common child 
 mapred task
 --

 Key: HIVE-5891
 URL: https://issues.apache.org/jira/browse/HIVE-5891
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Sun Rui
Assignee: Sun Rui
 Attachments: HIVE-5891.1.patch


 Use the following test case with HIVE 0.12:
 {quote}
 create table src(key int, value string);
 load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
 select * from (
   select c.key from
 (select a.key from src a join src b on a.key=b.key group by a.key) tmp
 join src c on tmp.key=c.key
   union all
   select c.key from
 (select a.key from src a join src b on a.key=b.key group by a.key) tmp
 join src c on tmp.key=c.key
 ) x;
 {quote}
 We will get a NullPointerException from Union Operator:
 {quote}
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing row {_col0:0}
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row {_col0:0}
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:544)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
   ... 4 more
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:120)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:652)
   at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:655)
   at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:220)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
   ... 5 more
 {quote}
   
 The root cause is in 
 CommonJoinTaskDispatcher.mergeMapJoinTaskIntoItsChildMapRedTask().
   +--------------+      +--------------+
   | MapJoin task |      | MapJoin task |
   +--------------+      +--------------+
           \                  /
            \                /
           +------------------+
           |    Union task    |
           +------------------+
  
 CommonJoinTaskDispatcher merges the two MapJoin tasks into their common 
 child: Union task. The two MapJoin tasks have the same alias name for their 
 big tables: $INTNAME, which is the name of the temporary table of a join 
 stream. The aliasToWork map uses alias as key, so eventually only the MapJoin 
 operator tree of one MapJoin task is saved into the aliasToWork map of the 
 Union task, while the MapJoin operator tree of the other MapJoin task is lost. 
 As a result, the Union operator won't be initialized because not all of its 
 parents get initialized (the Union operator itself indicates it has two 
 parents, but it actually has only one parent because the other is lost).
 This issue does not exist in HIVE 0.11 and thus is a regression bug in HIVE 
 0.12.
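 As a rough standalone illustration of the collision (hypothetical demo code, 
 not from Hive source):
{code}
// Demo only: a map keyed by alias keeps one value per key, so the second
// merged MapJoin tree registered under "$INTNAME" silently replaces the first.
import java.util.HashMap;
import java.util.Map;

public class AliasCollisionDemo {
  public static void main(String[] args) {
    Map<String, String> aliasToWork = new HashMap<String, String>();
    aliasToWork.put("$INTNAME", "MapJoin operator tree #1");
    aliasToWork.put("$INTNAME", "MapJoin operator tree #2"); // overwrites #1
    // Prints 1: only one parent tree survives, so the Union operator sees
    // a single initialized parent instead of the two it expects.
    System.out.println(aliasToWork.size());
  }
}
{code}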
 The proposed solution is to use the query ID as a prefix for the join stream 
 name to avoid conflict and add sanity check code in

[jira] [Created] (HIVE-6041) Incorrect task dependency graph for skewed join optimization

2013-12-17 Thread Adrian Popescu (JIRA)
Adrian Popescu created HIVE-6041:


 Summary: Incorrect task dependency graph for skewed join 
optimization
 Key: HIVE-6041
 URL: https://issues.apache.org/jira/browse/HIVE-6041
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: Hadoop 1.0.3
Reporter: Adrian Popescu
Priority: Critical


The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all tasks following the common join are 
filtered out.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task, which maintains all 
these dependencies, but when the map join task is filtered out (i.e., no skewed 
keys exist), all these dependencies are lost. Hence, all the other task stages 
of the query are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function,
immediately after the ConditionalTask is created and its dependencies are set.
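
A rough sketch of the missing wiring, using Hive's Task API 
(getChildTasks/addDependentTask); this is illustrative, with invented names, 
and is not the actual patch:

{code}
// Illustrative sketch, not the actual patch: after the ConditionalTask is
// built in processSkewJoin(), the original children of the common join task
// must also be attached to it, or they never run when the map join branch
// is filtered out at runtime.
import java.io.Serializable;
import java.util.List;
import org.apache.hadoop.hive.ql.exec.Task;

public final class SkewJoinWiringSketch {
  public static void reattachChildren(Task<? extends Serializable> joinTask,
      Task<? extends Serializable> conditionalTask) {
    List<Task<? extends Serializable>> children = joinTask.getChildTasks();
    if (children != null) {
      for (Task<? extends Serializable> child : children) {
        conditionalTask.addDependentTask(child);
      }
    }
  }
}
{code}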



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6041) Incorrect task dependency graph for skewed join optimization

2013-12-17 Thread Adrian Popescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Popescu updated HIVE-6041:
-

Description: 
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all tasks following the common join are 
filtered out.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which maintains all these 
dependencies, but for the case the map join task is filtered out (i.e., no 
skewed keys exist), all these dependencies are lost. Hence, all the other task 
stages of the query are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function,
immediately after the ConditionalTask is created and its dependencies are set.

  was:
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all tasks following the common join are 
filtered out.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which
maintains all these dependencies, but for the case the map join task is 
filtered out (i.e., no skewed keys exist), all these dependencies are lost. 
Hence, all the other task stages of the query are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function,
immediately after the ConditionalTask is created and its dependencies are set.


 Incorrect task dependency graph for skewed join optimization
 

 Key: HIVE-6041
 URL: https://issues.apache.org/jira/browse/HIVE-6041
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: Hadoop 1.0.3
Reporter: Adrian Popescu
Priority: Critical

 The dependency graph among task stages is incorrect for the skewed join 
 optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. 
 For the case that skewed keys do not exist, all tasks following the common 
 join are filtered out.
 In particular, the conditional task in the optimized plan maintains no 
 dependency with the child tasks of the common join task in the original plan. 
 The conditional task is composed of the map join task which maintains all 
 these dependencies, but for the case the map join task is filtered out (i.e., 
 no skewed keys exist), all these dependencies are lost. Hence, all the other 
 task stages of the query are skipped.
 The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
 processSkewJoin() function,
 immediately after the ConditionalTask is created and its dependencies are set.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6041) Incorrect task dependency graph for skewed join optimization

2013-12-17 Thread Adrian Popescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Popescu updated HIVE-6041:
-

Description: 
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all tasks following the common join are 
filtered out.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which maintains all these 
dependencies, but for the case the map join task is filtered out (i.e., no 
skewed keys exist), all these dependencies are lost. Hence, all the other task 
stages of the query are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function, immediately after the ConditionalTask is created 
and its dependencies are set.

  was:
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all tasks following the common join are 
filtered out.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which maintains all these 
dependencies, but for the case the map join task is filtered out (i.e., no 
skewed keys exist), all these dependencies are lost. Hence, all the other task 
stages of the query are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function,
immediately after the ConditionalTask is created and its dependencies are set.


 Incorrect task dependency graph for skewed join optimization
 

 Key: HIVE-6041
 URL: https://issues.apache.org/jira/browse/HIVE-6041
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: Hadoop 1.0.3
Reporter: Adrian Popescu
Priority: Critical

 The dependency graph among task stages is incorrect for the skewed join 
 optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. 
 For the case that skewed keys do not exist, all tasks following the common 
 join are filtered out.
 In particular, the conditional task in the optimized plan maintains no 
 dependency with the child tasks of the common join task in the original plan. 
 The conditional task is composed of the map join task which maintains all 
 these dependencies, but for the case the map join task is filtered out (i.e., 
 no skewed keys exist), all these dependencies are lost. Hence, all the other 
 task stages of the query are skipped.
 The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
 processSkewJoin() function, immediately after the ConditionalTask is created 
 and its dependencies are set.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6041) Incorrect task dependency graph for skewed join optimization

2013-12-17 Thread Adrian Popescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Popescu updated HIVE-6041:
-

Description: 
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all the tasks following the common join 
are filtered out at runtime.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which maintains all these 
dependencies, but for the case the map join task is filtered out (i.e., no 
skewed keys exist), all these dependencies are lost. Hence, all the other task 
stages of the query are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function, immediately after the ConditionalTask is created 
and its dependencies are set.

  was:
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all tasks following the common join are 
filtered out.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which maintains all these 
dependencies, but for the case the map join task is filtered out (i.e., no 
skewed keys exist), all these dependencies are lost. Hence, all the other task 
stages of the query are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function, immediately after the ConditionalTask is created 
and its dependencies are set.


 Incorrect task dependency graph for skewed join optimization
 

 Key: HIVE-6041
 URL: https://issues.apache.org/jira/browse/HIVE-6041
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: Hadoop 1.0.3
Reporter: Adrian Popescu
Priority: Critical

 The dependency graph among task stages is incorrect for the skewed join 
 optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. 
 For the case that skewed keys do not exist, all the tasks following the 
 common join are filtered out at runtime.
 In particular, the conditional task in the optimized plan maintains no 
 dependency with the child tasks of the common join task in the original plan. 
 The conditional task is composed of the map join task which maintains all 
 these dependencies, but for the case the map join task is filtered out (i.e., 
 no skewed keys exist), all these dependencies are lost. Hence, all the other 
 task stages of the query are skipped.
 The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
 processSkewJoin() function, immediately after the ConditionalTask is created 
 and its dependencies are set.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6041) Incorrect task dependency graph for skewed join optimization

2013-12-17 Thread Adrian Popescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Popescu updated HIVE-6041:
-

Description: 
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all the tasks following the common join 
are filtered out at runtime.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which maintains all these 
dependencies, but for the case the map join task is filtered out (i.e., no 
skewed keys exist), all these dependencies are lost. Hence, all the other task 
stages of the query (e.g., move stage which writes down the results into the 
result table) are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function, immediately after the ConditionalTask is created 
and its dependencies are set.

  was:
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all the tasks following the common join 
are filtered out at runtime.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which maintains all these 
dependencies, but for the case the map join task is filtered out (i.e., no 
skewed keys exist), all these dependencies are lost. Hence, all the other task 
stages of the query (e.g., move stage which writes the results into the result 
table) are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function, immediately after the ConditionalTask is created 
and its dependencies are set.


 Incorrect task dependency graph for skewed join optimization
 

 Key: HIVE-6041
 URL: https://issues.apache.org/jira/browse/HIVE-6041
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: Hadoop 1.0.3
Reporter: Adrian Popescu
Priority: Critical

 The dependency graph among task stages is incorrect for the skewed join 
 optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. 
 For the case that skewed keys do not exist, all the tasks following the 
 common join are filtered out at runtime.
 In particular, the conditional task in the optimized plan maintains no 
 dependency with the child tasks of the common join task in the original plan. 
 The conditional task is composed of the map join task which maintains all 
 these dependencies, but for the case the map join task is filtered out (i.e., 
 no skewed keys exist), all these dependencies are lost. Hence, all the other 
 task stages of the query (e.g., move stage which writes down the results into 
 the result table) are skipped.
 The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
 processSkewJoin() function, immediately after the ConditionalTask is created 
 and its dependencies are set.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6041) Incorrect task dependency graph for skewed join optimization

2013-12-17 Thread Adrian Popescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Popescu updated HIVE-6041:
-

Description: 
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all the tasks following the common join 
are filtered out at runtime.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which maintains all these 
dependencies, but for the case the map join task is filtered out (i.e., no 
skewed keys exist), all these dependencies are lost. Hence, all the other task 
stages of the query (e.g., move stage which writes the results into the result 
table) are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function, immediately after the ConditionalTask is created 
and its dependencies are set.

  was:
The dependency graph among task stages is incorrect for the skewed join 
optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. For 
the case that skewed keys do not exist, all the tasks following the common join 
are filtered out at runtime.

In particular, the conditional task in the optimized plan maintains no 
dependency with the child tasks of the common join task in the original plan. 
The conditional task is composed of the map join task which maintains all these 
dependencies, but for the case the map join task is filtered out (i.e., no 
skewed keys exist), all these dependencies are lost. Hence, all the other task 
stages of the query are skipped.

The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
processSkewJoin() function, immediately after the ConditionalTask is created 
and its dependencies are set.


 Incorrect task dependency graph for skewed join optimization
 

 Key: HIVE-6041
 URL: https://issues.apache.org/jira/browse/HIVE-6041
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: Hadoop 1.0.3
Reporter: Adrian Popescu
Priority: Critical

 The dependency graph among task stages is incorrect for the skewed join 
 optimized plan. Skewed joins are enabled through hive.optimize.skewjoin. 
 For the case that skewed keys do not exist, all the tasks following the 
 common join are filtered out at runtime.
 In particular, the conditional task in the optimized plan maintains no 
 dependency with the child tasks of the common join task in the original plan. 
 The conditional task is composed of the map join task which maintains all 
 these dependencies, but for the case the map join task is filtered out (i.e., 
 no skewed keys exist), all these dependencies are lost. Hence, all the other 
 task stages of the query (e.g., move stage which writes the results into the 
 result table) are skipped.
 The bug resides in ql/optimizer/physical/GenMRSkewJoinProcessor.java, 
 processSkewJoin() function, immediately after the ConditionalTask is created 
 and its dependencies are set.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HIVE-6042) With Dynamic partitioning, All partitions can not be overwrited

2013-12-17 Thread ruish li (JIRA)
ruish li created HIVE-6042:
--

 Summary: With Dynamic partitioning, All partitions can not be 
overwrited
 Key: HIVE-6042
 URL: https://issues.apache.org/jira/browse/HIVE-6042
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
 Environment: OS: Red Hat Enterprise Linux Server release 6.2
HDFS: CDH-4.2.1
MAPRED: CDH-4.2.1-mr1
Reporter: ruish li
Priority: Minor


step1: create table 
 drop table if exists t;
 create table t(a int) PARTITIONED BY (city_ string);
step2: insert data (table dual has only one value: 'x')
   set hive.exec.dynamic.partition.mode=nonstrict; 
   insert into table t partition(city_) select 1,'beijing' from dual; 
   insert into table t partition(city_) select 2,'shanghai' from dual;

  hive (default) > select * from t;
1   beijing
2   shanghai

step3: overwrite table, and we can see that
 insert overwrite table t partition(city_) select 3,'beijing' from dual;
 hive (default) > select * from t;
1   beijing
2   shanghai

Here we can see the partition city_=shanghai still exists, but we expect this 
partition to be overwritten with dynamic partitioning.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6042) With Dynamic partitioning, All partitions can not be overwrited

2013-12-17 Thread ruish li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ruish li updated HIVE-6042:
---

Description: 
step1: create table 
 drop table if exists t;
 create table t(a int) PARTITIONED BY (city_ string);
step2: insert data (table dual has only one value: 'x')
   set hive.exec.dynamic.partition.mode=nonstrict; 
   insert into table t partition(city_) select 1,'beijing' from dual; 
   insert into table t partition(city_) select 2,'shanghai' from dual;

  hive (default) > select * from t;
1   beijing
2   shanghai

step3: overwrite table 
 insert overwrite table t partition(city_) select 3,'beijing' from dual;
 hive (default) > select * from t;
1   beijing
2   shanghai

Here we can see the partition city_=shanghai still exists, but we expect this 
partition to be overwritten with dynamic partitioning.

  was:
step1: create table 
 drop table if exists t;
 create table t(a int) PARTITIONED BY (city_ string);
step2: insert data (table dual has only one value: 'x')
   set hive.exec.dynamic.partition.mode=nonstrict; 
   insert into table t partition(city_) select 1,'beijing' from dual; 
   insert into table t partition(city_) select 2,'shanghai' from dual;

  hive (default) > select * from t;
1   beijing
2   shanghai

step3: overwrite table, we can show that
 insert overwrite table t partition(city_) select 3,'beijing' from dual;
 hive (default) > select * from t;
1   beijing
2   shanghai

Here we can see the partition city_=shanghai still exists, but we expect this 
partition to be overwritten with dynamic partitioning.


 With Dynamic partitioning, All partitions can not be overwrited
 ---

 Key: HIVE-6042
 URL: https://issues.apache.org/jira/browse/HIVE-6042
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
 Environment: OS: Red Hat Enterprise Linux Server release 6.2
 HDFS: CDH-4.2.1
 MAPRED: CDH-4.2.1-mr1
Reporter: ruish li
Priority: Minor

 step1: create table 
  drop table if exists t;
  create table t(a int) PARTITIONED BY (city_ string);
 step2: insert data (table dual has only one value: 'x')
  set hive.exec.dynamic.partition.mode=nonstrict; 
  insert into table t partition(city_) select 1,'beijing' from dual; 
  insert into table t partition(city_) select 2,'shanghai' from dual;
  hive (default) > select * from t;
 1 beijing
 2 shanghai
 step3: overwrite table 
  insert overwrite table t partition(city_) select 3,'beijing' from dual;
  hive (default) > select * from t;
 1 beijing
 2 shanghai
 Here we can see the partition city_=shanghai still exists, but we expect this 
 partition to be overwritten with dynamic partitioning.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6043) Document incompatible changes in Hive 0.12 and trunk

2013-12-17 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6043:
---

Summary: Document incompatible changes in Hive 0.12 and trunk  (was: 
Document incompatible changes)

 Document incompatible changes in Hive 0.12 and trunk
 

 Key: HIVE-6043
 URL: https://issues.apache.org/jira/browse/HIVE-6043
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Priority: Blocker

 We need to document incompatible changes. For example
 * HIVE-5372 changed object inspector hierarchy breaking most if not all 
 custom serdes
 * HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo, breaking all 
 custom serdes (fixed by HIVE-5380)
 * Hive 0.12 separates MapredWork into MapWork and ReduceWork which is used by 
 Serdes
 * HIVE-5411 serializes expressions with Kryo which are used by custom serdes



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HIVE-6043) Document incompatible changes

2013-12-17 Thread Brock Noland (JIRA)
Brock Noland created HIVE-6043:
--

 Summary: Document incompatible changes
 Key: HIVE-6043
 URL: https://issues.apache.org/jira/browse/HIVE-6043
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Priority: Blocker


We need to document incompatible changes. For example

* HIVE-5372 changed object inspector hierarchy breaking most if not all custom 
serdes
* HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo, breaking all 
custom serdes (fixed by HIVE-5380)
* Hive 0.12 separates MapredWork into MapWork and ReduceWork which is used by 
Serdes
* HIVE-5411 serializes expressions with Kryo which are used by custom serdes




--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-5380) Non-default OI constructors should be supported if for backwards compatibility

2013-12-17 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5380:
---

Attachment: HIVE-5380.patch

[~xuefuz], can you take a look at this?

 Non-default OI constructors should be supported if for backwards compatibility
 --

 Key: HIVE-5380
 URL: https://issues.apache.org/jira/browse/HIVE-5380
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-5380.patch, HIVE-5380.patch


 In HIVE-5263 we started serializing OI's when cloning the plan. This was a 
 great boost in speed for many queries. In the future we'd like to stop 
 copying the OI's, perhaps in HIVE-4396.
 Until then Custom Serdes will not work on trunk. This is a fix to allow 
 custom serdes such as the Hive JSON Serde to work until we address the fact 
 that we don't want to have to copy the OI's.
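 For context, a minimal sketch of the generic Kryo 2.x mechanism for 
 deserializing classes that lack a no-arg constructor (via Objenesis); this 
 illustrates the problem space and is not the actual HIVE-5380 patch:
{code}
// Generic Kryo 2.x setup, shown for context (not the HIVE-5380 patch):
// fall back to Objenesis so classes without a zero-arg constructor, such
// as ObjectInspectors with only non-default constructors, can still be
// instantiated during deserialization.
import com.esotericsoftware.kryo.Kryo;
import org.objenesis.strategy.StdInstantiatorStrategy;

public class KryoSetupSketch {
  public static Kryo newKryo() {
    Kryo kryo = new Kryo();
    kryo.setInstantiatorStrategy(new StdInstantiatorStrategy());
    return kryo;
  }
}
{code}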



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5380) Non-default OI constructors should be supported if for backwards compatibility

2013-12-17 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850543#comment-13850543
 ] 

Brock Noland commented on HIVE-5380:


Uploaded new patch based on kryo-2.22.

 Non-default OI constructors should be supported if for backwards compatibility
 --

 Key: HIVE-5380
 URL: https://issues.apache.org/jira/browse/HIVE-5380
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-5380.patch, HIVE-5380.patch


 In HIVE-5263 we started serializing OI's when cloning the plan. This was a 
 great boost in speed for many queries. In the future we'd like to stop 
 copying the OI's, perhaps in HIVE-4396.
 Until then Custom Serdes will not work on trunk. This is a fix to allow 
 custom serdes such as the Hive JSON Serde to work until we address the fact 
 that we don't want to have to copy the OI's.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


Incompatible Changes affecting Serdes and UDFS

2013-12-17 Thread Brock Noland
Hi,

Hive 0.12 made some incompatible changes which impact Serdes, and it
appears 0.13 makes more incompatible changes. I created HIVE-6043 to track
this; if you know of any more changes than what is described there, please
do add them.

Thanks!
Brock


[jira] [Updated] (HIVE-6029) Add default authorization on database/table creation

2013-12-17 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6029:
---

Status: Patch Available  (was: Open)

Submitting patch for testing.

 Add default authorization on database/table creation
 

 Key: HIVE-6029
 URL: https://issues.apache.org/jira/browse/HIVE-6029
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Metastore
Affects Versions: 0.10.0
Reporter: Chris Drome
Assignee: Chris Drome
Priority: Minor
 Attachments: HIVE-6029-1.patch.txt, HIVE-6029.2.patch


 Default authorization privileges are not set when a database/table is 
 created. This allows a user to create a database/table and then be unable to 
 access it through Sentry.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6029) Add default authorization on database/table creation

2013-12-17 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6029:
---

Attachment: HIVE-6029.2.patch

[~cdrome] I rebased the patch on trunk. How does it look?

 Add default authorization on database/table creation
 

 Key: HIVE-6029
 URL: https://issues.apache.org/jira/browse/HIVE-6029
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Metastore
Affects Versions: 0.10.0
Reporter: Chris Drome
Assignee: Chris Drome
Priority: Minor
 Attachments: HIVE-6029-1.patch.txt, HIVE-6029.2.patch


 Default authorization privileges are not set when a database/table is 
 created. This allows a user to create a database/table and then be unable to 
 access it through Sentry.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive

2013-12-17 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850557#comment-13850557
 ] 

Brock Noland commented on HIVE-5783:


Thanks Remus for creating HIVE-5998.

Eric, I think the current patch is stale since it's been decided the Parquet 
Serde will be contributed to Hive.

 Native Parquet Support in Hive
 --

 Key: HIVE-5783
 URL: https://issues.apache.org/jira/browse/HIVE-5783
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Justin Coffey
Assignee: Justin Coffey
Priority: Minor
 Attachments: HIVE-5783.patch, hive-0.11-parquet.patch


 Problem Statement:
 Hive would be easier to use if it had native Parquet support. Our 
 organization, Criteo, uses Hive extensively. Therefore we built the Parquet 
 Hive integration and would like to now contribute that integration to Hive.
 About Parquet:
 Parquet is a columnar storage format for Hadoop and integrates with many 
 Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
 Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
 Parquet integration.
 Changes Details:
 Parquet was built with dependency management in mind and therefore only a 
 single Parquet jar will be added as a dependency.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-5812) HiveServer2 SSL connection transport binds to loopback address by default

2013-12-17 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5812:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Thank you for the contribution Prasad! I have committed this to trunk.

 HiveServer2 SSL connection transport binds to loopback address by default
 -

 Key: HIVE-5812
 URL: https://issues.apache.org/jira/browse/HIVE-5812
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.13.0

 Attachments: HIVE-5812.1.patch, HIVE-5812.2.patch


 The secure socket transport implemented as part of HIVE-5351 binds to the 
 loopback address by default. The bind interface gets used only if it is 
 explicitly defined in hive-site or via the environment.
 This behavior should be the same as for the non-SSL transport.
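 To make the bind-address difference concrete, an illustrative sketch using 
 Thrift's TSSLTransportFactory (the helper class, keystore path, and password 
 are placeholders; this is not the actual patch):
{code}
// Illustrative sketch (not the actual patch): the short getServerSocket
// overloads bind to loopback, so the configured bind host has to be passed
// explicitly to match the non-SSL transport's behavior.
import java.net.InetAddress;
import org.apache.thrift.transport.TSSLTransportFactory;
import org.apache.thrift.transport.TServerSocket;

public class SslBindSketch {
  public static TServerSocket open(int port, String bindHost) throws Exception {
    TSSLTransportFactory.TSSLTransportParameters params =
        new TSSLTransportFactory.TSSLTransportParameters();
    params.setKeyStore("/path/to/keystore.jks", "changeit"); // placeholders
    // Passing the interface address explicitly avoids the loopback default;
    // TSSLTransportFactory.getServerSocket(port) would bind loopback only.
    return TSSLTransportFactory.getServerSocket(
        port, 10000, InetAddress.getByName(bindHost), params);
  }
}
{code}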



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5928) Add a hive authorization plugin api that does not assume privileges needed

2013-12-17 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850574#comment-13850574
 ] 

Brock Noland commented on HIVE-5928:


bq.  interface HiveBaseAuthorizationProvider
bq.  There will be a subclass of HiveBaseAuthorizationProvider 

Since it doesn't look like we have implemented this yet...may I interject some 
thoughts? I think we should start moving Hive development from inheritance to 
composition where possible [1]. This looks like a great place to start.

[1] http://en.wikipedia.org/wiki/Composition_over_inheritance
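
To make the composition idea concrete, a hypothetical sketch (names invented 
for illustration, not a proposed Hive API):

{code}
// Hypothetical sketch of composition over inheritance (names invented for
// illustration, not a proposed Hive API): the legacy-style provider holds a
// reference to the new interface instead of subclassing it.
interface HiveBaseAuthorizationProvider {
  void authorize(Object hiveObject, String action) throws Exception;
}

class LegacyCompatibilityProvider {
  private final HiveBaseAuthorizationProvider delegate;

  LegacyCompatibilityProvider(HiveBaseAuthorizationProvider delegate) {
    this.delegate = delegate;
  }

  void authorizeRead(Object table) throws Exception {
    // Legacy privilege assumptions are translated into plain
    // (object, action) calls on the composed provider.
    delegate.authorize(table, "SELECT");
  }
}
{code}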

 Add a hive authorization plugin api that does not assume privileges needed
 --

 Key: HIVE-5928
 URL: https://issues.apache.org/jira/browse/HIVE-5928
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
   Original Estimate: 120h
  Remaining Estimate: 120h

 The existing HiveAuthorizationProvider interface implementations can be used 
 to support custom authorization models.
 But this interface limits the customization for these reasons -
 1. It has assumptions about the privileges required for an action.
 2. It does not have functions that you can implement for custom ways of 
 performing the actions of access control statements.
 This jira proposes a new interface HiveBaseAuthorizationProvider that does 
 not make assumptions about the privileges required for the actions. The 
 authorize() functions will be the equivalent of authorize(<hive object>, 
 <action>). It will also have functions that will be called from the access 
 control statements.
 The current HiveAuthorizationProvider will continue to be supported for 
 backward compatibility. There will be a subclass of 
 HiveBaseAuthorizationProvider that executes actions using this interface.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-4887) hive should have an option to disable non sql commands that impose security risk

2013-12-17 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850583#comment-13850583
 ] 

Brock Noland commented on HIVE-4887:


bq. It should be possible to disable create function as well.

I would kindly suggest the following:

1) Have a whitelist of UDFs which can be used when authorization is enabled, as 
some UDFs are insecure by default - java_method() or transform(). (A rough 
sketch of this idea follows below.)
2) Add a URI privilege where admins can give users permission to use vetted 
jars. Then when someone creates a UDF you can verify the class exists in a jar 
they have privilege to access.
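
A rough sketch of suggestion (1); the class is hypothetical and the whitelist 
source (a comma-separated config value) is an assumption:

{code}
// Hypothetical whitelist check (invented for illustration, not a Hive API):
// only UDF names on the configured list may be used when authorization is
// enabled; insecure-by-default functions like java_method() or transform()
// are simply left off the list.
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class UdfWhitelistSketch {
  private final Set<String> allowed;

  public UdfWhitelistSketch(String commaSeparatedNames) {
    this.allowed = new HashSet<String>(
        Arrays.asList(commaSeparatedNames.toLowerCase().split("\\s*,\\s*")));
  }

  public boolean isAllowed(String udfName) {
    return allowed.contains(udfName.toLowerCase());
  }
}
{code}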

 hive should have an option to disable non sql commands that impose security 
 risk
 

 Key: HIVE-4887
 URL: https://issues.apache.org/jira/browse/HIVE-4887
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, Security
Reporter: Thejas M Nair
   Original Estimate: 72h
  Remaining Estimate: 72h

 Hive's RDBMS style of authorization (using grant/revoke) relies on all data 
 access being done through hive select queries. But hive also supports running 
 dfs commands, shell commands (e.g., !cat file), and shell commands through 
 hive streaming.
 This creates problems in securing a hive server using this authorization 
 model. UDF is another way to write custom code that can compromise security, 
 but you can control that by restricting user access to be only through a 
 jdbc connection to hive server (2).
 (note that there are other major problems such as this one - HIVE-3271)



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2013-12-17 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850586#comment-13850586
 ] 

Brock Noland commented on HIVE-5837:


[~thejas],

As I mentioned 
[here|https://issues.apache.org/jira/browse/HIVE-4887?focusedCommentId=13850583&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13850583]
 I would consider adding a URI privilege to the model described here. This 
allows users to make use of custom UDFs. Beyond that I think a SERVER privilege 
should be added as well. The reason I believe a server privilege is useful is 
because large deployments of Hive would like to take advantage of multiple HS2 
instances while allowing users to only access a single instance. What are your 
thoughts on these topics?

 SQL standard based secure authorization for hive
 

 Key: HIVE-5837
 URL: https://issues.apache.org/jira/browse/HIVE-5837
 Project: Hive
  Issue Type: New Feature
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: SQL standard authorization hive.pdf


 The current default authorization is incomplete and not secure. The 
 alternative of storage based authorization provides security but does not 
 provide fine grained authorization.
 The proposal is to support secure fine grained authorization in hive using 
 SQL standard based authorization model.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5380) Non-default OI constructors should be supported if for backwards compatibility

2013-12-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850627#comment-13850627
 ] 

Hive QA commented on HIVE-5380:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619113/HIVE-5380.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4789 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/667/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/667/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619113

 Non-default OI constructors should be supported if for backwards compatibility
 --

 Key: HIVE-5380
 URL: https://issues.apache.org/jira/browse/HIVE-5380
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-5380.patch, HIVE-5380.patch


 In HIVE-5263 we started serializing OI's when cloning the plan. This was a 
 great boost in speed for many queries. In the future we'd like to stop 
 copying the OI's, perhaps in HIVE-4396.
 Until then Custom Serdes will not work on trunk. This is a fix to allow 
 custom serdes such as the Hive JSON Serde to work until we address the fact 
 that we don't want to have to copy the OI's.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


Hive-trunk-h0.21 - Build # 2508 - Still Failing

2013-12-17 Thread Apache Jenkins Server
Changes for Build #2472
[navis] HIVE-4518 : Should be removed files (OptrStatsGroupByHook, etc.)

[navis] HIVE-5839 : BytesRefArrayWritable compareTo violates contract (Xuefu 
Zhang via Navis)

[navis] HIVE-4518 : Missing file (HiveFatalException)

[navis] HIVE-4518 : Counter Strike: Operation Operator (Gunther Hagleitner and 
Jason Dere via Navis)


Changes for Build #2473
[brock] HIVE-4741 - Add Hive config API to modify the restrict list (Prasad 
Mujumdar, Navis via Brock Noland)


Changes for Build #2474
[navis] HIVE-5827 : Incorrect location of logs for failed tests (Vikram Dixit K 
and Szehon Ho via Navis)

[thejas] HIVE-4485 : beeline prints null as empty strings (Thejas Nair reviewed 
by Ashutosh Chauhan)

[brock] HIVE-5704 - A couple of generic UDFs are not in the right 
folder/package (Xuefu Zhang via Brock Noland)

[brock] HIVE-5706 - Move a few numeric UDFs to generic implementations (Xuefu 
Zhang via Brock Noland)

[hashutosh] HIVE-5817 : column name to index mapping in VectorizationContext is 
broken (Remus Rusanu, Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5876 : Split elimination in ORC breaks for partitioned tables 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5886 : [Refactor] Remove unused class JobCloseFeedback 
(Ashutosh Chauhan via Thejas Nair)

[brock] HIVE-5894 - Fix minor PTest2 issues (Brock Noland)


Changes for Build #2475
[brock] HIVE-5755 - Fix hadoop2 execution environment Milestone 1 (Vikram Dixit 
K via Brock Noland)


Changes for Build #2476
[xuefu] HIVE-5893: hive-schema-0.13.0.mysql.sql contains reference to 
nonexistent column (Carl via Xuefu)

[xuefu] HIVE-5684: Serde support for char (Jason via Xuefu)


Changes for Build #2477

Changes for Build #2478

Changes for Build #2479

Changes for Build #2480
[brock] HIVE-5441 - Async query execution doesn't return resultset status 
(Prasad Mujumdar via Thejas M Nair)

[brock] HIVE-5880 - Rename HCatalog HBase Storage Handler artifact id (Brock 
Noland reviewed by Prasad Mujumdar)


Changes for Build #2481

Changes for Build #2482
[ehans] HIVE-5581: Implement vectorized year/month/day... etc. for string 
arguments (Teddy Choi via Eric Hanson)


Changes for Build #2483
[rhbutani] HIVE-5898 Make fetching of column statistics configurable (Prasanth 
Jayachandran via Harish Butani)


Changes for Build #2484
[brock] HIVE-5880 - (Rename HCatalog HBase Storage Handler artifact id) breaks 
packaging (Xuefu Zhang via Brock Noland)


Changes for Build #2485
[xuefu] HIVE-5866: Hive divide operator generates wrong results in certain 
cases (reviewed by Prasad)

[ehans] HIVE-5877: Implement vectorized support for IN as boolean-valued 
expression (Eric Hanson)


Changes for Build #2486
[ehans] HIVE-5895: vectorization handles division by zero differently from 
normal execution (Sergey Shelukhin via Eric Hanson)

[hashutosh] HIVE-5938 : Remove apache.mina dependency for test (Navis via 
Ashutosh Chauhan)

[xuefu] HIVE-5912: Show partition command doesn't support db.table (Yu Zhao via 
Xuefu)

[brock] HIVE-5906 - TestGenericUDFPower should use delta to compare doubles 
(Szehon Ho via Brock Noland)

[brock] HIVE-5855 - Add deprecated methods back to ColumnProjectionUtils (Brock 
Noland reviewed by Navis)

[brock] HIVE-5915 - Shade Kryo dependency (Brock Noland reviewed by Ashutosh 
Chauhan)


Changes for Build #2487
[hashutosh] HIVE-5916 : No need to aggregate statistics collected via counter 
mechanism (Ashutosh Chauhan via Navis)

[xuefu] HIVE-5947: Fix test failure in decimal_udf.q (reviewed by Brock)

[thejas] HIVE-5550 : Import fails for tables created with default text, 
sequence and orc file formats using HCatalog API (Sushanth Sowmyan via Thejas 
Nair)


Changes for Build #2488
[hashutosh] HIVE-5935 : hive.query.string is not provided to FetchTask (Navis 
via Ashutosh Chauhan)

[navis] HIVE-3455 : ANSI CORR(X,Y) is incorrect (Maxim Bolotin via Navis)

[hashutosh] HIVE-5921 : Better heuristics for worst case statistics estimates 
for join, limit and filter operator (Prasanth J via Harish Butani)

[rhbutani] HIVE-5899 NPE during explain extended with char/varchar columns 
(Jason Dere via Harish Butani)


Changes for Build #2489
[xuefu] HIVE-3181: getDatabaseMajor/Minor version does not return values 
(Szehon via Xuefu, reviewed by Navis)

[brock] HIVE-5641 - BeeLineOpts ignores Throwable (Brock Noland reviewed by 
Prasad and Thejas)

[hashutosh] HIVE-5909 : locate and instr throw 
java.nio.BufferUnderflowException when empty string as substring (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-5686 : partition column type validation doesn't quite work for 
dates (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5887 : metastore direct sql doesn't work with oracle (Sergey 
Shelukhin via Ashutosh Chauhan)


Changes for Build #2490

Changes for Build #2491

Changes for Build #2492
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #2493
[xuefu] HIVE-5872: 

Hive-trunk-hadoop2 - Build # 607 - Still Failing

2013-12-17 Thread Apache Jenkins Server
Changes for Build #571
[navis] HIVE-4518 : Should be removed files (OptrStatsGroupByHook, etc.)

[navis] HIVE-5839 : BytesRefArrayWritable compareTo violates contract (Xuefu 
Zhang via Navis)

[navis] HIVE-4518 : Missing file (HiveFatalException)

[navis] HIVE-4518 : Counter Strike: Operation Operator (Gunther Hagleitner and 
Jason Dere via Navis)


Changes for Build #572
[brock] HIVE-4741 - Add Hive config API to modify the restrict list (Prasad 
Mujumdar, Navis via Brock Noland)


Changes for Build #573
[navis] HIVE-5827 : Incorrect location of logs for failed tests (Vikram Dixit K 
and Szehon Ho via Navis)

[thejas] HIVE-4485 : beeline prints null as empty strings (Thejas Nair reviewed 
by Ashutosh Chauhan)

[brock] HIVE-5704 - A couple of generic UDFs are not in the right 
folder/package (Xuefu Zhang via Brock Noland)

[brock] HIVE-5706 - Move a few numeric UDFs to generic implementations (Xuefu 
Zhang via Brock Noland)

[hashutosh] HIVE-5817 : column name to index mapping in VectorizationContext is 
broken (Remus Rusanu, Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5876 : Split elimination in ORC breaks for partitioned tables 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5886 : [Refactor] Remove unused class JobCloseFeedback 
(Ashutosh Chauhan via Thejas Nair)

[brock] HIVE-5894 - Fix minor PTest2 issues (Brock Noland)


Changes for Build #574
[brock] HIVE-5755 - Fix hadoop2 execution environment Milestone 1 (Vikram Dixit 
K via Brock Noland)


Changes for Build #575
[xuefu] HIVE-5893: hive-schema-0.13.0.mysql.sql contains reference to 
nonexistent column (Carl via Xuefu)

[xuefu] HIVE-5684: Serde support for char (Jason via Xuefu)


Changes for Build #576

Changes for Build #577

Changes for Build #578

Changes for Build #579
[brock] HIVE-5441 - Async query execution doesn't return resultset status 
(Prasad Mujumdar via Thejas M Nair)

[brock] HIVE-5880 - Rename HCatalog HBase Storage Handler artifact id (Brock 
Noland reviewed by Prasad Mujumdar)


Changes for Build #580
[ehans] HIVE-5581: Implement vectorized year/month/day... etc. for string 
arguments (Teddy Choi via Eric Hanson)


Changes for Build #581
[rhbutani] HIVE-5898 Make fetching of column statistics configurable (Prasanth 
Jayachandran via Harish Butani)


Changes for Build #582
[brock] HIVE-5880 - (Rename HCatalog HBase Storage Handler artifact id) breaks 
packaging (Xuefu Zhang via Brock Noland)


Changes for Build #583
[xuefu] HIVE-5866: Hive divide operator generates wrong results in certain 
cases (reviewed by Prasad)

[ehans] HIVE-5877: Implement vectorized support for IN as boolean-valued 
expression (Eric Hanson)


Changes for Build #584
[thejas] HIVE-5550 : Import fails for tables created with default text, 
sequence and orc file formats using HCatalog API (Sushanth Sowmyan via Thejas 
Nair)

[ehans] HIVE-5895: vectorization handles division by zero differently from 
normal execution (Sergey Shelukhin via Eric Hanson)

[hashutosh] HIVE-5938 : Remove apache.mina dependency for test (Navis via 
Ashutosh Chauhan)

[xuefu] HIVE-5912: Show partition command doesn't support db.table (Yu Zhao via 
Xuefu)

[brock] HIVE-5906 - TestGenericUDFPower should use delta to compare doubles 
(Szehon Ho via Brock Noland)

[brock] HIVE-5855 - Add deprecated methods back to ColumnProjectionUtils (Brock 
Noland reviewed by Navis)

[brock] HIVE-5915 - Shade Kryo dependency (Brock Noland reviewed by Ashutosh 
Chauhan)


Changes for Build #585
[hashutosh] HIVE-5916 : No need to aggregate statistics collected via counter 
mechanism (Ashutosh Chauhan via Navis)

[xuefu] HIVE-5947: Fix test failure in decimal_udf.q (reviewed by Brock)


Changes for Build #586
[hashutosh] HIVE-5935 : hive.query.string is not provided to FetchTask (Navis 
via Ashutosh Chauhan)

[navis] HIVE-3455 : ANSI CORR(X,Y) is incorrect (Maxim Bolotin via Navis)

[hashutosh] HIVE-5921 : Better heuristics for worst case statistics estimates 
for join, limit and filter operator (Prasanth J via Harish Butani)

[rhbutani] HIVE-5899 NPE during explain extended with char/varchar columns 
(Jason Dere via Harish Butani)


Changes for Build #587
[xuefu] HIVE-3181: getDatabaseMajor/Minor version does not return values 
(Szehon via Xuefu, reviewed by Navis)

[brock] HIVE-5641 - BeeLineOpts ignores Throwable (Brock Noland reviewed by 
Prasad and Thejas)

[hashutosh] HIVE-5909 : locate and instr throw 
java.nio.BufferUnderflowException when empty string as substring (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-5686 : partition column type validation doesn't quite work for 
dates (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5887 : metastore direct sql doesn't work with oracle (Sergey 
Shelukhin via Ashutosh Chauhan)


Changes for Build #588

Changes for Build #589

Changes for Build #590
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #591
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report 

[jira] [Commented] (HIVE-6029) Add default authorization on database/table creation

2013-12-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850685#comment-13850685
 ] 

Hive QA commented on HIVE-6029:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619117/HIVE-6029.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4789 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.security.TestClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/668/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/668/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619117

 Add default authorization on database/table creation
 

 Key: HIVE-6029
 URL: https://issues.apache.org/jira/browse/HIVE-6029
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Metastore
Affects Versions: 0.10.0
Reporter: Chris Drome
Assignee: Chris Drome
Priority: Minor
 Attachments: HIVE-6029-1.patch.txt, HIVE-6029.2.patch


 Default authorization privileges are not set when a database/table is 
 created. This allows a user to create a database/table and not be able to 
 access it through Sentry.





[jira] [Updated] (HIVE-5380) Non-default OI constructors should be supported for backwards compatibility

2013-12-17 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5380:
---

Summary: Non-default OI constructors should be supported for backwards 
compatibility  (was: Non-default OI constructors should be supported if for 
backwards compatibility)

 Non-default OI constructors should be supported for backwards compatibility
 ---

 Key: HIVE-5380
 URL: https://issues.apache.org/jira/browse/HIVE-5380
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-5380.patch, HIVE-5380.patch


 In HIVE-5263 we started serializing OI's when cloning the plan. This was a 
 great boost in speed for many queries. In the future we'd like to stop 
 copying the OI's, perhaps in HIVE-4396.
 Until then custom serdes will not work on trunk. This is a fix to allow 
 custom serdes such as the Hive JSON Serde to work until we address the fact 
 that we don't want to have to copy the OI's.





[jira] [Commented] (HIVE-5380) Non-default OI constructors should be supported for backwards compatibility

2013-12-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850731#comment-13850731
 ] 

Xuefu Zhang commented on HIVE-5380:
---

+1

 Non-default OI constructors should be supported for backwards compatibility
 ---

 Key: HIVE-5380
 URL: https://issues.apache.org/jira/browse/HIVE-5380
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-5380.patch, HIVE-5380.patch


 In HIVE-5263 we started serializing OI's when cloning the plan. This was a 
 great boost in speed for many queries. In the future we'd like to stop 
 copying the OI's, perhaps in HIVE-4396.
 Until then custom serdes will not work on trunk. This is a fix to allow 
 custom serdes such as the Hive JSON Serde to work until we address the fact 
 that we don't want to have to copy the OI's.





[jira] [Commented] (HIVE-6021) Problem in GroupByOperator for handling distinct aggregations

2013-12-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850733#comment-13850733
 ] 

Xuefu Zhang commented on HIVE-6021:
---

+1, patch looks good to me.

 Problem in GroupByOperator for handling distinct aggregations
 

 Key: HIVE-6021
 URL: https://issues.apache.org/jira/browse/HIVE-6021
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Sun Rui
Assignee: Sun Rui
 Attachments: HIVE-6021.1.patch, HIVE-6021.2.patch


 Use the following test case with HIVE 0.12:
 {code:sql}
 create table src(key int, value string);
 load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
 set hive.map.aggr=false; 
 select count(key),count(distinct value) from src group by key;
 {code}
 We will get an ArrayIndexOutOfBoundsException from GroupByOperator:
 {code}
 java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 5 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:159)
   ... 10 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:281)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:152)
   ... 10 more
 {code}
 explain select count(key),count(distinct value) from src group by key;
 {code}
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
    Alias -> Map Operator Tree:
 src 
   TableScan
 alias: src
 Select Operator
   expressions:
 expr: key
 type: int
 expr: value
 type: string
   outputColumnNames: key, value
   Reduce Output Operator
 key expressions:
   expr: key
   type: int
   expr: value
   type: string
 sort order: ++
 Map-reduce partition columns:
   expr: key
   type: int
 tag: -1
   Reduce Operator Tree:
 Group By Operator
   aggregations:
 expr: count(KEY._col0)   // The parameter causes this problem
^^^
 expr: count(DISTINCT KEY._col1:0._col0)
   bucketGroup: false
   keys:
 expr: KEY._col0
 type: int
   mode: complete
   outputColumnNames: _col0, _col1, _col2
   Select Operator
 expressions:
   expr: _col1
   type: bigint
   expr: _col2
   type: bigint
 outputColumnNames: _col0, _col1
 File Output Operator
   compressed: false
   GlobalTableId: 0
   table:
   input format: org.apache.hadoop.mapred.TextInputFormat
   output format: 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
   Stage: Stage-0
 Fetch Operator
   limit: -1
 {code}
 The root cause is within GroupByOperator.initializeOp(). The method forgets 
 to handle this case: for a query that has distinct aggregations, an 
 aggregation function may have a parameter which is a group-by key column but 
 not a distinct key column.
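 Before the excerpt below, a minimal illustration of the two parameter shapes 
 involved (illustration only, not Hive source; the names come from the plan 
 above):
 {code:java}
 // Illustration only (not Hive source): the two parameter shapes that
 // GroupByOperator.initializeOp() must handle when unionExprEval != null.
 String distinctParam = "KEY._col1:0._col0"; // distinct key column, KEY.colx:t.coly form
 String groupByParam  = "KEY._col0";         // plain group-by key column
 String[] tagged   = distinctParam.split("\\.")[1].split(":"); // ["_col1", "0"]
 String[] untagged = groupByParam.split("\\.")[1].split(":");  // ["_col0"] -- no tag
 // Code that assumes the tagged form and reads element 1 of the second array
 // throws ArrayIndexOutOfBoundsException: 1, matching the stack trace above.
 {code}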
 {code}
 if (unionExprEval != null) {
   String[] names = parameters.get(j).getExprString().split("\\.");
   // parameters of the form : KEY.colx:t.coly
   if 

[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)

2013-12-17 Thread Kostiantyn Kudriavtsev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850867#comment-13850867
 ] 

Kostiantyn Kudriavtsev commented on HIVE-3454:
--

Hi there, when is this patch going to be applied to trunk? It seems an 
important enough issue to be included.

 Problem with CAST(BIGINT as TIMESTAMP)
 --

 Key: HIVE-3454
 URL: https://issues.apache.org/jira/browse/HIVE-3454
 Project: Hive
  Issue Type: Bug
  Components: Types, UDF
Affects Versions: 0.8.0, 0.8.1, 0.9.0
Reporter: Ryan Harris
  Labels: newbie, newdev, patch
 Attachments: HIVE-3454.1.patch.txt, HIVE-3454.patch


 Ran into an issue while working with timestamp conversion.
 CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current 
 time from the BIGINT returned by unix_timestamp().
 Instead, however, a 1970-01-16 timestamp is returned.
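 The 1970-01-16 result is consistent with a BIGINT of epoch seconds being read 
 as epoch milliseconds (a hedged sketch of the arithmetic, not Hive source):
 {code:java}
 // Hedged sketch: epoch seconds misread as epoch milliseconds.
 long unixSeconds = 1347500000L;                                   // unix_timestamp() circa Sep 2012
 System.out.println(new java.sql.Timestamp(unixSeconds));          // ~1970-01-16 (seconds taken as ms)
 System.out.println(new java.sql.Timestamp(unixSeconds * 1000L));  // ~2012-09-13 (intended)
 {code}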





[jira] [Commented] (HIVE-4887) hive should have an option to disable non sql commands that impose security risk

2013-12-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850866#comment-13850866
 ] 

Thejas M Nair commented on HIVE-4887:
-

[~brocknoland] thanks for the suggestions. That makes sense.
Along with 'add jar' privilege for URI, another complementary approach would 
be to support a concept of permanent (blessed) UDFs that an admin can add and 
that would be pre-registered for all users.


 hive should have an option to disable non sql commands that impose security 
 risk
 

 Key: HIVE-4887
 URL: https://issues.apache.org/jira/browse/HIVE-4887
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, Security
Reporter: Thejas M Nair
   Original Estimate: 72h
  Remaining Estimate: 72h

 Hive's RDBMS style of authorization (using grant/revoke) relies on all data 
 access being done through hive select queries. But hive also supports running 
 dfs commands, shell commands (e.g. !cat file), and shell commands through 
 hive streaming.
 This creates problems in securing a hive server using this authorization 
 model. UDF is another way to write custom code that can compromise security, 
 but you can control that by restricting access to users to be only through 
 jdbc connection to hive server (2).
 (note that there are other major problems such as this one - HIVE-3271)





[jira] [Updated] (HIVE-6044) webhcat should be able to return detailed serde information when show table using format=extended

2013-12-17 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6044:


Status: Patch Available  (was: Open)

 webhcat should be able to return detailed serde information when show table 
 using format=extended
 ---

 Key: HIVE-6044
 URL: https://issues.apache.org/jira/browse/HIVE-6044
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-6044.1.patch


 Now in webhcat, when using GET ddl/database/:db/table/:table and 
 format=extended, the return value is based on the query 'show table extended 
 like'. However, this query doesn't contain serde info like line.delim and 
 field.delim. In this case, the user won't have enough information to 
 reconstruct the exact same table based on the information from the json file. 
 The descExtendedTable function in HcatDelegator should also return extra 
 fields from the query 'desc extended tablename', which contains the fields 
 sd, retention, parameters, parametersSize and tableType.





[jira] [Created] (HIVE-6047) Permanent UDFs in Hive

2013-12-17 Thread Jason Dere (JIRA)
Jason Dere created HIVE-6047:


 Summary: Permanent UDFs in Hive
 Key: HIVE-6047
 URL: https://issues.apache.org/jira/browse/HIVE-6047
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere


Currently Hive only supports temporary UDFs which must be re-registered when 
starting up a Hive session. Provide some support to register permanent UDFs 
with Hive. 





[jira] [Commented] (HIVE-6046) add UDF for converting date time from one presentation to another

2013-12-17 Thread Kostiantyn Kudriavtsev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850877#comment-13850877
 ] 

Kostiantyn Kudriavtsev commented on HIVE-6046:
--

Just started working on that; your comments are welcome. 

 add  UDF for converting date time from one presentation to another
 --

 Key: HIVE-6046
 URL: https://issues.apache.org/jira/browse/HIVE-6046
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.13.0
Reporter: Kostiantyn Kudriavtsev
Priority: Minor

 it'd be nice to have a function for converting datetime to different formats, 
 for example:
 format_date('2013-12-12 00:00:00.0', 'yyyy-MM-dd HH:mm:ss.S', 'yyyy/MM/dd')
 There are two signatures to facilitate further use:
 format_date(datetime, fromFormat, toFormat)
 format_date(timestamp, toFormat)
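 A minimal sketch of the proposed behavior (implementation assumed on top of 
 java.text.SimpleDateFormat; only the signatures and the example come from the 
 proposal):
 {code:java}
 import java.text.ParseException;
 import java.text.SimpleDateFormat;
 import java.util.Date;

 public class FormatDateSketch {
     // Assumed shape of format_date(datetime, fromFormat, toFormat).
     public static String formatDate(String datetime, String fromFormat, String toFormat)
             throws ParseException {
         Date parsed = new SimpleDateFormat(fromFormat).parse(datetime);
         return new SimpleDateFormat(toFormat).format(parsed);
     }

     public static void main(String[] args) throws ParseException {
         // The example from the proposal; prints 2013/12/12.
         System.out.println(formatDate("2013-12-12 00:00:00.0",
                 "yyyy-MM-dd HH:mm:ss.S", "yyyy/MM/dd"));
     }
 }
 {code}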
  





[jira] [Updated] (HIVE-5936) analyze command failing to collect stats with counter mechanism

2013-12-17 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-5936:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 analyze command failing to collect stats with counter mechanism
 ---

 Key: HIVE-5936
 URL: https://issues.apache.org/jira/browse/HIVE-5936
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Navis
 Fix For: 0.13.0

 Attachments: HIVE-5936.1.patch.txt, HIVE-5936.10.patch.txt, 
 HIVE-5936.11.patch.txt, HIVE-5936.2.patch.txt, HIVE-5936.3.patch.txt, 
 HIVE-5936.4.patch.txt, HIVE-5936.5.patch.txt, HIVE-5936.6.patch.txt, 
 HIVE-5936.7.patch.txt, HIVE-5936.8.patch.txt, HIVE-5936.9.patch.txt


 With counter mechanism, MR job is successful, but StatsTask on client fails 
 with NPE.





[jira] [Commented] (HIVE-6006) Add UDF to calculate distance between geographic coordinates

2013-12-17 Thread Kostiantyn Kudriavtsev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850880#comment-13850880
 ] 

Kostiantyn Kudriavtsev commented on HIVE-6006:
--

Patch is available. Could somebody please put the patch on ReviewBoard? That 
would make it easier for interested people to look at.

 Add UDF to calculate distance between geographic coordinates
 

 Key: HIVE-6006
 URL: https://issues.apache.org/jira/browse/HIVE-6006
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.13.0
Reporter: Kostiantyn Kudriavtsev
Priority: Minor
 Fix For: 0.13.0

 Attachments: hive-6006.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 It would be nice to have a Hive UDF to calculate the distance between two 
 points on Earth. The Haversine formula seems good enough for this purpose.
 The following function is proposed:
 HaversineDistance(lat1, lon1, lat2, lon2) - calculate the Haversine distance 
 between 2 points with coordinates (lat1, lon1) and (lat2, lon2)
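 The core of the computation could look like this (a minimal sketch; only the 
 argument order comes from the proposal, the radius and units are assumptions):
 {code:java}
 // Hedged sketch of the proposed UDF's math: great-circle distance in km.
 public static double haversineDistanceKm(double lat1, double lon1,
                                          double lat2, double lon2) {
     final double R = 6371.0; // mean Earth radius in km (assumed)
     double dLat = Math.toRadians(lat2 - lat1);
     double dLon = Math.toRadians(lon2 - lon1);
     double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
              + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
              * Math.sin(dLon / 2) * Math.sin(dLon / 2);
     return 2 * R * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
 }
 {code}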





[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2013-12-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850885#comment-13850885
 ] 

Thejas M Nair commented on HIVE-5837:
-

[~brocknoland] I assume you mean URI and SERVER as objects (similar to tables, 
views etc.) on which privileges (e.g. select, insert, ..) can be granted. As 
you know, URI authorization is very essential (more than just helping with udf 
support); without it you cannot enforce access control (you can use 'create 
table' to read from any hdfs location). 
I see that the SERVER object will also be useful, but not essential for a 
first version. Should we make one of the SQL standard privileges available on 
the SERVER object?


 SQL standard based secure authorization for hive
 

 Key: HIVE-5837
 URL: https://issues.apache.org/jira/browse/HIVE-5837
 Project: Hive
  Issue Type: New Feature
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: SQL standard authorization hive.pdf


 The current default authorization is incomplete and not secure. The 
 alternative of storage based authorization provides security but does not 
 provide fine grained authorization.
 The proposal is to support secure fine grained authorization in hive using 
 SQL standard based authorization model.





[jira] [Updated] (HIVE-6006) Add UDF to calculate distance between geographic coordinates

2013-12-17 Thread Kostiantyn Kudriavtsev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostiantyn Kudriavtsev updated HIVE-6006:
-

Attachment: hive-6006.patch

 Add UDF to calculate distance between geographic coordinates
 

 Key: HIVE-6006
 URL: https://issues.apache.org/jira/browse/HIVE-6006
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.13.0
Reporter: Kostiantyn Kudriavtsev
Priority: Minor
 Fix For: 0.13.0

 Attachments: hive-6006.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 It would be nice to have a Hive UDF to calculate the distance between two 
 points on Earth. The Haversine formula seems good enough for this purpose.
 The following function is proposed:
 HaversineDistance(lat1, lon1, lat2, lon2) - calculate the Haversine distance 
 between 2 points with coordinates (lat1, lon1) and (lat2, lon2)





[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2013-12-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850887#comment-13850887
 ] 

Thejas M Nair commented on HIVE-5837:
-

[~brocknoland] Thanks for your feedback in the jiras for SQL standard auth!


 SQL standard based secure authorization for hive
 

 Key: HIVE-5837
 URL: https://issues.apache.org/jira/browse/HIVE-5837
 Project: Hive
  Issue Type: New Feature
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: SQL standard authorization hive.pdf


 The current default authorization is incomplete and not secure. The 
 alternative of storage based authorization provides security but does not 
 provide fine grained authorization.
 The proposal is to support secure fine grained authorization in hive using 
 SQL standard based authorization model.





[jira] [Updated] (HIVE-6006) Add UDF to calculate distance between geographic coordinates

2013-12-17 Thread Kostiantyn Kudriavtsev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostiantyn Kudriavtsev updated HIVE-6006:
-

Status: Patch Available  (was: Open)

hive-6006.patch has been attached

 Add UDF to calculate distance between geographic coordinates
 

 Key: HIVE-6006
 URL: https://issues.apache.org/jira/browse/HIVE-6006
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.13.0
Reporter: Kostiantyn Kudriavtsev
Priority: Minor
 Fix For: 0.13.0

 Attachments: hive-6006.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 It would be nice to have a Hive UDF to calculate the distance between two 
 points on Earth. The Haversine formula seems good enough for this purpose.
 The following function is proposed:
 HaversineDistance(lat1, lon1, lat2, lon2) - calculate the Haversine distance 
 between 2 points with coordinates (lat1, lon1) and (lat2, lon2)





[jira] [Created] (HIVE-6046) add UDF for converting date time from one presentation to another

2013-12-17 Thread Kostiantyn Kudriavtsev (JIRA)
Kostiantyn Kudriavtsev created HIVE-6046:


 Summary: add  UDF for converting date time from one presentation 
to another
 Key: HIVE-6046
 URL: https://issues.apache.org/jira/browse/HIVE-6046
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.13.0
Reporter: Kostiantyn Kudriavtsev
Priority: Minor


it'd be nice to have a function for converting datetime to different formats, 
for example:
format_date('2013-12-12 00:00:00.0', 'yyyy-MM-dd HH:mm:ss.S', 'yyyy/MM/dd')
There are two signatures to facilitate further use:
format_date(datetime, fromFormat, toFormat)
format_date(timestamp, toFormat)
 





[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2013-12-17 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850915#comment-13850915
 ] 

Alan Gates commented on HIVE-5837:
--

Brock, could you give more details on the SERVER use case? I've seen people 
use multiple instances of HS2 for HA/scaling, but never with some users 
allocated to some instances and others to other instances. What's the 
motivation for that?

 SQL standard based secure authorization for hive
 

 Key: HIVE-5837
 URL: https://issues.apache.org/jira/browse/HIVE-5837
 Project: Hive
  Issue Type: New Feature
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: SQL standard authorization hive.pdf


 The current default authorization is incomplete and not secure. The 
 alternative of storage based authorization provides security but does not 
 provide fine grained authorization.
 The proposal is to support secure fine grained authorization in hive using 
 SQL standard based authorization model.





Hive-trunk-h0.21 - Build # 2509 - Still Failing

2013-12-17 Thread Apache Jenkins Server
Changes for Build #2473
[brock] HIVE-4741 - Add Hive config API to modify the restrict list (Prasad 
Mujumdar, Navis via Brock Noland)


Changes for Build #2474
[navis] HIVE-5827 : Incorrect location of logs for failed tests (Vikram Dixit K 
and Szehon Ho via Navis)

[thejas] HIVE-4485 : beeline prints null as empty strings (Thejas Nair reviewed 
by Ashutosh Chauhan)

[brock] HIVE-5704 - A couple of generic UDFs are not in the right 
folder/package (Xuefu Zhang via Brock Noland)

[brock] HIVE-5706 - Move a few numeric UDFs to generic implementations (Xuefu 
Zhang via Brock Noland)

[hashutosh] HIVE-5817 : column name to index mapping in VectorizationContext is 
broken (Remus Rusanu, Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5876 : Split elimination in ORC breaks for partitioned tables 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5886 : [Refactor] Remove unused class JobCloseFeedback 
(Ashutosh Chauhan via Thejas Nair)

[brock] HIVE-5894 - Fix minor PTest2 issues (Brock Noland)


Changes for Build #2475
[brock] HIVE-5755 - Fix hadoop2 execution environment Milestone 1 (Vikram Dixit 
K via Brock Noland)


Changes for Build #2476
[xuefu] HIVE-5893: hive-schema-0.13.0.mysql.sql contains reference to 
nonexistent column (Carl via Xuefu)

[xuefu] HIVE-5684: Serde support for char (Jason via Xuefu)


Changes for Build #2477

Changes for Build #2478

Changes for Build #2479

Changes for Build #2480
[brock] HIVE-5441 - Async query execution doesn't return resultset status 
(Prasad Mujumdar via Thejas M Nair)

[brock] HIVE-5880 - Rename HCatalog HBase Storage Handler artifact id (Brock 
Noland reviewed by Prasad Mujumdar)


Changes for Build #2481

Changes for Build #2482
[ehans] HIVE-5581: Implement vectorized year/month/day... etc. for string 
arguments (Teddy Choi via Eric Hanson)


Changes for Build #2483
[rhbutani] HIVE-5898 Make fetching of column statistics configurable (Prasanth 
Jayachandran via Harish Butani)


Changes for Build #2484
[brock] HIVE-5880 - (Rename HCatalog HBase Storage Handler artifact id) breaks 
packaging (Xuefu Zhang via Brock Noland)


Changes for Build #2485
[xuefu] HIVE-5866: Hive divide operator generates wrong results in certain 
cases (reviewed by Prasad)

[ehans] HIVE-5877: Implement vectorized support for IN as boolean-valued 
expression (Eric Hanson)


Changes for Build #2486
[ehans] HIVE-5895: vectorization handles division by zero differently from 
normal execution (Sergey Shelukhin via Eric Hanson)

[hashutosh] HIVE-5938 : Remove apache.mina dependency for test (Navis via 
Ashutosh Chauhan)

[xuefu] HIVE-5912: Show partition command doesn't support db.table (Yu Zhao via 
Xuefu)

[brock] HIVE-5906 - TestGenericUDFPower should use delta to compare doubles 
(Szehon Ho via Brock Noland)

[brock] HIVE-5855 - Add deprecated methods back to ColumnProjectionUtils (Brock 
Noland reviewed by Navis)

[brock] HIVE-5915 - Shade Kryo dependency (Brock Noland reviewed by Ashutosh 
Chauhan)


Changes for Build #2487
[hashutosh] HIVE-5916 : No need to aggregate statistics collected via counter 
mechanism (Ashutosh Chauhan via Navis)

[xuefu] HIVE-5947: Fix test failure in decimal_udf.q (reviewed by Brock)

[thejas] HIVE-5550 : Import fails for tables created with default text, 
sequence and orc file formats using HCatalog API (Sushanth Sowmyan via Thejas 
Nair)


Changes for Build #2488
[hashutosh] HIVE-5935 : hive.query.string is not provided to FetchTask (Navis 
via Ashutosh Chauhan)

[navis] HIVE-3455 : ANSI CORR(X,Y) is incorrect (Maxim Bolotin via Navis)

[hashutosh] HIVE-5921 : Better heuristics for worst case statistics estimates 
for join, limit and filter operator (Prasanth J via Harish Butani)

[rhbutani] HIVE-5899 NPE during explain extended with char/varchar columns 
(Jason Dere via Harish Butani)


Changes for Build #2489
[xuefu] HIVE-3181: getDatabaseMajor/Minor version does not return values 
(Szehon via Xuefu, reviewed by Navis)

[brock] HIVE-5641 - BeeLineOpts ignores Throwable (Brock Noland reviewed by 
Prasad and Thejas)

[hashutosh] HIVE-5909 : locate and instr throw 
java.nio.BufferUnderflowException when empty string as substring (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-5686 : partition column type validation doesn't quite work for 
dates (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5887 : metastore direct sql doesn't work with oracle (Sergey 
Shelukhin via Ashutosh Chauhan)


Changes for Build #2490

Changes for Build #2491

Changes for Build #2492
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #2493
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[jira] [Commented] (HIVE-6006) Add UDF to calculate distance between geographic coordinates

2013-12-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850928#comment-13850928
 ] 

Hive QA commented on HIVE-6006:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619157/hive-6006.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4792 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_functions
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/669/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/669/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619157

 Add UDF to calculate distance between geographic coordinates
 

 Key: HIVE-6006
 URL: https://issues.apache.org/jira/browse/HIVE-6006
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.13.0
Reporter: Kostiantyn Kudriavtsev
Priority: Minor
 Fix For: 0.13.0

 Attachments: hive-6006.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 It would be nice to have a Hive UDF to calculate the distance between two 
 points on Earth. The Haversine formula seems good enough for this purpose.
 The following function is proposed:
 HaversineDistance(lat1, lon1, lat2, lon2) - calculate the Haversine distance 
 between 2 points with coordinates (lat1, lon1) and (lat2, lon2)





Review Request 16328: HIVE-5992: Hive inconsistently converts timestamp in AVG and SUM UDAF's

2013-12-17 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16328/
---

Review request for hive and Prasad Mujumdar.


Bugs: HIVE-5992
https://issues.apache.org/jira/browse/HIVE-5992


Repository: hive-git


Description
---

The fix is to make the two UDAFs convert timestamp to double in terms of 
seconds and the fraction of a second.
A test is added to cover the case.
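
A hedged sketch of that conversion (the mechanics are assumed, not taken from 
the patch):

{code:java}
// Assumed mechanics: whole epoch seconds plus the fractional second, as a double.
// Integer division of getTime() assumes post-1970 timestamps.
java.sql.Timestamp ts = java.sql.Timestamp.valueOf("2013-12-17 10:15:30.25");
double asDouble = ts.getTime() / 1000 + ts.getNanos() / 1e9; // ...30.25 seconds
{code}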


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java 41d5efd 
  ql/src/test/queries/clientpositive/timestamp_3.q e5a4345 
  ql/src/test/results/clientpositive/timestamp_3.q.out 8544307 

Diff: https://reviews.apache.org/r/16328/diff/


Testing
---

Unit test. New unit test. Regression suite.


Thanks,

Xuefu Zhang



[jira] [Commented] (HIVE-6044) webhcat should be able to return detailed serde information when show table using format=extended

2013-12-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850932#comment-13850932
 ] 

Hive QA commented on HIVE-6044:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619140/HIVE-6044.1.patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/670/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/670/console

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] 
[INFO] 
[INFO] Building Hive HCatalog Server Extensions 0.13.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ 
hive-hcatalog-server-extensions ---
[INFO] Deleting 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ 
hive-hcatalog-server-extensions ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/src/main/resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ 
hive-hcatalog-server-extensions ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
hive-hcatalog-server-extensions ---
[INFO] Compiling 38 source files to 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/target/classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[WARNING] Note: Some input files use unchecked or unsafe operations.
[WARNING] Note: Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ 
hive-hcatalog-server-extensions ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/src/test/resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ 
hive-hcatalog-server-extensions ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/target/tmp/conf
 [copy] Copying 4 files to 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-hcatalog-server-extensions ---
[INFO] Compiling 4 source files to 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/target/test-classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ 
hive-hcatalog-server-extensions ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ 
hive-hcatalog-server-extensions ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/target/hive-hcatalog-server-extensions-0.13.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ 
hive-hcatalog-server-extensions ---
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/target/hive-hcatalog-server-extensions-0.13.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hcatalog/hive-hcatalog-server-extensions/0.13.0-SNAPSHOT/hive-hcatalog-server-extensions-0.13.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/server-extensions/pom.xml
 to 
/data/hive-ptest/working/maven/org/apache/hive/hcatalog/hive-hcatalog-server-extensions/0.13.0-SNAPSHOT/hive-hcatalog-server-extensions-0.13.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive HCatalog Webhcat Java Client 0.13.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- 

[jira] [Commented] (HIVE-6047) Permanent UDFs in Hive

2013-12-17 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850921#comment-13850921
 ] 

Eric Hanson commented on HIVE-6047:
---

Vectorized execution works with temporary UDFs through an adaptor. If you could 
verify that permanent UDFs added by users also work in vectorized mode with 
that adaptor, that'd be great.

 Permanent UDFs in Hive
 --

 Key: HIVE-6047
 URL: https://issues.apache.org/jira/browse/HIVE-6047
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere

 Currently Hive only supports temporary UDFs which must be re-registered when 
 starting up a Hive session. Provide some support to register permanent UDFs 
 with Hive. 





[jira] [Commented] (HIVE-6043) Document incompatible changes in Hive 0.12 and trunk

2013-12-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850810#comment-13850810
 ] 

Sergey Shelukhin commented on HIVE-6043:


HIVE-4914? It does have backward compat

 Document incompatible changes in Hive 0.12 and trunk
 

 Key: HIVE-6043
 URL: https://issues.apache.org/jira/browse/HIVE-6043
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Priority: Blocker

 We need to document incompatible changes. For example
 * HIVE-5372 changed object inspector hierarchy breaking most if not all 
 custom serdes
 * HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo so all custom 
 serdes (fixed by HIVE-5380)
 * Hive 0.12 separates MapredWork into MapWork and ReduceWork which is used by 
 Serdes
 * HIVE-5411 serializes expressions with Kryo which are used by custom serdes
 * HIVE-4827 removed the flag of hive.optimize.mapjoin.mapreduce (This flag 
 was introduced in Hive 0.11 by HIVE-3952).





Review Request 16329: HIVE-6039: Round, AVG and SUM functions reject char/varchar input while accepting string input

2013-12-17 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16329/
---

Review request for hive and Prasad Mujumdar.


Bugs: HIVE-6039
https://issues.apache.org/jira/browse/HIVE-6039


Repository: hive-git


Description
---

Allow char and varchar input to these UDFs.
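
A hedged sketch of the kind of type check involved (not the actual patch, 
which is in the diffs below): treat char and varchar like string by testing 
the primitive grouping rather than requiring STRING exactly.

{code:java}
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector.PrimitiveCategory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.PrimitiveGrouping;

// STRING, CHAR and VARCHAR all fall in STRING_GROUP, so one check covers all three.
static boolean isStringFamily(PrimitiveCategory category) {
    return PrimitiveObjectInspectorUtils.getPrimitiveGrouping(category)
            == PrimitiveGrouping.STRING_GROUP;
}
{code}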


Diffs
-

  data/files/char_varchar_udf.txt PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java 
4b219bd 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java 41d5efd 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRound.java 
fc9c1b2 
  ql/src/test/queries/clientpositive/char_varchar_udf.q PRE-CREATION 
  ql/src/test/results/clientpositive/char_varchar_udf.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/16329/diff/


Testing
---

Unit tested. New test added. Test suite passed.


Thanks,

Xuefu Zhang



[jira] [Created] (HIVE-6044) webhcat should be able to return detailed serde information when show table using format=extended

2013-12-17 Thread Shuaishuai Nie (JIRA)
Shuaishuai Nie created HIVE-6044:


 Summary: webhcat should be able to return detailed serde 
information when show table using format=extended
 Key: HIVE-6044
 URL: https://issues.apache.org/jira/browse/HIVE-6044
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie


Now in webhcat, when using GET ddl/database/:db/table/:table and 
format=extended, the return value is based on the query 'show table extended 
like'. However, this query doesn't contain serde info like line.delim and 
field.delim. In this case, the user won't have enough information to 
reconstruct the exact same table based on the information from the json file. 
The descExtendedTable function in HcatDelegator should also return extra 
fields from the query 'desc extended tablename', which contains the fields 
sd, retention, parameters, parametersSize and tableType.





[jira] [Updated] (HIVE-6043) Document incompatible changes in Hive 0.12 and trunk

2013-12-17 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-6043:
---

Description: 
We need to document incompatible changes. For example

* HIVE-5372 changed object inspector hierarchy breaking most if not all custom 
serdes
* HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo so all custom 
serdes (fixed by HIVE-5380)
* Hive 0.12 separates MapredWork into MapWork and ReduceWork which is used by 
Serdes
* HIVE-5411 serializes expressions with Kryo which are used by custom serdes
* HIVE-4827 removed the flag of hive.optimize.mapjoin.mapreduce (This flag 
was introduced in Hive 0.11 by HIVE-3952).


  was:
We need to document incompatible changes. For example

* HIVE-5372 changed object inspector hierarchy breaking most if not all custom 
serdes
* HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo so all custom 
serdes (fixed by HIVE-5380)
* Hive 0.12 separates MapredWork into MapWork and ReduceWork which is used by 
Serdes
* HIVE-5411 serializes expressions with Kryo which are used by custom serdes



 Document incompatible changes in Hive 0.12 and trunk
 

 Key: HIVE-6043
 URL: https://issues.apache.org/jira/browse/HIVE-6043
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Priority: Blocker

 We need to document incompatible changes. For example
 * HIVE-5372 changed object inspector hierarchy breaking most if not all 
 custom serdes
 * HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo so all custom 
 serdes (fixed by HIVE-5380)
 * Hive 0.12 separates MapredWork into MapWork and ReduceWork which is used by 
 Serdes
 * HIVE-5411 serializes expressions with Kryo which are used by custom serdes
 * HIVE-4827 removed the flag of hive.optimize.mapjoin.mapreduce (This flag 
 was introduced in Hive 0.11 by HIVE-3952).





[jira] [Commented] (HIVE-6043) Document incompatible changes in Hive 0.12 and trunk

2013-12-17 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850759#comment-13850759
 ] 

Yin Huai commented on HIVE-6043:


I added HIVE-4827, which removed the flag of hive.optimize.mapjoin.mapreduce.

 Document incompatible changes in Hive 0.12 and trunk
 

 Key: HIVE-6043
 URL: https://issues.apache.org/jira/browse/HIVE-6043
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Priority: Blocker

 We need to document incompatible changes. For example
 * HIVE-5372 changed object inspector hierarchy breaking most if not all 
 custom serdes
 * HIVE-1511/HIVE-5263 serializes ObjectInspectors with Kryo so all custom 
 serdes (fixed by HIVE-5380)
 * Hive 0.12 separates MapredWork into MapWork and ReduceWork which is used by 
 Serdes
 * HIVE-5411 serializes expressions with Kryo which are used by custom serdes
 * HIVE-4827 removed the flag of hive.optimize.mapjoin.mapreduce (This flag 
 was introduced in Hive 0.11 by HIVE-3952).





[jira] [Updated] (HIVE-6044) webhcat should be able to return detailed serde information when show table using format=extended

2013-12-17 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-6044:
-

Attachment: HIVE-6044.1.patch

 webhcat should be able to return detailed serde information when show table 
 using format=extended
 ---

 Key: HIVE-6044
 URL: https://issues.apache.org/jira/browse/HIVE-6044
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-6044.1.patch


 Now in webhcat, when using GET ddl/database/:db/table/:table and 
 format=extended, the return value is based on the query 'show table extended 
 like'. However, this query doesn't contain serde info like line.delim and 
 field.delim. In this case, the user won't have enough information to 
 reconstruct the exact same table based on the information from the json file. 
 The descExtendedTable function in HcatDelegator should also return extra 
 fields from the query 'desc extended tablename', which contains the fields 
 sd, retention, parameters, parametersSize and tableType.





Review Request 16330: HIVE-6045- Beeline hivevars is broken for more than one hivevar

2013-12-17 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16330/
---

Review request for hive.


Bugs: HIVE-6045
https://issues.apache.org/jira/browse/HIVE-6045


Repository: hive-git


Description
---

The implementation appends hivevars to the jdbc url in the form 
var1=val1var2=val2$var3-val3, but the regex used to parse this expects the 
delimiter to be ';'. Changed the regex to fit the hivevar format.
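
A minimal sketch of the failure mode (illustration only, not the actual 
Utils.java code; the '&' joining delimiter is an assumption):

{code:java}
// If the pairs are joined with one delimiter but parsing splits on ';',
// everything after the first '=' is swallowed into a single fused value.
String segment = "file1=/user/szehon/file1&file2=/user/szehon/file2"; // '&' assumed
java.util.Map<String, String> vars = new java.util.HashMap<>();
for (String pair : segment.split(";")) { // wrong delimiter: only one "pair"
    int eq = pair.indexOf('=');
    if (eq > 0) {
        vars.put(pair.substring(0, eq), pair.substring(eq + 1));
    }
}
System.out.println(vars); // {file1=/user/szehon/file1&file2=/user/szehon/file2}
{code}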


Diffs
-

  jdbc/src/java/org/apache/hive/jdbc/Utils.java 913dc46 

Diff: https://reviews.apache.org/r/16330/diff/


Testing
---

Looks like TestBeelineWithArgs is no longer being run, and there are a lot of 
failures there due to other changes even without this change.  Probably we need 
to move that test, and see if we can add a unit test there for this case.


Thanks,

Szehon Ho



[jira] [Created] (HIVE-6045) Beeline hivevars is broken for more than one hivevar

2013-12-17 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-6045:
---

 Summary: Beeline hivevars is broken for more than one hivevar
 Key: HIVE-6045
 URL: https://issues.apache.org/jira/browse/HIVE-6045
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho


HIVE-4568 introduced the --hivevar flag. But if you specify more than one hivevar, 
for example 

{code}
beeline --hivevar file1=/user/szehon/file1 --hivevar file2=/user/szehon/file2
{code}

then the variables during runtime get mangled to evaluate to:

{code}
file1=/user/szehon/file1file2=/user/szehon/file2
{code}





[jira] [Created] (HIVE-6048) Hive load data command rejects file with '+' in the name

2013-12-17 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-6048:
-

 Summary: Hive load data command rejects file with '+' in the name
 Key: HIVE-6048
 URL: https://issues.apache.org/jira/browse/HIVE-6048
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


'+' is a valid character in a file name on Linux and HDFS. However, loading 
data from such a file into a table results in the following error:

{code}
hive> load data local inpath './t+est' into table test;
FAILED: SemanticException Line 1:23 Invalid path ''./t+est'': No files matching 
path file:/home/xzhang/apache/hive7/t%20est
{code}
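
The %20 in the rewritten path suggests the file name passes through URL 
decoding, where '+' denotes a space (a hedged guess at the mechanism, not a 
statement about the fix):

{code:java}
// Hedged illustration: URL decoding turns '+' into a space, and the space is
// then re-encoded as the %20 seen in the error message.
public class PlusInPathSketch {
    public static void main(String[] args) throws Exception {
        System.out.println(java.net.URLDecoder.decode("./t+est", "UTF-8")); // ./t est
    }
}
{code}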





[jira] [Updated] (HIVE-6045) Beeline hivevars is broken for more than one hivevar

2013-12-17 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6045:


Attachment: HIVE-6045.patch

Attaching a fix.

 Beeline hivevars is broken for more than one hivevar
 

 Key: HIVE-6045
 URL: https://issues.apache.org/jira/browse/HIVE-6045
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6045.patch


 HIVE-4568 introduced the --hivevar flag. But if you specify more than one 
 hivevar, for example 
 {code}
 beeline --hivevar file1=/user/szehon/file1 --hivevar file2=/user/szehon/file2
 {code}
 then the variables during runtime get mangled to evaluate to:
 {code}
 file1=/user/szehon/file1file2=/user/szehon/file2
 {code}





[jira] [Updated] (HIVE-6045) Beeline hivevars is broken for more than one hivevar

2013-12-17 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6045:


Status: Patch Available  (was: Open)

 Beeline hivevars is broken for more than one hivevar
 

 Key: HIVE-6045
 URL: https://issues.apache.org/jira/browse/HIVE-6045
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6045.patch


 HIVE-4568 introduced the --hivevar flag. But if you specify more than one 
 hivevar, for example 
 {code}
 beeline --hivevar file1=/user/szehon/file1 --hivevar file2=/user/szehon/file2
 {code}
 then the variables during runtime get mangled to evaluate to:
 {code}
 file1=/user/szehon/file1file2=/user/szehon/file2
 {code}





[jira] [Updated] (HIVE-6035) Windows: percentComplete returned by job status from WebHCat is null

2013-12-17 Thread shanyu zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shanyu zhao updated HIVE-6035:
--

Assignee: shanyu zhao
  Status: Patch Available  (was: Open)

 Windows: percentComplete returned by job status from WebHCat is null
 

 Key: HIVE-6035
 URL: https://issues.apache.org/jira/browse/HIVE-6035
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 0.13.0

 Attachments: HIVE-6035.patch


 HIVE-5511 fixed the same problem on Linux, but it is still broken on Windows.





Re: Review Request 15654: Rewrite Trim and Pad UDFs based on GenericUDF

2013-12-17 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15654/#review30574
---



ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFLTrim.java
https://reviews.apache.org/r/15654/#comment58540

For these new tests please change the package to 
org.apache.hadoop.hive.ql.udf.generic and move them to the directory 
src/test/org/apache/hadoop/hive/ql/udf/generic.


- Carl Steinbach


On Dec. 17, 2013, midnight, Mohammad Islam wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15654/
 ---
 
 (Updated Dec. 17, 2013, midnight)
 
 
 Review request for hive, Ashutosh Chauhan, Carl Steinbach, and Jitendra 
 Pandey.
 
 
 Bugs: HIVE-5829
 https://issues.apache.org/jira/browse/HIVE-5829
 
 
 Repository: hive-git
 
 
 Description
 ---
 
  Rewrite the UDFs *pad and *trim using GenericUDF.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java a895d65 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 bca1f26 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLTrim.java dc00cf9 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLpad.java d1da19a 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRTrim.java 2bcc5fa 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRpad.java 9652ce2 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTrim.java 490886d 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBasePad.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseTrim.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLTrim.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLpad.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRTrim.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRpad.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTrim.java 
 PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
  eff251f 
   ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFLTrim.java 
 PRE-CREATION 
   ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFLpad.java 
 PRE-CREATION 
   ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFRTrim.java 
 PRE-CREATION 
   ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFRpad.java 
 PRE-CREATION 
   ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFTrim.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/15654/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Mohammad Islam
 




[jira] [Updated] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF

2013-12-17 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-5829:
-

Status: Open  (was: Patch Available)

[~kamrul] I noted one small issue on RB related to the package names of the new 
tests. Other than that I think the patch is ready to commit.

 Rewrite Trim and Pad UDFs based on GenericUDF
 -

 Key: HIVE-5829
 URL: https://issues.apache.org/jira/browse/HIVE-5829
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-5829.1.patch, HIVE-5829.2.patch, tmp.HIVE-5829.patch


 This JIRA includes the following UDFs:
 1. trim()
 2. ltrim()
 3. rtrim()
 4. lpad()
 5. rpad()





[jira] [Updated] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF

2013-12-17 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-5829:
-

Component/s: UDF

 Rewrite Trim and Pad UDFs based on GenericUDF
 -

 Key: HIVE-5829
 URL: https://issues.apache.org/jira/browse/HIVE-5829
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-5829.1.patch, HIVE-5829.2.patch, tmp.HIVE-5829.patch


 This JIRA includes the following UDFs:
 1. trim()
 2. ltrim()
 3. rtrim()
 4. lpad()
 5. rpad()





[jira] [Commented] (HIVE-6029) Add default authorization on database/table creation

2013-12-17 Thread Chris Drome (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850977#comment-13850977
 ] 

Chris Drome commented on HIVE-6029:
---

[~brocknoland] the initial patch was only intended for informational purposes 
as requested by [~thejas]. There is much more clean-up to be done, so please do 
not consider this yet. I will try to look at your rebased patch in the next 
couple of days. Thanks for reviewing.

 Add default authorization on database/table creation
 

 Key: HIVE-6029
 URL: https://issues.apache.org/jira/browse/HIVE-6029
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Metastore
Affects Versions: 0.10.0
Reporter: Chris Drome
Assignee: Chris Drome
Priority: Minor
 Attachments: HIVE-6029-1.patch.txt, HIVE-6029.2.patch


 Default authorization privileges are not set when a database/table is 
 created. This allows a user to create a database/table and not be able to 
 access it through Sentry.





[jira] [Created] (HIVE-6049) Hive uses deprecated hadoop configuration in Hadoop 2.0

2013-12-17 Thread shanyu zhao (JIRA)
shanyu zhao created HIVE-6049:
-

 Summary: Hive uses deprecated hadoop configuration in Hadoop 2.0
 Key: HIVE-6049
 URL: https://issues.apache.org/jira/browse/HIVE-6049
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.12.0
Reporter: shanyu zhao


Running the hive CLI on hadoop 2.0, you'll see deprecated configuration warnings 
like this:

13/12/14 01:00:51 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
13/12/14 01:00:52 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
13/12/14 01:00:52 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
13/12/14 01:00:52 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
13/12/14 01:00:52 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
13/12/14 01:00:52 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
13/12/14 01:00:52 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
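
For reference, a hedged sketch of switching to the replacement keys (key names 
are taken verbatim from the warnings above; the values are illustrative):

{code:java}
// Hedged sketch: prefer the Hadoop 2 replacement keys named in the warnings above.
org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
conf.setBoolean("mapreduce.input.fileinputformat.input.dir.recursive", true);      // was mapred.input.dir.recursive
conf.setLong("mapreduce.input.fileinputformat.split.maxsize", 256L * 1024 * 1024); // was mapred.max.split.size
conf.setInt("mapreduce.job.reduces", 4);                                           // was mapred.reduce.tasks
{code}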






Hive-trunk-hadoop2 - Build # 608 - Still Failing

2013-12-17 Thread Apache Jenkins Server
Changes for Build #572
[brock] HIVE-4741 - Add Hive config API to modify the restrict list (Prasad 
Mujumdar, Navis via Brock Noland)


Changes for Build #573
[navis] HIVE-5827 : Incorrect location of logs for failed tests (Vikram Dixit K 
and Szehon Ho via Navis)

[thejas] HIVE-4485 : beeline prints null as empty strings (Thejas Nair reviewed 
by Ashutosh Chauhan)

[brock] HIVE-5704 - A couple of generic UDFs are not in the right 
folder/package (Xuefu Zhang via Brock Noland)

[brock] HIVE-5706 - Move a few numeric UDFs to generic implementations (Xuefu 
Zhang via Brock Noland)

[hashutosh] HIVE-5817 : column name to index mapping in VectorizationContext is 
broken (Remus Rusanu, Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5876 : Split elimination in ORC breaks for partitioned tables 
(Prasanth J via Ashutosh Chauhan)

[hashutosh] HIVE-5886 : [Refactor] Remove unused class JobCloseFeedback 
(Ashutosh Chauhan via Thejas Nair)

[brock] HIVE-5894 - Fix minor PTest2 issues (Brock Noland)


Changes for Build #574
[brock] HIVE-5755 - Fix hadoop2 execution environment Milestone 1 (Vikram Dixit 
K via Brock Noland)


Changes for Build #575
[xuefu] HIVE-5893: hive-schema-0.13.0.mysql.sql contains reference to 
nonexistent column (Carl via Xuefu)

[xuefu] HIVE-5684: Serde support for char (Jason via Xuefu)


Changes for Build #576

Changes for Build #577

Changes for Build #578

Changes for Build #579
[brock] HIVE-5441 - Async query execution doesn't return resultset status 
(Prasad Mujumdar via Thejas M Nair)

[brock] HIVE-5880 - Rename HCatalog HBase Storage Handler artifact id (Brock 
Noland reviewed by Prasad Mujumdar)


Changes for Build #580
[ehans] HIVE-5581: Implement vectorized year/month/day... etc. for string 
arguments (Teddy Choi via Eric Hanson)


Changes for Build #581
[rhbutani] HIVE-5898 Make fetching of column statistics configurable (Prasanth 
Jayachandran via Harish Butani)


Changes for Build #582
[brock] HIVE-5880 - (Rename HCatalog HBase Storage Handler artifact id) breaks 
packaging (Xuefu Zhang via Brock Noland)


Changes for Build #583
[xuefu] HIVE-5866: Hive divide operator generates wrong results in certain 
cases (reviewed by Prasad)

[ehans] HIVE-5877: Implement vectorized support for IN as boolean-valued 
expression (Eric Hanson)


Changes for Build #584
[thejas] HIVE-5550 : Import fails for tables created with default text, 
sequence and orc file formats using HCatalog API (Sushanth Sowmyan via Thejas 
Nair)

[ehans] HIVE-5895: vectorization handles division by zero differently from 
normal execution (Sergey Shelukhin via Eric Hanson)

[hashutosh] HIVE-5938 : Remove apache.mina dependency for test (Navis via 
Ashutosh Chauhan)

[xuefu] HIVE-5912: Show partition command doesn't support db.table (Yu Zhao via 
Xuefu)

[brock] HIVE-5906 - TestGenericUDFPower should use delta to compare doubles 
(Szehon Ho via Brock Noland)

[brock] HIVE-5855 - Add deprecated methods back to ColumnProjectionUtils (Brock 
Noland reviewed by Navis)

[brock] HIVE-5915 - Shade Kryo dependency (Brock Noland reviewed by Ashutosh 
Chauhan)


Changes for Build #585
[hashutosh] HIVE-5916 : No need to aggregate statistics collected via counter 
mechanism (Ashutosh Chauhan via Navis)

[xuefu] HIVE-5947: Fix test failure in decimal_udf.q (reviewed by Brock)


Changes for Build #586
[hashutosh] HIVE-5935 : hive.query.string is not provided to FetchTask (Navis 
via Ashutosh Chauhan)

[navis] HIVE-3455 : ANSI CORR(X,Y) is incorrect (Maxim Bolotin via Navis)

[hashutosh] HIVE-5921 : Better heuristics for worst case statistics estimates 
for join, limit and filter operator (Prasanth J via Harish Butani)

[rhbutani] HIVE-5899 NPE during explain extended with char/varchar columns 
(Jason Dere via Harish Butani)


Changes for Build #587
[xuefu] HIVE-3181: getDatabaseMajor/Minor version does not return values 
(Szehon via Xuefu, reviewed by Navis)

[brock] HIVE-5641 - BeeLineOpts ignores Throwable (Brock Noland reviewed by 
Prasad and Thejas)

[hashutosh] HIVE-5909 : locate and instr throw 
java.nio.BufferUnderflowException when empty string as substring (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-5686 : partition column type validation doesn't quite work for 
dates (Sergey Shelukhin via Ashutosh Chauhan)

[hashutosh] HIVE-5887 : metastore direct sql doesn't work with oracle (Sergey 
Shelukhin via Ashutosh Chauhan)


Changes for Build #588

Changes for Build #589

Changes for Build #590
[brock] HIVE-5981 - Add hive-unit back to itests pom (Brock Noland reviewed by 
Prasad)


Changes for Build #591
[xuefu] HIVE-5872: Make UDAFs such as GenericUDAFSum report accurate 
precision/scale for decimal types (reviewed by Sergey Shelukhin)

[hashutosh] HIVE-5978 : Rollups not supported in vector mode. (Jitendra Nath 
Pandey via Ashutosh Chauhan)

[hashutosh] HIVE-5830 : SubQuery: Not In subqueries should check if subquery 
contains nulls in matching column (Harish Butani via Ashutosh Chauhan)

[hashutosh] HIVE-5598 : 

[jira] [Commented] (HIVE-6028) Partition predicate literals are not interpreted correctly.

2013-12-17 Thread Pala M Muthaia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850984#comment-13850984
 ] 

Pala M Muthaia commented on HIVE-6028:
--

Sergey, the same thing above works in hive 12, for a regular string column (as 
opposed to partition column). 

In any case, given the cost of fix vs severity, we will avoid depending on type 
coercion and use proper literals.

 Partition predicate literals are not interpreted correctly.
 ---

 Key: HIVE-6028
 URL: https://issues.apache.org/jira/browse/HIVE-6028
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Pala M Muthaia
 Attachments: Hive-6028-explain-plan.txt


 When parsing/analyzing a query, hive treats the partition predicate value as 
 int instead of string. This breaks down and leads to incorrect results when 
 the partition predicate value starts with the digit 0, e.g. hour=00, hour=05, 
 etc.
 The following repro illustrates the bug:
 -- create test table and partition, populate with some data
 create table test_partition_pred(col1 int) partitioned by (hour STRING);
 insert into table test_partition_pred partition (hour=00) select 21 FROM  
 some_table limit 1;
 -- this query returns incorrect results, i.e. just empty set.
 select * from test_partition_pred where hour=00;
 OK
 -- this query returns correct result. Note predicate value is string literal
 select * from test_partition_pred where hour='00';
 OK
 21  00
 The explain plan illustrates how the query was interpreted. In particular, the 
 partition predicate is pushed down as a regular filter clause, with hour=0 as 
 the predicate. See the attached explain plan file.
 Note:
 1. The type of the partition column is defined as string, not int.
 2. This is a regression in Hive 0.12. This used to work in Hive 0.11.
 3. Not an issue when the partition value starts with an integer other than 0, 
 e.g. hour=10, hour=11, etc.
 4. As seen above, the workaround is to use a string literal, e.g. hour='00'.
 This would not be so bad if, in the failing case, hive complained that 
 partition hour=0 is not found, or that the literal type doesn't match the 
 column type. Instead hive silently pushes it down as a filter clause, and the 
 query succeeds with an empty set as the result.
 We found this out in our production tables partitioned by hour, only a few 
 days after it started occurring, when there were empty data sets for 
 partitions hour=00 to hour=09.
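
 A minimal illustration of the literal rewrite described above (plain Java, 
 not Hive planner code): the unquoted 00 parses as the integer 0, and once 
 rendered back into the plan it can no longer match the stored partition name.
 {code}
 // Illustrative only; hypothetical class, not part of Hive.
 public class LeadingZeroSketch {
   public static void main(String[] args) {
     int literal = 0;                        // hour=00 parses to this
     String partitionName = "00";            // how the partition is stored
     String rendered = String.valueOf(literal);
     System.out.println(rendered);                        // 0
     System.out.println(rendered.equals(partitionName));  // false
     System.out.println("00".equals(partitionName));      // true (quoted form)
   }
 }
 {code}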



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)

2013-12-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850993#comment-13850993
 ] 

Hive QA commented on HIVE-3454:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12588389/HIVE-3454.patch

{color:green}SUCCESS:{color} +1 4789 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/671/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/671/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12588389

 Problem with CAST(BIGINT as TIMESTAMP)
 --

 Key: HIVE-3454
 URL: https://issues.apache.org/jira/browse/HIVE-3454
 Project: Hive
  Issue Type: Bug
  Components: Types, UDF
Affects Versions: 0.8.0, 0.8.1, 0.9.0
Reporter: Ryan Harris
  Labels: newbie, newdev, patch
 Attachments: HIVE-3454.1.patch.txt, HIVE-3454.patch


 Ran into an issue while working with timestamp conversion.
 CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current 
 time from the BIGINT returned by unix_timestamp().
 Instead, however, a 1970-01-16 timestamp is returned.
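
 A minimal sketch of the arithmetic behind the 1970-01-16 symptom, assuming 
 the cast reads the BIGINT as milliseconds while unix_timestamp() returns 
 seconds (plain Java, not Hive's cast code):
 {code}
 // A seconds-since-epoch value read as milliseconds lands about two weeks
 // after the epoch, matching the 1970-01-16 result. Class name is
 // illustrative.
 import java.util.Date;

 public class EpochUnitsSketch {
   public static void main(String[] args) {
     long seconds = 1347000000L;             // a typical unix_timestamp() value
     System.out.println(new Date(seconds * 1000L)); // intended: Sep 2012
     System.out.println(new Date(seconds));         // observed: mid-Jan 1970
   }
 }
 {code}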



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HIVE-6050) JDBC backward compatibility is broken

2013-12-17 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-6050:
---

 Summary: JDBC backward compatibility is broken
 Key: HIVE-6050
 URL: https://issues.apache.org/jira/browse/HIVE-6050
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Szehon Ho


Connecting from the JDBC driver of Hive 0.12 (TProtocolVersion=v4) to a 
HiveServer2 of Hive 0.10 (TProtocolVersion=v1) will return the following 
exception:

{noformat}
java.sql.SQLException: Could not establish connection to 
jdbc:hive2://hive-c5-mysql-1.ent.cloudera.com:1/default: Required field 
'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
at 
org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
at org.apache.hive.jdbc.HiveConnection.&lt;init&gt;(HiveConnection.java:158)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at 
com.cloudera.itest.hiveserver.UnmanagedHiveServer.createConnection(UnmanagedHiveServer.java:73)
at 
com.cloudera.itest.AbstractTestWithStaticConfiguration.createConnection(AbstractTestWithStaticConfiguration.java:68)
at com.cloudera.itest.FirstTest.sanityConnectionTest(FirstTest.java:19)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:69)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:48)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.run(ParentRunner.java:292)
at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:77)
at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:195)
at 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: org.apache.thrift.TApplicationException: Required field 
'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
at 
org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
at 
org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
at 
org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
at 
org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
... 37 more
{noformat}

On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, 
which doesn't seem to be backward-compatible.  Look at the generated file 
'TOpenSessionReq.java', the method TOpenSessionReqStandardScheme.read().  The 
method will call 'TProtocolVersion.findValue()' on the thrift protocol's 
bytes, which returns null if the client is sending an enum value unknown to 
the server.  Then struct.validate() at the end of the method will fail because 
the protocol version is null.  So it doesn't look like the current 
backward-compatibility scheme will work.
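
A minimal standalone sketch of that failure mode (this mimics, but is not, the Thrift-generated code; class names and wire values are illustrative):

{code}
// Why an enum-typed protocol version is not forward-compatible: findValue()
// returns null for a wire value the server was not compiled with, and the
// required-field check then throws, as in the stack trace above.
public class EnumCompatSketch {
  enum ProtocolVersion {
    V1(0), V2(1);                          // the server only knows these
    final int value;
    ProtocolVersion(int v) { value = v; }
    static ProtocolVersion findValue(int v) { // mirrors TProtocolVersion.findValue
      for (ProtocolVersion p : values()) {
        if (p.value == v) return p;
      }
      return null;                         // a newer client's value lands here
    }
  }

  public static void main(String[] args) {
    ProtocolVersion clientProtocol = ProtocolVersion.findValue(3); // v4 on the wire
    if (clientProtocol == null) {          // stand-in for struct.validate()
      throw new IllegalStateException(
          "Required field 'client_protocol' is unset!");
    }
  }
}
{code}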



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6050) JDBC backward compatibility is broken

2013-12-17 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6050:


Description: 
Connecting from the JDBC driver of Hive 0.12 (TProtocolVersion=v4) to a 
HiveServer2 of Hive 0.10 (TProtocolVersion=v1) will return the following 
exception:

{noformat}
java.sql.SQLException: Could not establish connection to 
jdbc:hive2://localhost:1/default: Required field 'client_protocol' is 
unset! Struct:TOpenSessionReq(client_protocol:null)
at 
org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
at org.apache.hive.jdbc.HiveConnection.&lt;init&gt;(HiveConnection.java:158)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at 
com.cloudera.itest.hiveserver.UnmanagedHiveServer.createConnection(UnmanagedHiveServer.java:73)
at 
com.cloudera.itest.AbstractTestWithStaticConfiguration.createConnection(AbstractTestWithStaticConfiguration.java:68)
at com.cloudera.itest.FirstTest.sanityConnectionTest(FirstTest.java:19)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:69)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:48)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.run(ParentRunner.java:292)
at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:77)
at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:195)
at 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: org.apache.thrift.TApplicationException: Required field 
'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
at 
org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
at 
org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
at 
org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
at 
org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
... 37 more
{noformat}

On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, 
which doesn't seem to be backward-compatible.  Look at the code path in the 
generated file 'TOpenSessionReq.java', method 
TOpenSessionReqStandardScheme.read():

1. The method will call 'TProtocolVersion.findValue()' on the thrift protocol's 
byte stream, which returns null if the client is sending an enum value unknown 
to the server (v4 is unknown to the server).
2. The method will then call struct.validate(), which will throw the above 
exception because of the null version.

So it doesn't look like the current backward-compatibility scheme will work.

  was:
Connecting from the JDBC driver of Hive 0.12 (TProtocolVersion=v4) to a 
HiveServer2 of Hive 0.10 (TProtocolVersion=v1) will return the following 
exception:

{noformat}
java.sql.SQLException: 

[jira] [Assigned] (HIVE-6028) Partition predicate literals are not interpreted correctly.

2013-12-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-6028:
--

Assignee: Sergey Shelukhin

 Partition predicate literals are not interpreted correctly.
 ---

 Key: HIVE-6028
 URL: https://issues.apache.org/jira/browse/HIVE-6028
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Pala M Muthaia
Assignee: Sergey Shelukhin
 Attachments: Hive-6028-explain-plan.txt


 When parsing/analyzing a query, hive treats the partition predicate value as 
 int instead of string. This breaks down and leads to incorrect results when 
 the partition predicate value starts with the digit 0, e.g. hour=00, hour=05, 
 etc.
 The following repro illustrates the bug:
 -- create test table and partition, populate with some data
 create table test_partition_pred(col1 int) partitioned by (hour STRING);
 insert into table test_partition_pred partition (hour=00) select 21 FROM  
 some_table limit 1;
 -- this query returns incorrect results, i.e. just empty set.
 select * from test_partition_pred where hour=00;
 OK
 -- this query returns correct result. Note predicate value is string literal
 select * from test_partition_pred where hour='00';
 OK
 21  00
 The explain plan illustrates how the query was interpreted. In particular, the 
 partition predicate is pushed down as a regular filter clause, with hour=0 as 
 the predicate. See the attached explain plan file.
 Note:
 1. The type of the partition column is defined as string, not int.
 2. This is a regression in Hive 0.12. This used to work in Hive 0.11.
 3. Not an issue when the partition value starts with an integer other than 0, 
 e.g. hour=10, hour=11, etc.
 4. As seen above, the workaround is to use a string literal, e.g. hour='00'.
 This would not be so bad if, in the failing case, hive complained that 
 partition hour=0 is not found, or that the literal type doesn't match the 
 column type. Instead hive silently pushes it down as a filter clause, and the 
 query succeeds with an empty set as the result.
 We found this out in our production tables partitioned by hour, only a few 
 days after it started occurring, when there were empty data sets for 
 partitions hour=00 to hour=09.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-6028) Partition predicate literals are not interpreted correctly.

2013-12-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851006#comment-13851006
 ] 

Sergey Shelukhin commented on HIVE-6028:


Yeah, I agree that this is breakage in 12 compared to 11. Sorry for that. Good 
to know that the workaround works.

I will resolve as dup of 4914, as the fix is contained therein.

 Partition predicate literals are not interpreted correctly.
 ---

 Key: HIVE-6028
 URL: https://issues.apache.org/jira/browse/HIVE-6028
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Pala M Muthaia
 Attachments: Hive-6028-explain-plan.txt


 When parsing/analyzing a query, hive treats the partition predicate value as 
 int instead of string. This breaks down and leads to incorrect results when 
 the partition predicate value starts with the digit 0, e.g. hour=00, hour=05, 
 etc.
 The following repro illustrates the bug:
 -- create test table and partition, populate with some data
 create table test_partition_pred(col1 int) partitioned by (hour STRING);
 insert into table test_partition_pred partition (hour=00) select 21 FROM  
 some_table limit 1;
 -- this query returns incorrect results, i.e. just empty set.
 select * from test_partition_pred where hour=00;
 OK
 -- this query returns correct result. Note predicate value is string literal
 select * from test_partition_pred where hour='00';
 OK
 21  00
 The explain plan illustrates how the query was interpreted. In particular, the 
 partition predicate is pushed down as a regular filter clause, with hour=0 as 
 the predicate. See the attached explain plan file.
 Note:
 1. The type of the partition column is defined as string, not int.
 2. This is a regression in Hive 0.12. This used to work in Hive 0.11.
 3. Not an issue when the partition value starts with an integer other than 0, 
 e.g. hour=10, hour=11, etc.
 4. As seen above, the workaround is to use a string literal, e.g. hour='00'.
 This would not be so bad if, in the failing case, hive complained that 
 partition hour=0 is not found, or that the literal type doesn't match the 
 column type. Instead hive silently pushes it down as a filter clause, and the 
 query succeeds with an empty set as the result.
 We found this out in our production tables partitioned by hour, only a few 
 days after it started occurring, when there were empty data sets for 
 partitions hour=00 to hour=09.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5891) Alias conflict when merging multiple mapjoin tasks into their common child mapred task

2013-12-17 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851007#comment-13851007
 ] 

Yin Huai commented on HIVE-5891:


[~sunrui] Sorry for getting back late.

I just took a look at QB. It seems to use aliasToSubq to store the mapping from 
aliases to sub-query expressions (QBExpr), and a QBExpr in turn stores the QB 
that represents its subquery. In this recursive way, the QBs for all levels of 
the query are stored. So, parseCtx.getQB() only gets the main query block, and 
its id is null. I am not sure if we can get the right QB (the QB for a 
subquery) from GenMapRedUtils.splitTasks... Can you take a quick look to see if 
it is easy to get the correct QB? If so, we can use the id of a QB to replace 
INTNAME. If not, let's use joinTree.getId for those JoinOperators. It seems we 
do not need to take special care of DemuxOperator. Can you create a review 
request for your patch? I can leave comments on the review board.

Also, since QBJoinTree.getJoinStreamDesc is not used, let's delete it.
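
For reference, a rough sketch of the recursive structure described above (field and class names follow the comment; everything else is simplified and not the real org.apache.hadoop.hive.ql.parse code):

{code}
// Simplified sketch: a QB maps sub-query aliases to QBExprs, and each QBExpr
// holds the QB of its sub-query, so the main QB transitively reaches every
// nested QB while parseCtx.getQB() returns only the outermost one (id null).
import java.util.HashMap;
import java.util.Map;

class QBExpr {
  QB subQueryBlock;                        // the sub-query's QB
}

class QB {
  String id;                               // null for the main query block
  Map<String, QBExpr> aliasToSubq = new HashMap<>();
}
{code}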

 Alias conflict when merging multiple mapjoin tasks into their common child 
 mapred task
 --

 Key: HIVE-5891
 URL: https://issues.apache.org/jira/browse/HIVE-5891
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Sun Rui
Assignee: Sun Rui
 Attachments: HIVE-5891.1.patch


 Use the following test case with HIVE 0.12:
 {quote}
 create table src(key int, value string);
 load data local inpath 'src/data/files/kv1.txt' overwrite into table src;
 select * from (
   select c.key from
 (select a.key from src a join src b on a.key=b.key group by a.key) tmp
 join src c on tmp.key=c.key
   union all
   select c.key from
 (select a.key from src a join src b on a.key=b.key group by a.key) tmp
 join src c on tmp.key=c.key
 ) x;
 {quote}
 We will get a NullPointerException from Union Operator:
 {quote}
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing row {_col0:0}
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row {_col0:0}
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:544)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
   ... 4 more
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.UnionOperator.processOp(UnionOperator.java:120)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:652)
   at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:655)
   at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:220)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
   ... 5 more
 {quote}
   
 The root cause is in 
 CommonJoinTaskDispatcher.mergeMapJoinTaskIntoItsChildMapRedTask().
   +--------------+  +--------------+
   | MapJoin task |  | MapJoin task |
   +--------------+  +--------------+
           \                /
            \              /
          +--------------+
          |  Union task  |
          +--------------+
  
 CommonJoinTaskDispatcher merges the two MapJoin tasks into their common 
 child: Union task. The two MapJoin tasks have the same alias name for their 
 big 

[jira] [Resolved] (HIVE-6028) Partition predicate literals are not interpreted correctly.

2013-12-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-6028.


   Resolution: Duplicate
Fix Version/s: 0.13.0

 Partition predicate literals are not interpreted correctly.
 ---

 Key: HIVE-6028
 URL: https://issues.apache.org/jira/browse/HIVE-6028
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Pala M Muthaia
Assignee: Sergey Shelukhin
 Fix For: 0.13.0

 Attachments: Hive-6028-explain-plan.txt


 When parsing/analyzing a query, hive treats the partition predicate value as 
 int instead of string. This breaks down and leads to incorrect results when 
 the partition predicate value starts with the digit 0, e.g. hour=00, hour=05, 
 etc.
 The following repro illustrates the bug:
 -- create test table and partition, populate with some data
 create table test_partition_pred(col1 int) partitioned by (hour STRING);
 insert into table test_partition_pred partition (hour=00) select 21 FROM  
 some_table limit 1;
 -- this query returns incorrect results, i.e. just empty set.
 select * from test_partition_pred where hour=00;
 OK
 -- this query returns correct result. Note predicate value is string literal
 select * from test_partition_pred where hour='00';
 OK
 21  00
 The explain plan illustrates how the query was interpreted. In particular, the 
 partition predicate is pushed down as a regular filter clause, with hour=0 as 
 the predicate. See the attached explain plan file.
 Note:
 1. The type of the partition column is defined as string, not int.
 2. This is a regression in Hive 0.12. This used to work in Hive 0.11.
 3. Not an issue when the partition value starts with an integer other than 0, 
 e.g. hour=10, hour=11, etc.
 4. As seen above, the workaround is to use a string literal, e.g. hour='00'.
 This would not be so bad if, in the failing case, hive complained that 
 partition hour=0 is not found, or that the literal type doesn't match the 
 column type. Instead hive silently pushes it down as a filter clause, and the 
 query succeeds with an empty set as the result.
 We found this out in our production tables partitioned by hour, only a few 
 days after it started occurring, when there were empty data sets for 
 partitions hour=00 to hour=09.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken

2013-12-17 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851013#comment-13851013
 ] 

Szehon Ho commented on HIVE-6050:
-

[~ashutoshc] [~cwsteinbach] Do you guys have any thoughts/experiences on this 
issue?

It seems like we would need to change the client protocol version to use 
another data type to get this to work.  My thought was that this should be 
ok, as backward-compatibility seems to be broken today anyway based on this 
analysis.

 JDBC backward compatibility is broken
 -

 Key: HIVE-6050
 URL: https://issues.apache.org/jira/browse/HIVE-6050
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Szehon Ho

 Connecting from the JDBC driver of Hive 0.12 (TProtocolVersion=v4) to a 
 HiveServer2 of Hive 0.10 (TProtocolVersion=v1) will return the following 
 exception:
 {noformat}
 java.sql.SQLException: Could not establish connection to 
 jdbc:hive2://localhost:1/default: Required field 'client_protocol' is 
 unset! Struct:TOpenSessionReq(client_protocol:null)
   at 
 org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
   at org.apache.hive.jdbc.HiveConnection.&lt;init&gt;(HiveConnection.java:158)
   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
   at java.sql.DriverManager.getConnection(DriverManager.java:571)
   at java.sql.DriverManager.getConnection(DriverManager.java:187)
   at 
 com.cloudera.itest.hiveserver.UnmanagedHiveServer.createConnection(UnmanagedHiveServer.java:73)
   at 
 com.cloudera.itest.AbstractTestWithStaticConfiguration.createConnection(AbstractTestWithStaticConfiguration.java:68)
   at com.cloudera.itest.FirstTest.sanityConnectionTest(FirstTest.java:19)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
   at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:69)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:48)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
   at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:292)
   at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
   at 
 com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:77)
   at 
 com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:195)
   at 
 com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:63)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
 Caused by: org.apache.thrift.TApplicationException: Required field 
 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
   at 
 org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
   at 
 org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
   ... 37 more
 {noformat}
 On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, 
 which doesn't seem to be backward-compatible.  Look at the code path in the 
 generated file 

[jira] [Commented] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF

2013-12-17 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851016#comment-13851016
 ] 

Eric Hanson commented on HIVE-5829:
---

Looks good to me from the point of view of vectorization -- trim/ltrim/rtrim 
still vectorize.

 Rewrite Trim and Pad UDFs based on GenericUDF
 -

 Key: HIVE-5829
 URL: https://issues.apache.org/jira/browse/HIVE-5829
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-5829.1.patch, HIVE-5829.2.patch, tmp.HIVE-5829.patch


 This JIRA includes following UDFs:
 1. trim()
 2. ltrim()
 3. rtrim()
 4. lpad()
 5. rpad()



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


Re: Review Request 16229: HIVE-6010 create a test that would ensure vectorization produces same results as non-vectorized execution

2013-12-17 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16229/
---

(Updated Dec. 17, 2013, 10:40 p.m.)


Review request for hive and Jitendra Pandey.


Bugs: HIVE-6010
https://issues.apache.org/jira/browse/HIVE-6010


Repository: hive-git


Description
---

See jira.


Diffs (updated)
-

  ant/src/org/apache/hadoop/hive/ant/QTestGenTask.java 79840c9 
  itests/qtest/pom.xml 971c5d3 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 275e3d7 
  ql/src/test/queries/clientcompare/vectorized_math_funcs.q PRE-CREATION 
  ql/src/test/queries/clientcompare/vectorized_math_funcs_00.qv PRE-CREATION 
  ql/src/test/queries/clientcompare/vectorized_math_funcs_01.qv PRE-CREATION 
  ql/src/test/templates/TestCompareCliDriver.vm PRE-CREATION 

Diff: https://reviews.apache.org/r/16229/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Updated] (HIVE-6010) create a test that would ensure vectorization produces same results as non-vectorized execution

2013-12-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6010:
---

Attachment: HIVE-6010.03.patch

Now that logarithms are fixed I can add them to the test. Trivial update, 
should not affect +1 as long as the test passes :)

 create a test that would ensure vectorization produces same results as 
 non-vectorized execution
 ---

 Key: HIVE-6010
 URL: https://issues.apache.org/jira/browse/HIVE-6010
 Project: Hive
  Issue Type: Test
  Components: Tests, Vectorization
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6010.01.patch, HIVE-6010.02.patch, 
 HIVE-6010.03.patch, HIVE-6010.patch


 So as to ensure that vectorization is not forgotten when changes are made to 
 things. Obviously it would not be viable to have a bulletproof test, but at 
 least a subset of operations can be verified.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6044) webhcat should be able to return detailed serde information when show table using format=extended

2013-12-17 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-6044:
-

Attachment: HIVE-6044.1.patch

 webhcat should be able to return detailed serde information when show table 
 using format=extended
 ---

 Key: HIVE-6044
 URL: https://issues.apache.org/jira/browse/HIVE-6044
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-6044.1.patch


 Now in webhcat, when using GET ddl/database/:db/table/:table and 
 format=extended, the return value is based on the query show table extended 
 like. However, this query doesn't contain serde info like line.delim and 
 field.delim. In this case, the user won't have enough information to 
 reconstruct the exact same table based on the information from the json file. 
 The descExtendedTable function in HcatDelegator should also return the extra 
 fields from the query desc extended tablename, which contains the fields sd, 
 retention, parameters, parametersSize and tableType.
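
 For reference, the call under discussion looks roughly like the sketch below 
 (host, port, table name, and user.name are placeholders; the default WebHCat 
 port is assumed):
 {code}
 // Hedged sketch of fetching the extended table description over WebHCat.
 // Everything concrete here (localhost:50111, table t1, user.name=hive) is a
 // placeholder, not something mandated by this JIRA.
 import java.io.BufferedReader;
 import java.io.InputStreamReader;
 import java.net.HttpURLConnection;
 import java.net.URL;

 public class WebHCatDescribeSketch {
   public static void main(String[] args) throws Exception {
     URL url = new URL("http://localhost:50111/templeton/v1/ddl/database/"
         + "default/table/t1?format=extended&user.name=hive");
     HttpURLConnection conn = (HttpURLConnection) url.openConnection();
     try (BufferedReader in = new BufferedReader(
         new InputStreamReader(conn.getInputStream()))) {
       String line;
       while ((line = in.readLine()) != null) {
         System.out.println(line);         // JSON describing the table
       }
     }
   }
 }
 {code}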



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6044) webhcat should be able to return detailed serde information when show table using format=extended

2013-12-17 Thread Shuaishuai Nie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuaishuai Nie updated HIVE-6044:
-

Attachment: (was: HIVE-6044.1.patch)

 webhcat should be able to return detailed serde information when show table 
 using format=extended
 ---

 Key: HIVE-6044
 URL: https://issues.apache.org/jira/browse/HIVE-6044
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-6044.1.patch


 Now in webhcat, when using GET ddl/database/:db/table/:table and 
 format=extended, the return value is based on the query show table extended 
 like. However, this query doesn't contain serde info like line.delim and 
 field.delim. In this case, the user won't have enough information to 
 reconstruct the exact same table based on the information from the json file. 
 The descExtendedTable function in HcatDelegator should also return the extra 
 fields from the query desc extended tablename, which contains the fields sd, 
 retention, parameters, parametersSize and tableType.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-6045) Beeline hivevars is broken for more than one hivevar

2013-12-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851066#comment-13851066
 ] 

Hive QA commented on HIVE-6045:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619170/HIVE-6045.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4789 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestJdbcDriver2.testNewConnectionConfiguration
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/672/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/672/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619170

 Beeline hivevars is broken for more than one hivevar
 

 Key: HIVE-6045
 URL: https://issues.apache.org/jira/browse/HIVE-6045
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6045.patch


 HIVE-4568 introduced the --hivevar flag.  But if you specify more than one 
 hivevar, for example 
 {code}
 beeline --hivevar file1=/user/szehon/file1 --hivevar file2=/user/szehon/file2
 {code}
 then the variables during runtime get mangled to evaluate to:
 {code}
 file1=/user/szehon/file1file2=/user/szehon/file2
 {code}
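
 (Illustration only, not Beeline's actual code: concatenating the key=value 
 pairs without a separator reproduces the mangled string exactly.)
 {code}
 // Hypothetical sketch of the mangling; class name is illustrative.
 import java.util.Arrays;
 import java.util.List;

 public class HivevarJoinSketch {
   public static void main(String[] args) {
     List<String> hivevars = Arrays.asList(
         "file1=/user/szehon/file1", "file2=/user/szehon/file2");
     StringBuilder sb = new StringBuilder();
     for (String v : hivevars) {
       sb.append(v);                       // missing separator between pairs
     }
     System.out.println(sb); // file1=/user/szehon/file1file2=/user/szehon/file2
   }
 }
 {code}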



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-6035) Windows: percentComplete returned by job status from WebHCat is null

2013-12-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851067#comment-13851067
 ] 

Hive QA commented on HIVE-6035:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12618722/HIVE-6035.patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/673/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/673/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-673/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'jdbc/src/java/org/apache/hive/jdbc/Utils.java'
++ egrep -v '^X|^Performing status on external'
++ awk '{print $2}'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/assembly/target shims/0.20S/target shims/0.23/target shims/common/target 
shims/common-secure/target packaging/target hbase-handler/target 
testutils/target jdbc/target metastore/target itests/target 
itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target 
itests/hive-unit/target itests/custom-serde/target itests/util/target 
hcatalog/target hcatalog/storage-handlers/hbase/target 
hcatalog/server-extensions/target hcatalog/core/target 
hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target 
hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen 
service/target contrib/target serde/target beeline/target odbc/target 
cli/target ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1551750.

At revision 1551750.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12618722

 Windows: percentComplete returned by job status from WebHCat is null
 

 Key: HIVE-6035
 URL: https://issues.apache.org/jira/browse/HIVE-6035
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 0.13.0

 Attachments: HIVE-6035.patch


 HIVE-5511 fixed the same problem on Linux, but it still broke on Windows.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-6010) create a test that would ensure vectorization produces same results as non-vectorized execution

2013-12-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851073#comment-13851073
 ] 

Hive QA commented on HIVE-6010:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619183/HIVE-6010.03.patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/674/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/674/console

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/itests 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-it ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-it ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/itests/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/itests/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/itests/target/tmp/conf
 [copy] Copying 4 files to 
/data/hive-ptest/working/apache-svn-trunk-source/itests/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-it ---
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/itests/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/hive-it/0.13.0-SNAPSHOT/hive-it-0.13.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Integration - Custom Serde 0.13.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-it-custom-serde 
---
[INFO] Deleting 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde (includes 
= [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ 
hive-it-custom-serde ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/src/main/resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ 
hive-it-custom-serde ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
hive-it-custom-serde ---
[INFO] Compiling 8 source files to 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/target/classes
[INFO] 
[INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ 
hive-it-custom-serde ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/src/test/resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-it-custom-serde 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/target/tmp/conf
 [copy] Copying 4 files to 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-it-custom-serde ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ 
hive-it-custom-serde ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-it-custom-serde ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/target/hive-it-custom-serde-0.13.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ 
hive-it-custom-serde ---
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/target/hive-it-custom-serde-0.13.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-it-custom-serde/0.13.0-SNAPSHOT/hive-it-custom-serde-0.13.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/itests/custom-serde/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/hive-it-custom-serde/0.13.0-SNAPSHOT/hive-it-custom-serde-0.13.0-SNAPSHOT.pom
[INFO] 

[jira] [Commented] (HIVE-5837) SQL standard based secure authorization for hive

2013-12-17 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851079#comment-13851079
 ] 

Brock Noland commented on HIVE-5837:


bq. Should we make one of the sql standard privileges available on SERVER 
object?

Privileges on the SERVER object can make sense but I feel the more important 
aspect is to ensure privileges are scoped to a SERVER for the reason I will 
outline below.

bq. Brock, could you give more details on the SERVER use case? I've seen people 
use multiple instances of HS2 for HA/scaling, but never allocating some users 
to some instances and others to others. What's the motivation for that?

It's a very similar use case to federation. Enterprises often want to isolate 
groups of users from using the same resource. The scenario is you have group A 
and group B and they cannot or do not want to share the same HS2. By having 
server in the hierarchy you can enforce the separation amongst HS2 instances.

 SQL standard based secure authorization for hive
 

 Key: HIVE-5837
 URL: https://issues.apache.org/jira/browse/HIVE-5837
 Project: Hive
  Issue Type: New Feature
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: SQL standard authorization hive.pdf


 The current default authorization is incomplete and not secure. The 
 alternative of storage based authorization provides security but does not 
 provide fine grained authorization.
 The proposal is to support secure fine grained authorization in hive using 
 SQL standard based authorization model.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HIVE-6051) Create DecimalColumnVector and a representative VectorExpression for decimal

2013-12-17 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-6051:
-

 Summary: Create DecimalColumnVector and a representative 
VectorExpression for decimal
 Key: HIVE-6051
 URL: https://issues.apache.org/jira/browse/HIVE-6051
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Eric Hanson
Assignee: Eric Hanson


 Create a DecimalColumnVector to use as a basis for vectorized decimal 
 operations. Include a representative VectorExpression on decimal (e.g. 
 column-column addition) to demonstrate its use.
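
 A rough shape of what this could look like (illustrative only: the real 
 patch is expected to build on Decimal128, per the pointer in HIVE-5762, 
 rather than BigDecimal, and null handling is elided):
 {code}
 // Hedged sketch: a decimal column vector plus a column-column add
 // expression over one batch. Names and layout are illustrative, not the
 // actual HIVE-6051 implementation.
 import java.math.BigDecimal;

 class DecimalColumnVector {
   final BigDecimal[] vector;
   final boolean[] isNull;
   boolean noNulls = true;

   DecimalColumnVector(int batchSize) {
     vector = new BigDecimal[batchSize];
     isNull = new boolean[batchSize];
   }
 }

 class DecimalColAddDecimalColSketch {
   // Adds two input columns element-wise for the first n rows of the batch.
   void evaluate(DecimalColumnVector a, DecimalColumnVector b,
                 DecimalColumnVector out, int n) {
     for (int i = 0; i < n; i++) {
       out.vector[i] = a.vector[i].add(b.vector[i]);
     }
   }
 }
 {code}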



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6010) create a test that would ensure vectorization produces same results as non-vectorized execution

2013-12-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6010:
---

Attachment: HIVE-6010.04.patch

import was removed by some other patch, rebase

 create a test that would ensure vectorization produces same results as 
 non-vectorized execution
 ---

 Key: HIVE-6010
 URL: https://issues.apache.org/jira/browse/HIVE-6010
 Project: Hive
  Issue Type: Test
  Components: Tests, Vectorization
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6010.01.patch, HIVE-6010.02.patch, 
 HIVE-6010.03.patch, HIVE-6010.04.patch, HIVE-6010.patch


 So as to ensure that vectorization is not forgotten when changes are made to 
 things. Obviously it would not be viable to have a bulletproof test, but at 
 least a subset of operations can be verified.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-6051) Create DecimalColumnVector and a representative VectorExpression for decimal

2013-12-17 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6051:
--

Attachment: HIVE-6051.01.patch

 Create DecimalColumnVector and a representative VectorExpression for decimal
 

 Key: HIVE-6051
 URL: https://issues.apache.org/jira/browse/HIVE-6051
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: HIVE-6051.01.patch


 Create a DecimalColumnVector to use as a basis for vectorized decimal 
 operations. Include a representative VectorExpression on decimal (e.g. 
 column-column addition) to demonstrate its use.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5065) Create proper (i.e.: non .q file based) junit tests for DagUtils and TezTask

2013-12-17 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13851095#comment-13851095
 ] 

Gunther Hagleitner commented on HIVE-5065:
--

part 2 adds some .q file tests. This is necessary to round out the non-.q-file 
tests (some integration testing is necessary).

 Create proper (i.e.: non .q file based) junit tests for DagUtils and TezTask
 

 Key: HIVE-5065
 URL: https://issues.apache.org/jira/browse/HIVE-5065
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Blocker
 Fix For: tez-branch

 Attachments: HIVE-5065-part-1.1.patch, HIVE-5065-part2.1.patch






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HIVE-5065) Create proper (i.e.: non .q file based) junit tests for DagUtils and TezTask

2013-12-17 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-5065:
-

Attachment: HIVE-5065-part2.1.patch

 Create proper (i.e.: non .q file based) junit tests for DagUtils and TezTask
 

 Key: HIVE-5065
 URL: https://issues.apache.org/jira/browse/HIVE-5065
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Blocker
 Fix For: tez-branch

 Attachments: HIVE-5065-part-1.1.patch, HIVE-5065-part2.1.patch






--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5762) Implement vectorized support for the DECIMAL data type

2013-12-17 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851096#comment-13851096
 ] 

Eric Hanson commented on HIVE-5762:
---

See HIVE-6051 for column vector code based on Decimal128.

 Implement vectorized support for the DECIMAL data type
 --

 Key: HIVE-5762
 URL: https://issues.apache.org/jira/browse/HIVE-5762
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson

 Add support to allow queries referencing DECIMAL columns and expression 
 results to run efficiently in vectorized mode.  Include unit tests and 
 end-to-end tests. 
 Before starting, or at least before going very far, please write a design 
 specification (a new section for the design spec attached to HIVE-4160) for 
 how support for the different DECIMAL types should work in vectorized mode, 
 and the roadmap, and have it reviewed. 
 It may be feasible to re-use LongColumnVector and related VectorExpression 
 classes for fixed-point decimal in certain data ranges. That should be at 
 least considered to get faster performance and save code. For unlimited 
 precision DECIMAL, a new column vector subtype may be needed, or a 
 BytesColumnVector could be re-used.
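
 As a concrete note on the data-range idea (a sketch of the arithmetic, not a 
 claim about the eventual patch): a signed 64-bit long holds any unscaled 
 decimal of up to 18 digits, which is the range where LongColumnVector reuse 
 would be safe.

{code}
// Sketch: a DECIMAL(p, s) value can ride in a long as value * 10^s whenever
// p <= 18, since 10^18 - 1 < 2^63 - 1, while some 19-digit values overflow.
// E.g. DECIMAL(10,2) 12345678.91 would be stored as the unscaled 1234567891L.
public class DecimalRangeCheck {
  static boolean fitsInLong(int precision) {
    return precision <= 18;
  }
}
{code}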



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5911) Recent change to schema upgrade scripts breaks file naming conventions

2013-12-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851116#comment-13851116
 ] 

Sergey Shelukhin commented on HIVE-5911:


ping? :)

 Recent change to schema upgrade scripts breaks file naming conventions
 --

 Key: HIVE-5911
 URL: https://issues.apache.org/jira/browse/HIVE-5911
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Carl Steinbach
Assignee: Sergey Shelukhin
 Attachments: HIVE-5911.01.patch, HIVE-5911.patch


 The changes made in HIVE-5700 break the convention for naming schema upgrade 
 scripts.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


Re: adding ANSI flag for hive

2013-12-17 Thread Sergey Shelukhin
Agree on both points. For now, what I had in mind was double vs decimal,
and other such backward-compat-vs-SQL-compat (and potentially
perf-vs-SQL-compat) cases.


I think one flag would not be so bad...


On Mon, Dec 16, 2013 at 8:29 AM, Alan Gates ga...@hortonworks.com wrote:

 A couple of thoughts on this:

 1) If we did this I think we should have one flag, not many.  As Thejas
 points out, your test matrix goes insane when you have too many flags and
 hence things don't get properly tested.

 2) We could do this in an incremental way, where we create this new ANSI
 flag and are clear with users that for a while this will be evolving.  That
 is, as we find new issues with data types, semantics, whatever, we will
 continue to change the behavior of this flag.  At some point in the future
 (as Thejas suggests, at a 1.0 release) we could make this the default
 behavior.  This avoids having to do a full sweep now and find everything
 that we want to change and make ANSI compliant and living with whatever we
 miss.

 Alan.

 On Dec 11, 2013, at 5:14 PM, Thejas Nair wrote:

  Having too many configs complicates things for the user, and also
  complicates the code, and you also end up having many untested
  combinations of config flags.
  I think we should identify a bunch of incompatible changes that we
  think are important, fix them in a branch, and make a major version
  release (say 1.x).
 
  This is also related to HIVE-5875, where there is a discussion on
  switching the defaults for some of the configs to more desirable
  values, but non backward compatible values.
 
  On Wed, Dec 11, 2013 at 4:33 PM, Sergey Shelukhin
  ser...@hortonworks.com wrote:
  Hi.
 
   There's recently been some discussion about data type changes in Hive
   (double to decimal), and result changes for special cases like division
   by zero, etc., to bring it in compliance with MySQL (that's what the
   JIRAs use as an example; I am assuming ANSI SQL is meant).
   The latter are non-controversial (I guess), but for the former,
   performance may suffer and/or backward compat may be broken if Hive is
   brought into compliance.
   If fuller ANSI compat is sought in the future, there may be some even
   hairier issues such as double-quoted identifiers.
  
   In light of that, and also following MySQL, I wonder if we should add a
   flag, or set of flags, to Hive to be able to force ANSI compliance.
   When this flag (or these flags) is not set, for example, int/int division
   could return double for backward compat/perf, and vectorization could
   skip the special-case handling for division by zero, etc.
   Wdyt?
 


[jira] [Commented] (HIVE-5761) Implement vectorized support for the DATE data type

2013-12-17 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851133#comment-13851133
 ] 

Eric Hanson commented on HIVE-5761:
---

Hi Teddy,

Are you going to work on this anytime soon? Please let me know one way or the 
other.

Thanks!
Eric

 Implement vectorized support for the DATE data type
 ---

 Key: HIVE-5761
 URL: https://issues.apache.org/jira/browse/HIVE-5761
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi

 Add support to allow queries referencing DATE columns and expression results 
 to run efficiently in vectorized mode. This should re-use the code for the 
 integer/timestamp types to the extent possible and beneficial. Include 
 unit tests and end-to-end tests. Consider re-using or extending existing 
 end-to-end tests for vectorized integer and/or timestamp operations.
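
 For context on the re-use point (an illustration of the representation only, 
 not the patch): a DATE is effectively a day count from the epoch, which is 
 why integer-style vectorized code applies.

{code}
import java.time.LocalDate;

// Illustration: DATE round-trips through a single long (days since
// 1970-01-01), so it can ride in the same long-based vectors as integers.
public class DateAsEpochDays {
  public static void main(String[] args) {
    long days = LocalDate.of(2013, 12, 17).toEpochDay();
    System.out.println(days);                        // 16056
    System.out.println(LocalDate.ofEpochDay(days));  // 2013-12-17
  }
}
{code}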



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-6044) webhcat should be able to return detailed serde information when show table using format=extended

2013-12-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851137#comment-13851137
 ] 

Hive QA commented on HIVE-6044:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12619185/HIVE-6044.1.patch

{color:green}SUCCESS:{color} +1 4789 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/675/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/675/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12619185

 webhcat should be able to return detailed serde information when show table 
 using format=extended
 ---

 Key: HIVE-6044
 URL: https://issues.apache.org/jira/browse/HIVE-6044
 Project: Hive
  Issue Type: Bug
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
 Attachments: HIVE-6044.1.patch


 Now in webhcat, when using GET ddl/database/:db/table/:table and 
 format=extended, the return value is based on the query show table extended 
 like. However, this query doesn't contain serde info like line.delim and 
 field.delim. In this case, the user won't have enough information to 
 reconstruct the exact same table based on the information from the json file. 
 The descExtendedTable function in HcatDelegator should also return the extra 
 fields from the query desc extended tablename, which contains the fields sd, 
 retention, parameters, parametersSize and tableType.
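
 A sketch of the shape such a fix could take (the helper below is 
 hypothetical, not HcatDelegator's actual code): overlay the extra desc 
 extended fields onto the show table extended properties before serializing 
 the response.

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: merge the extra fields from "desc extended <table>"
// into the map built from "show table extended like <table>".
public class ExtendedTableInfoMerge {
  static final String[] EXTRA_FIELDS =
      {"sd", "retention", "parameters", "parametersSize", "tableType"};

  static Map<String, Object> merge(Map<String, Object> showExtended,
                                   Map<String, Object> descExtended) {
    Map<String, Object> merged = new HashMap<String, Object>(showExtended);
    for (String key : EXTRA_FIELDS) {
      if (descExtended.containsKey(key)) {
        merged.put(key, descExtended.get(key));
      }
    }
    return merged;
  }
}
{code}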



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


Re: Review Request 16330: HIVE-6045- Beeline hivevars is broken for more than one hivevar

2013-12-17 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16330/
---

(Updated Dec. 18, 2013, 12:39 a.m.)


Review request for hive.


Changes
---

Today the hive-var handling is broken because of the following:

1. Beeline constructs the jdbc url fragment for hive-vars with the '&' delimiter.
2. JDBC uses the ';' delimiter to parse this hive-var fragment.

The original patch had changed the JDBC parsing regex (part 2) to expect '&'. 
But some test cases were manually constructing JDBC urls with ';' as the 
delimiter and checking that the parsing works on those. One option is to 
change the tests, which should be fine for JDBC URL backward compatibility, 
as this is a new feature for 0.13.

But I still decided to minimize the impact and chose an alternate fix (part 
1), so that beeline constructs the URL fragment with the ';' delimiter.
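
For illustration, this is the kind of ';'-delimited parsing the fix settles 
on (a simplified sketch, not the actual Utils.java regex):

{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: split a hive-var url fragment on ';' into name/value pairs.
public class HiveVarFragment {
  static Map<String, String> parse(String fragment) {
    Map<String, String> vars = new LinkedHashMap<String, String>();
    for (String pair : fragment.split(";")) {
      int eq = pair.indexOf('=');
      if (eq > 0) {
        vars.put(pair.substring(0, eq), pair.substring(eq + 1));
      }
    }
    return vars;
  }

  public static void main(String[] args) {
    // Two variables parse cleanly with ';'. Had beeline joined them with
    // '&', the split would see one token and the first value would absorb
    // everything after it.
    System.out.println(parse("file1=/user/szehon/file1;file2=/user/szehon/file2"));
  }
}
{code}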


Bugs: HIVE-6045
https://issues.apache.org/jira/browse/HIVE-6045


Repository: hive-git


Description
---

The implementation appends hivevars to the jdbc url in the form 
var1=val1&var2=val2&var3=val3

but the regex used to parse this is expecting the delimiter to be ';'. Changed 
the regex to fit the hivevar format.


Diffs (updated)
-

  jdbc/src/java/org/apache/hive/jdbc/Utils.java 913dc46 

Diff: https://reviews.apache.org/r/16330/diff/


Testing
---

Looks like TestBeelineWithArgs is no longer being run, and there are a lot of 
failures there due to other changes even without this change.  Probably we need 
to move that test, and see if we can add a unit test there for this case.


Thanks,

Szehon Ho



Re: Review Request 16330: HIVE-6045- Beeline hivevars is broken for more than one hivevar

2013-12-17 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16330/
---

(Updated Dec. 18, 2013, 12:41 a.m.)


Review request for hive.


Changes
---

Today the hive-var handling is broken because of the following:

1. Beeline constructs the jdbc url fragment for hive-vars with the '&' delimiter.
2. JDBC uses the ';' delimiter to parse this hive-var fragment.

The original patch had changed the JDBC parsing regex (part 2) to expect '&'. 
But some test cases were manually constructing JDBC urls with ';' as the 
delimiter and checking that the parsing works on those. One option is to 
change the tests, which should be fine for JDBC URL backward compatibility, 
as this is a new feature for 0.13.

But I still decided to minimize the impact and chose an alternate fix (part 
1), so that beeline constructs the URL fragment with the ';' delimiter.


Bugs: HIVE-6045
https://issues.apache.org/jira/browse/HIVE-6045


Repository: hive-git


Description
---

The implementation appends hivevars to the jdbc url in the form 
var1=val1&var2=val2&var3=val3

but the regex used to parse this is expecting the delimiter to be ';'. Changed 
the regex to fit the hivevar format.


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/DatabaseConnection.java 1de5829 

Diff: https://reviews.apache.org/r/16330/diff/


Testing
---

Looks like TestBeelineWithArgs is no longer being run, and there are a lot of 
failures there due to other changes even without this change.  Probably we need 
to move that test, and see if we can add a unit test there for this case.


Thanks,

Szehon Ho



[jira] [Updated] (HIVE-6045) Beeline hivevars is broken for more than one hivevar

2013-12-17 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6045:


Attachment: HIVE-6045.1.patch

Today the hive-var handling is broken because of the following:

1. Beeline constructs the jdbc url fragment for hive-vars with the '&' delimiter.
2. JDBC uses the ';' delimiter to parse this hive-var fragment.

The original patch had changed the JDBC parsing regex (part 2) to expect '&'. 
But some test cases were manually constructing JDBC urls with ';' as the 
delimiter and checking that the parsing works on those. One option is to 
change the tests, which should be fine for JDBC URL backward compatibility, 
as this is a new feature for 0.13.

But I still decided to minimize the impact and chose an alternate fix (part 
1), so that beeline constructs the URL fragment with the ';' delimiter.

 Beeline hivevars is broken for more than one hivevar
 

 Key: HIVE-6045
 URL: https://issues.apache.org/jira/browse/HIVE-6045
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6045.1.patch, HIVE-6045.patch


 HIVE-4568 introduced --hivevar flag.  But if you specify more than one 
 hivevar, for example 
 {code}
 beeline --hivevar file1=/user/szehon/file1 --hivevar file2=/user/szehon/file2
 {code}
 then the variables during runtime get mangled to evaluate to:
 {code}
 file1=/user/szehon/file1&file2=/user/szehon/file2
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


Re: Review Request 16299: HIVE-6013: Supporting Quoted Identifiers in Column Names

2013-12-17 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16299/#review30570
---



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/16299/#comment58532

class PatternValidator was recently introduced in HiveConf; it doesn't let 
the user specify an invalid value for a config key. Using it here would be 
useful.
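
Roughly, the idea is the following (a sketch of the concept; the actual 
HiveConf validator may differ in name and signature):

{code}
import java.util.regex.Pattern;

// Concept sketch: a config value is accepted only if it matches one of the
// allowed patterns; anything else is reported as invalid up front.
public class PatternValidatorSketch {
  private final Pattern[] allowed;

  public PatternValidatorSketch(String... regexes) {
    allowed = new Pattern[regexes.length];
    for (int i = 0; i < regexes.length; i++) {
      allowed[i] = Pattern.compile(regexes[i]);
    }
  }

  /** Returns an error message, or null if the value is acceptable. */
  public String validate(String value) {
    for (Pattern p : allowed) {
      if (value != null && p.matcher(value).matches()) {
        return null;
      }
    }
    return "Invalid value '" + value + "'";
  }
}
// e.g. new PatternValidatorSketch("none", "column").validate("colunm")
// would reject a typo for hive.support.quoted.identifiers up front.
{code}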



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
https://reviews.apache.org/r/16299/#comment58545

Shall we remove this if() altogether, and with it the newly introduced 
method above?



ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveUtils.java
https://reviews.apache.org/r/16299/#comment58546

conf should never be null here; if it is null, then it's a bug. Also, 
returning null in those cases seems incorrect. Let's remove this null conf 
check.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
https://reviews.apache.org/r/16299/#comment58584

Since this method always returns true, there is no need for this if block.



ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
https://reviews.apache.org/r/16299/#comment58585

It can never be the case that hiveconf == null; that would be a bug. Let's 
remove this null check.



ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
https://reviews.apache.org/r/16299/#comment58586

It would be good to document all the places Identifier is used. This can be 
lifted straight from your html document.




ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
https://reviews.apache.org/r/16299/#comment58587

Good to add a note here saying QuotedIdentifier is only optionally available 
for columns as of now.



ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g
https://reviews.apache.org/r/16299/#comment58588

Not related to this patch, but if you feel like it, it would be good to add a 
comment about where CharSetNames are used. Not necessary though.


- Ashutosh Chauhan


On Dec. 16, 2013, 10:22 p.m., Harish Butani wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/16299/
 ---
 
 (Updated Dec. 16, 2013, 10:22 p.m.)
 
 
 Review request for hive, Ashutosh Chauhan and Alan Gates.
 
 
 Bugs: HIVE-6013
 https://issues.apache.org/jira/browse/HIVE-6013
 
 
 Repository: hive-git
 
 
 Description
 ---
 
  Hive's current behavior on Quoted Identifiers is different from the normal 
  interpretation. A Quoted Identifier (using backticks) has a special 
  interpretation for Select expressions (as Regular Expressions). Have 
  documented the current behavior and proposed a solution in the attached doc.
  Summary of the solution:
  Introduce 'standard' quoted identifiers for columns only.
  At the language level this is turned on by a flag.
  At the metadata level we relax the constraint on column names.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java fa3e048 
   itests/qtest/pom.xml 8c249a0 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
 3deed45 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveUtils.java eb26e7f 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 321759b 
   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
 17e6aad 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g ed9917d 
   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java 1e6826f 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java d18ea03 
   ql/src/java/org/apache/hadoop/hive/ql/parse/UnparseTranslator.java 8fe2262 
   ql/src/test/queries/clientnegative/invalid_columns.q f8be8c8 
   ql/src/test/queries/clientpositive/quotedid_alter.q PRE-CREATION 
   ql/src/test/queries/clientpositive/quotedid_basic.q PRE-CREATION 
   ql/src/test/queries/clientpositive/quotedid_partition.q PRE-CREATION 
   ql/src/test/queries/clientpositive/quotedid_skew.q PRE-CREATION 
   ql/src/test/queries/clientpositive/quotedid_smb.q PRE-CREATION 
   ql/src/test/queries/clientpositive/quotedid_tblproperty.q PRE-CREATION 
   ql/src/test/results/clientnegative/invalid_columns.q.out 3311b0a 
   ql/src/test/results/clientpositive/quotedid_alter.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/quotedid_basic.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/quotedid_partition.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/quotedid_skew.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/quotedid_smb.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/quotedid_tblproperty.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/16299/diff/
 
 
 Testing
 ---
 
 added new tests for create, alter, delete, query with columns containing 
 special characters.
 Tests start with 

[jira] [Commented] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2013-12-17 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851198#comment-13851198
 ] 

Ashutosh Chauhan commented on HIVE-6013:


Approach looks ok to me. Some implementation level comments on RB.

One test scenario. If this is already covered in your tests, feel free to 
ignore; otherwise, can you add the following test:
set hive.support.quoted.identifiers=column;
create table t1 (aa int, ab string);
select a.* from t1; -- this should select both columns.
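
(For reference, the both-columns expectation presumably rests on the regex 
interpretation of backticked select expressions described above; a tiny 
standalone illustration of the matching, not Hive code:)

{code}
import java.util.regex.Pattern;

// Standalone illustration: treated as a regex, "a.*" matches both column
// names aa and ab, so the select above should return two columns.
public class RegexColumnMatch {
  public static void main(String[] args) {
    Pattern p = Pattern.compile("a.*");
    for (String col : new String[] {"aa", "ab"}) {
      System.out.println(col + " matches: " + p.matcher(col).matches());
    }
  }
}
{code}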

Also, you mentioned in the html doc that some of the jdbc api methods need to 
change, but I don't see any changes in the jdbc package.

 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 QuotedIdentifier.html


 Hive's current behavior on Quoted Identifiers is different from the normal 
 interpretation. A Quoted Identifier (using backticks) has a special 
 interpretation for Select expressions (as Regular Expressions). Have 
 documented the current behavior and proposed a solution in the attached doc.
 Summary of the solution:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (HIVE-6052) metastore JDO filter pushdown for integers may produce unexpected results with non-normalized integer columns

2013-12-17 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-6052:
--

 Summary: metastore JDO filter pushdown for integers may produce 
unexpected results with non-normalized integer columns
 Key: HIVE-6052
 URL: https://issues.apache.org/jira/browse/HIVE-6052
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


If integer partition columns have values stored in non-canonical form, for 
example with leading zeroes, the integer filter doesn't work. That is because 
JDO pushdown uses substrings to compare for equality, and SQL pushdown is 
intentionally crippled to do the same so that it produces the same results.
Probably, since both SQL pushdown and integer pushdown are just perf 
optimizations, we can remove this for JDO (or make it configurable and 
disabled by default), and uncripple SQL.
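
A tiny standalone example of the failure mode (not metastore code): string 
equality on the stored partition value misses matches that integer comparison 
would find.

{code}
// Illustration: "07" and "7" are equal as integers but not as strings, so
// substring/string-based pushdown silently drops the partition.
public class LeadingZeroExample {
  public static void main(String[] args) {
    String stored = "07";  // non-canonical value stored for the partition
    String filter = "7";   // canonical form of the integer literal in the filter
    System.out.println(stored.equals(filter));                                // false
    System.out.println(Integer.parseInt(stored) == Integer.parseInt(filter)); // true
  }
}
{code}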



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HIVE-5230) Better error reporting by async threads in HiveServer2

2013-12-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851237#comment-13851237
 ] 

Thejas M Nair commented on HIVE-5230:
-

Rebased patch looks good. I will commit it shortly (it has already been +1'd).


 Better error reporting by async threads in HiveServer2
 --

 Key: HIVE-5230
 URL: https://issues.apache.org/jira/browse/HIVE-5230
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.12.0, 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5230.1.patch, HIVE-5230.1.patch, 
 HIVE-5230.10.patch, HIVE-5230.2.patch, HIVE-5230.3.patch, HIVE-5230.4.patch, 
 HIVE-5230.6.patch, HIVE-5230.7.patch, HIVE-5230.8.patch, HIVE-5230.9.patch


 [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
 for async execution in HS2. When a background thread gets an error, currently 
 the client can only poll for the operation state, while the error and its 
 stacktrace are merely logged. However, it would be useful to provide a richer 
 error response, as the thrift API does with TStatus (which is constructed 
 while building a Thrift response object).
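
 A minimal sketch of the idea (illustrative names, not the HS2 classes): keep 
 the background thread's failure with the operation so a status call can hand 
 back the message instead of state alone.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: the worker records its exception; a later poll can then return a
// TStatus-like payload (state plus error message) rather than only a state.
public class AsyncOperationSketch {
  private final ExecutorService pool = Executors.newSingleThreadExecutor();
  private volatile Throwable failure;

  public void submit(final Runnable work) {
    pool.execute(new Runnable() {
      public void run() {
        try {
          work.run();
        } catch (Throwable t) {
          failure = t;  // preserved for the client, not just logged
        }
      }
    });
  }

  /** What a richer error response could carry back to the client. */
  public String errorForClient() {
    return failure == null ? null
        : failure.getClass().getName() + ": " + failure.getMessage();
  }
}
{code}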



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


  1   2   >