[jira] [Commented] (HIVE-7052) Optimize split calculation time

2014-05-15 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995994#comment-13995994
 ] 

Rajesh Balamohan commented on HIVE-7052:


https://reviews.apache.org/r/21357/diff/#index_header

 Optimize split calculation time
 ---

 Key: HIVE-7052
 URL: https://issues.apache.org/jira/browse/HIVE-7052
 Project: Hive
  Issue Type: Bug
 Environment: hive + tez
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
  Labels: performance
 Attachments: HIVE-7052-profiler-1.png, HIVE-7052-profiler-2.png


 When running a TPC-DS query (query_27), a significant amount of time was spent 
 in split computation on a dataset of size 200 GB (ORC format).
 Profiling revealed that:
 1. A lot of time was spent in Configuration's substituteVars (regex) calls in 
 the HiveInputFormat.getSplits() method.
 2. A FileSystem was created repeatedly in OrcInputFormat.generateSplitsInfo().
 I will attach the profiler snapshots soon.
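 A minimal sketch of the second fix, assuming generateSplitsInfo() can resolve 
 the FileSystem once per directory rather than once per split (method and 
 variable names here are hypothetical):
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileStatus;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;

 // Sketch: create the FileSystem handle once, outside the per-split work,
 // so repeated FileSystem construction drops out of the hot path.
 static void listOnce(Path dir, Configuration conf) throws java.io.IOException {
   FileSystem fs = dir.getFileSystem(conf);  // resolved once, reused below
   for (FileStatus file : fs.listStatus(dir)) {
     // ... compute a split from 'file' using the cached 'fs' ...
   }
 }
 {code}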



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6986) MatchPath fails with small resultExprString

2014-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6986:
---

Status: Patch Available  (was: Open)

OK. +1

 MatchPath fails with small resultExprString
 ---

 Key: HIVE-6986
 URL: https://issues.apache.org/jira/browse/HIVE-6986
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Furcy Pin
Priority: Trivial
 Attachments: HIVE-6986.1.patch


 When using MatchPath, a query like this:
 select year
 from matchpath(on 
 flights_tiny 
 sort by fl_num, year, month, day_of_month  
   arg1('LATE.LATE+'), 
   arg2('LATE'), arg3(arr_delay > 15), 
 arg4('year') 
 )
 ;
 will fail with the error message 
 FAILED: StringIndexOutOfBoundsException String index out of range: 6
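 A hedged guess at the failure mode (the actual patch may differ): a fixed-width 
 substring probe over resultExprString, e.g. substring(0, 6), underflows when 
 the expression ("year", length 4) is shorter than the probe. A defensive check 
 would look like:
 {code}
 // Hypothetical guard: only probe for the "select" prefix when the
 // expression is long enough; substring(0, 6) on a 4-character string
 // throws StringIndexOutOfBoundsException("String index out of range: 6").
 String prefix = "select";
 boolean hasSelect = resultExprString.length() >= prefix.length()
     && resultExprString.regionMatches(true, 0, prefix, 0, prefix.length());
 {code}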



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7060) Column stats give incorrect min and distinct_count

2014-05-15 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-7060:
--

Description: 
It seems that the result from column statistics isn't correct on two measures 
for numeric columns: min (which is always 0) and distinct count. Here is an 
example:

{code}
select count(distinct avgTimeOnSite), min(avgTimeOnSite) from 
UserVisits_web_text_none;
...
OK
9   1
Time taken: 9.747 seconds, Fetched: 1 row(s)
{code}

The statistics for the column:
{code}
PREHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite
PREHOOK: type: DESCTABLE
PREHOOK: Input: default@uservisits_web_text_none
POSTHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite
POSTHOOK: type: DESCTABLE
POSTHOOK: Input: default@uservisits_web_text_none
# col_name      data_type   min   max   num_nulls   distinct_count   avg_col_len   max_col_len   num_trues   num_falses   comment

avgTimeOnSite   int         0     9     0           11               null          null          null        null
{code}
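
For reference, a minimal sketch of reproducing the discrepancy over JDBC (the 
URL assumes a local HiveServer2; the ANALYZE statement is the standard HiveQL 
for gathering column statistics):
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ColumnStatsRepro {
  public static void main(String[] args) throws Exception {
    Connection conn =
        DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
    try (Statement st = conn.createStatement()) {
      // Recompute column statistics for the column in question.
      st.execute("analyze table UserVisits_web_text_none "
          + "compute statistics for columns avgTimeOnSite");
      // Ground truth from the data: the query above returned min=1, ndv=9,
      // while the stored stats report min=0 and distinct_count=11.
      ResultSet rs = st.executeQuery("select min(avgTimeOnSite), "
          + "count(distinct avgTimeOnSite) from UserVisits_web_text_none");
      while (rs.next()) {
        System.out.println("min=" + rs.getLong(1) + " ndv=" + rs.getLong(2));
      }
    }
  }
}
{code}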


 Column stats give incorrect min and distinct_count
 --

 Key: HIVE-7060
 URL: https://issues.apache.org/jira/browse/HIVE-7060
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.13.0
Reporter: Xuefu Zhang

 It seems that the result from column statistics isn't correct on two measures 
 for numeric columns: min (which is always 0) and distinct count. Here is an 
 example:
 {code}
 select count(distinct avgTimeOnSite), min(avgTimeOnSite) from 
 UserVisits_web_text_none;
 ...
 OK
 9   1
 Time taken: 9.747 seconds, Fetched: 1 row(s)
 {code}
 The statistics for the column:
 {code}
 PREHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite
 PREHOOK: type: DESCTABLE
 PREHOOK: Input: default@uservisits_web_text_none
 POSTHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite
 POSTHOOK: type: DESCTABLE
 POSTHOOK: Input: default@uservisits_web_text_none
 # col_name      data_type   min   max   num_nulls   distinct_count   avg_col_len   max_col_len   num_trues   num_falses   comment
 avgTimeOnSite   int         0     9     0           11               null          null          null        null
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7050) Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE

2014-05-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996997#comment-13996997
 ] 

Xuefu Zhang commented on HIVE-7050:
---

Thanks for the patch. Minor comments/questions on RB.

One clarification: are column stats shown when either EXTENDED or FORMATTED is 
specified? And only when a column is specified? I think this is important for 
documentation purposes. It would be good if the functional details could be put 
in the description area.

 Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE
 -

 Key: HIVE-7050
 URL: https://issues.apache.org/jira/browse/HIVE-7050
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-7050.1.patch


 There is currently no way to display the column level stats from hive CLI. It 
 will be good to show them in DESCRIBE EXTENDED/FORMATTED TABLE



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead

2014-05-15 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18936/
---

(Updated May 14, 2014, 8:22 p.m.)


Review request for hive, Gopal V and Gunther Hagleitner.


Repository: hive-git


Description
---

See JIRA


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 604bea7 
  conf/hive-default.xml.template 2552560 
  
hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java
 2dbe334 
  hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory.java 
accc312 
  hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseKeyFactory2.java 
1bd2352 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java fce77a8 
  ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java f5d4670 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b93ea7a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 175d3ab 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
 8854b19 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
9df425b 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
64f0be2 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java 
008a8db 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
 988959f 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 55b7415 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java e392592 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
eef7656 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java 
d4be78d 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
674ed48 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 
f7b499b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 157d072 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java
 PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
 65e3779 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
 093da55 
  ql/src/test/queries/clientpositive/mapjoin_decimal.q b65a7be 
  ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6 
  ql/src/test/queries/clientpositive/tez_union.q f80d94c 
  ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out cb11b8b 
  ql/src/test/results/clientpositive/tez/mapjoin_decimal.q.out 1c16024 
  ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 614a4a6 
  serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29 
  serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION 
  
serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java
 9079b9d 
  
serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/OutputByteBuffer.java
 1b09d41 
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 
5870884 
  
serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java
 bab505e 
  serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 
6f344bb 
  serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd 
  serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java 
a99c7b4 
  serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java 
435d6c6 
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 
82c1263 
  serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java 
b188c3f 
  serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java 
98a35c7 
  serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryUtils.java 
6c14081 
  
serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/objectinspector/LazyBinaryStructObjectInspector.java
 e5ea452 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java
 06d5c5e 
  serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java 
868dd4c 
  
serde/src/test/org/apache/hadoop/hive/serde2/thrift_test/CreateSequenceFile.java
 1fb49e5 

Diff: https://reviews.apache.org/r/18936/diff/


Testing

[jira] [Assigned] (HIVE-7048) CompositeKeyHBaseFactory should not use FamilyFilter

2014-05-15 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-7048:
-

Assignee: Swarnim Kulkarni

 CompositeKeyHBaseFactory should not use FamilyFilter
 

 Key: HIVE-7048
 URL: https://issues.apache.org/jira/browse/HIVE-7048
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
Priority: Blocker

 HIVE-6411 introduced a more generic way to provide composite key 
 implementations via custom factory implementations. However, it seems that the 
 CompositeHBaseKeyFactory implementation uses a FamilyFilter for row-key scans, 
 which doesn't seem appropriate. This should be investigated further and, if 
 possible, replaced with a RowRangeScanFilter.
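 For context, a hedged sketch of the distinction: a FamilyFilter matches on 
 column family, not on row key, whereas a row-key range is normally expressed 
 with scan boundaries (the key bytes below are placeholders):
 {code}
 import org.apache.hadoop.hbase.client.Scan;

 // Scope the scan by row-key boundaries; no filter is needed for this,
 // and no rows outside the range are even read.
 static Scan rowRangeScan(byte[] startRow, byte[] stopRow) {
   Scan scan = new Scan();
   scan.setStartRow(startRow);
   scan.setStopRow(stopRow);
   return scan;
 }
 {code}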



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7036) get_json_object bug when extract list of list with index

2014-05-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7036:


Attachment: HIVE-7036.1.patch.txt

 get_json_object bug when extract list of list with index
 

 Key: HIVE-7036
 URL: https://issues.apache.org/jira/browse/HIVE-7036
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.12.0
 Environment: all
Reporter: Ming Ma
Priority: Minor
  Labels: udf
 Attachments: HIVE-7036.1.patch.txt


 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFJson.java#L250
 This line should be outside of the for-loop.
 For example, with 
 json = '{"h":[1, [2, 3], {"i": 0}, [{"p": 11}, {"p": 12}, {"pp": 13}]]}'
 get_json_object(json, '$.h[*][0]') should return the first node (if it 
 exists) of every child of '$.h',
 which specifically should be 
 [2,{"p":11}] 
 but Hive returns only 
 2
 because when Hive picks the node '2' out, tmp_jsonList changes to a 
 list containing only the node '2':
 [2]
 It is then assigned to the variable jsonList; in the next iteration, the value 
 of i would be 2, which is greater than the size (always 1) of jsonList, so 
 the loop breaks out.
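 A hedged reconstruction of the loop in question (variable names follow the 
 description above; the extraction step is schematic):
 {code}
 List<Object> tmp_jsonList = new ArrayList<Object>();
 for (int i = 0; i < jsonList.size(); i++) {
   Object first = firstElementOf(jsonList.get(i));  // schematic '[0]' step
   if (first != null) {
     tmp_jsonList.add(first);
   }
   // BUG: assigning here shrinks jsonList to one element after the first
   // hit, so the loop exits before the remaining children are visited:
   // jsonList = tmp_jsonList;
 }
 jsonList = tmp_jsonList;  // FIX: move the assignment out of the for-loop
 {code}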



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead

2014-05-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6430:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

committed to trunk

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.14.0

 Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
 HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, 
 HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, 
 HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, 
 HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, 
 HIVE-6430.14.patch, HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for the key and 2 
 for the row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need to have a Java hash table there.  We can either use a primitive-friendly 
 hashtable like the one from HPPC (Apache-licensed), or some variation, to map 
 primitive keys to a single row-storage structure without an object per row 
 (similar to vectorization).
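 A hedged sketch of the general technique (not the committed design): keep the 
 serialized keys and rows in one shared byte array and let the hash table hold 
 only packed long references, so no Java object exists per entry:
 {code}
 class FlatHashSketch {
   // refs[slot] packs (offset, length) of a serialized entry into 64 bits;
   // all key/row bytes live in writeBuffer, shared by every entry.
   private final long[] refs = new long[1 << 16];
   private final byte[] writeBuffer = new byte[1 << 20];

   private static long makeRef(int offset, int length) {
     return ((long) offset << 32) | (length & 0xFFFFFFFFL);
   }
 }
 {code}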



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7041) DoubleWritable/ByteWritable should extend their hadoop counterparts

2014-05-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996583#comment-13996583
 ] 

Hive QA commented on HIVE-7041:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12644205/HIVE-7041.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/190/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/190/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12644205

 DoubleWritable/ByteWritable should extend their hadoop counterparts
 ---

 Key: HIVE-7041
 URL: https://issues.apache.org/jira/browse/HIVE-7041
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-7041.1.patch


 Hive has its own implementations of 
 ByteWritable/DoubleWritable/ShortWritable.  We cannot replace usage of these 
 classes, since that would break 3rd-party UDFs/SerDes; however, we can at least 
 extend the Hadoop versions of these classes where possible to avoid 
 duplicate code.
 When Hive finally moves to version 1.0 we might want to consider removing these 
 Hive-specific writables and switching over to the Hadoop versions of these 
 classes.
 ShortWritable didn't exist in Hadoop until 2.x, so it looks like we can't do 
 this for that class until 0.20/1.x support is dropped from Hive.
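 A minimal sketch of the approach for one of the classes (the actual patch may 
 adjust constructors or serde-specific methods):
 {code}
 package org.apache.hadoop.hive.serde2.io;

 // Keep the Hive class and its package, so 3rd-party UDFs/SerDes still
 // compile, but inherit the implementation from the Hadoop counterpart.
 public class DoubleWritable extends org.apache.hadoop.io.DoubleWritable {
   public DoubleWritable() { super(); }
   public DoubleWritable(double value) { super(value); }
 }
 {code}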



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead

2014-05-15 Thread Sergey Shelukhin


 On May 8, 2014, 10:05 p.m., Gunther Hagleitner wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java, line 405
  https://reviews.apache.org/r/18936/diff/13/?file=572109#file572109line405
 
  why do you need this? this seems to do the same thing as tag == -1?

it's more explicit and stays that way if someone resets tag later


 On May 8, 2014, 10:05 p.m., Gunther Hagleitner wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java, line 470
  https://reviews.apache.org/r/18936/diff/13/?file=572109#file572109line470
 
  this shouldn't exist on the operator, but on the ReduceSinkDesc

when we set it, we are operating on an already-created operator


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18936/#review42539
---


On May 9, 2014, 8:16 p.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18936/
 ---
 
 (Updated May 9, 2014, 8:16 p.m.)
 
 
 Review request for hive, Gopal V and Gunther Hagleitner.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 See JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 604bea7 
   conf/hive-default.xml.template 2552560 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 5fe35a5 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java
  142bfd8 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bf9d4c1 
   ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 
 f5d4670 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b93ea7a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 175d3ab 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
  8854b19 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
 9df425b 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
 64f0be2 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java
  008a8db 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
  988959f 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
  55b7415 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java e392592 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
 eef7656 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java
  d4be78d 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
 674ed48 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 
 f7b499b 
   ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 157d072 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java
  PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
  65e3779 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
  093da55 
   ql/src/test/queries/clientpositive/mapjoin_decimal.q b65a7be 
   ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6 
   ql/src/test/queries/clientpositive/tez_union.q f80d94c 
   ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out 8350670 
   ql/src/test/results/clientpositive/tez/mapjoin_decimal.q.out 3c55b5c 
   ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 284cc03 
   serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29 
   serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION 
   
 serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java
  9079b9d 
   
 serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/OutputByteBuffer.java
  1b09d41 
   serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 
 5870884 
   
 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java
  bab505e 
   serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 
 6f344bb 
   serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd 
   

[jira] [Commented] (HIVE-4803) LazyTimestamp should accept numeric values

2014-05-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992701#comment-13992701
 ] 

Hive QA commented on HIVE-4803:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12643698/HIVE-4803.2.patch.txt

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5496 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_timestamp_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_timestamp_numerics
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dml
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/144/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/144/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12643698

 LazyTimestamp should accept numeric values
 --

 Key: HIVE-4803
 URL: https://issues.apache.org/jira/browse/HIVE-4803
 Project: Hive
  Issue Type: Improvement
  Components: Types
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-4803.2.patch.txt, HIVE-4803.D11565.1.patch


 LazyTimestamp accepts 'yyyy-mm-dd hh:mm:ss'-formatted strings and 'NULL'. It 
 would be good to also accept the numeric form (which is milliseconds).
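 A hedged sketch of the accepting logic (standard java.sql.Timestamp API; the 
 surrounding LazyTimestamp plumbing is omitted):
 {code}
 import java.sql.Timestamp;

 static Timestamp parseFlexible(String s) {
   try {
     return Timestamp.valueOf(s);             // "yyyy-mm-dd hh:mm:ss" form
   } catch (IllegalArgumentException e) {
     return new Timestamp(Long.parseLong(s)); // numeric form: milliseconds
   }
 }
 {code}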



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6768) remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties

2014-05-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993006#comment-13993006
 ] 

Thejas M Nair commented on HIVE-6768:
-

[~ekoifman] Can you respond to Ashutosh's comment? 

Looks like most of the changes in HIVE-5511 were not specific to that issue, but 
were general cleanup.  And the attached patch reverts only the changes that were 
specific to the log handling.


 remove 
 hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties
 ---

 Key: HIVE-6768
 URL: https://issues.apache.org/jira/browse/HIVE-6768
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6768.patch


 now that MAPREDUCE-5806 is fixed we can remove 
 override-container-log4j.properties and all the logic around it, which 
 was introduced in HIVE-5511 to work around MAPREDUCE-5806
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7045) Wrong results in multi-table insert aggregating without group by clause

2014-05-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-7045:


Priority: Blocker  (was: Major)

 Wrong results in multi-table insert aggregating without group by clause
 ---

 Key: HIVE-7045
 URL: https://issues.apache.org/jira/browse/HIVE-7045
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0, 0.12.0
Reporter: dima machlin
Priority: Blocker

 This happens whenever there is more than one reducer.
 The scenario:
 CREATE TABLE t1 (a int, b int);
 CREATE TABLE t2 (cnt int) PARTITIONED BY (var_name string);
 insert into table t1 select 1,1 from asd limit 1;
 insert into table t1 select 2,2 from asd limit 1;
 t1 contains:
 1 1
 2 2
 from t1
 insert overwrite table t2 partition(var_name='a') select count(a) cnt 
 insert overwrite table t2 partition(var_name='b') select count(b) cnt;
 select * from t2;
 returns: 
 2 a
 2 b
 as expected.
 Setting the number of reducers higher than 1:
 set mapred.reduce.tasks=2;
 from t1
 insert overwrite table t2 partition(var_name='a') select count(a) cnt
 insert overwrite table t2 partition(var_name='b') select count(b) cnt;
 select * from t2;
 1 a
 1 a
 1 b
 1 b
 Wrong results.
 This happens whenever t1 is big enough to automatically get more than one 
 reducer, even without specifying the count directly.
 Adding "group by 1" at the end of each insert solves the problem:
 from t1
 insert overwrite table t2 partition(var_name='a') select count(a) cnt group 
 by 1
 insert overwrite table t2 partition(var_name='b') select count(b) cnt group 
 by 1;
 generates: 
 2 a
 2 b
 This should work without the group by...
 The number of rows in each partition equals the number of reducers: 
 each reducer calculated a sub-total of the count.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6809) Support bulk deleting directories for partition drop with partial spec

2014-05-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992839#comment-13992839
 ] 

Hive QA commented on HIVE-6809:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12643703/HIVE-6809.5.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5429 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/145/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/145/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12643703

 Support bulk deleting directories for partition drop with partial spec
 --

 Key: HIVE-6809
 URL: https://issues.apache.org/jira/browse/HIVE-6809
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6809.1.patch.txt, HIVE-6809.2.patch.txt, 
 HIVE-6809.3.patch.txt, HIVE-6809.4.patch.txt, HIVE-6809.5.patch.txt


 In a busy Hadoop system, dropping many partitions takes much more time than 
 expected. In hive-0.11.0, removing 1700 partitions via a single partial spec 
 took 90 minutes, which was reduced to 3 minutes when deleteData was set to 
 false. I couldn't test this on recent Hive, which has HIVE-6256, but if the 
 time-consuming part is mostly removing the directories, that change alone 
 would not reduce the whole processing time.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-4576) templeton.hive.properties does not allow values with commas

2014-05-15 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4576:
---

Fix Version/s: 0.13.1

 templeton.hive.properties does not allow values with commas
 ---

 Key: HIVE-4576
 URL: https://issues.apache.org/jira/browse/HIVE-4576
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.5.0
Reporter: Vitaliy Fuks
Assignee: Eugene Koifman
 Fix For: 0.14.0, 0.13.1

 Attachments: HIVE-4576.0.13.patch, HIVE-4576.2.patch, HIVE-4576.patch


 templeton.hive.properties accepts a comma-separated list of key=value 
 property pairs that will be passed to Hive.
 However, this makes it impossible to use any value that itself has a comma 
 in it.
 For example:
 {code:xml}
 <property>
   <name>templeton.hive.properties</name>
   <value>hive.metastore.sasl.enabled=false,hive.metastore.uris=thrift://foo1.example.com:9083,foo2.example.com:9083</value>
 </property>
 {code}
 {noformat}
 templeton: starting [/usr/bin/hive, --service, cli, --hiveconf, 
 hive.metastore.sasl.enabled=false, --hiveconf, 
 hive.metastore.uris=thrift://foo1.example.com:9083, --hiveconf, 
 foo2.example.com:9083 etc..
 {noformat}
 because the value is parsed using a standard 
 org.apache.hadoop.conf.Configuration.getStrings() call, which simply splits on 
 commas, here:
 {code:java}
 for (String prop : appConf.getStrings(AppConfig.HIVE_PROPS_NAME))
 {code}
 This is problematic for any Hive property that itself has multiple values, 
 such as hive.metastore.uris above or hive.aux.jars.path.
 There should be some way to escape commas, or a different delimiter should 
 be used.
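 One possible direction, sketched with a hypothetical helper (the eventual fix 
 may pick a different delimiter or escape scheme): split on commas unless the 
 comma is backslash-escaped.
 {code}
 import java.util.ArrayList;
 import java.util.List;

 static List<String> splitEscaped(String value) {
   List<String> parts = new ArrayList<String>();
   StringBuilder cur = new StringBuilder();
   for (int i = 0; i < value.length(); i++) {
     char c = value.charAt(i);
     if (c == '\\' && i + 1 < value.length() && value.charAt(i + 1) == ',') {
       cur.append(',');           // "\," stands for a literal comma
       i++;
     } else if (c == ',') {
       parts.add(cur.toString()); // unescaped comma: property boundary
       cur.setLength(0);
     } else {
       cur.append(c);
     }
   }
   parts.add(cur.toString());
   return parts;
 }
 {code}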



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7042) Fix stats_partscan_1_23.q and orc_createas1.q for hadoop-2

2014-05-15 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7042:
-

Status: Open  (was: Patch Available)

 Fix stats_partscan_1_23.q and orc_createas1.q for hadoop-2
 --

 Key: HIVE-7042
 URL: https://issues.apache.org/jira/browse/HIVE-7042
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-7042.1.patch


 stats_partscan_1_23.q and orc_createas1.q should use HiveInputFormat as 
 opposed to CombineHiveInputFormat. RCFile uses DefaultCodec for compression 
 (which uses DEFLATE), and that is not splittable; hence using CombineHiveIF 
 will yield different results for these tests. ORC should use HiveIF to 
 generate ORC splits.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7062) Support Streaming mode in Windowing

2014-05-15 Thread Harish Butani (JIRA)
Harish Butani created HIVE-7062:
---

 Summary: Support Streaming mode in Windowing
 Key: HIVE-7062
 URL: https://issues.apache.org/jira/browse/HIVE-7062
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani


1. Have the Windowing Table Function support streaming mode.
2. Have special handling for Ranking UDAFs.
3. Have special handling for Sum/Avg for fixed-size windows (see the sketch below).
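
For item 3, a hedged illustration of why fixed-size windows stream well: a sum 
over ROWS BETWEEN k PRECEDING AND CURRENT ROW needs only O(1) work per row 
(illustrative code, not the Hive implementation):
{code}
import java.util.ArrayDeque;
import java.util.Deque;

static double[] streamingWindowSum(double[] values, int k) {
  double[] out = new double[values.length];
  double sum = 0;
  Deque<Double> window = new ArrayDeque<Double>();
  for (int i = 0; i < values.length; i++) {
    sum += values[i];
    window.addLast(values[i]);
    if (window.size() > k + 1) {
      sum -= window.removeFirst();  // drop the value that left the window
    }
    out[i] = sum;                   // sum over the last k+1 rows (or fewer)
  }
  return out;
}
{code}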



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6976) Show query id only when there's jobs on the cluster

2014-05-15 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6976:
-

Status: Patch Available  (was: Open)

 Show query id only when there's jobs on the cluster
 ---

 Key: HIVE-6976
 URL: https://issues.apache.org/jira/browse/HIVE-6976
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor
 Attachments: HIVE-6976.1.patch


 No need to print the query id for local-only execution.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6945) issues with dropping partitions on Oracle

2014-05-15 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6945:
---

Fix Version/s: 0.13.1

 issues with dropping partitions on Oracle
 -

 Key: HIVE-6945
 URL: https://issues.apache.org/jira/browse/HIVE-6945
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.14.0, 0.13.1

 Attachments: HIVE-6945-0.13.1.patch, HIVE-6945.01.patch, 
 HIVE-6945.02.patch, HIVE-6945.patch


 1) Direct SQL is broken on Oracle due to the usage of the NUMBER type, which is 
 translated by DN into decimal rather than long. This appears to be specific 
 to some cases, because it seemed to have worked before (different version of 
 Oracle? JDBC? DN? Maybe it depends on whether the db was auto-created).
 2) When the partition-dropping code falls back to JDO, it creates the objects 
 to return, then drops the partitions. It appears that dropping makes the DN 
 objects invalid. We create metastore partition objects out of DN objects before 
 the drop; however, the list of partition column values is re-used, rather than 
 copied, into them. DN appears to clear this list during the drop, so the 
 returned object becomes invalid and an exception is thrown.
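 A hedged sketch of the copy the second point implies (metastore Partition API; 
 mpart stands for the DataNucleus-backed object):
 {code}
 // Copy the column values instead of sharing the DataNucleus-backed list,
 // which DN clears when the partition row is deleted.
 part.setValues(new java.util.ArrayList<String>(mpart.getValues()));
 {code}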



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7054) Support ELT UDF in vectorized mode

2014-05-15 Thread Deepesh Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepesh Khandelwal updated HIVE-7054:
-

Attachment: HIVE-7054.2.patch

 Support ELT UDF in vectorized mode
 --

 Key: HIVE-7054
 URL: https://issues.apache.org/jira/browse/HIVE-7054
 Project: Hive
  Issue Type: New Feature
  Components: Vectorization
Affects Versions: 0.14.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.14.0

 Attachments: HIVE-7054.2.patch, HIVE-7054.patch


 Implement support for ELT udf in vectorized execution mode.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler

2014-05-15 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995651#comment-13995651
 ] 

Swarnim Kulkarni commented on HIVE-6411:


[~xuefuz] Done. I have also linked to this newly created issue.

[1] https://issues.apache.org/jira/browse/HIVE-7048

 Support more generic way of using composite key for HBaseHandler
 

 Key: HIVE-6411
 URL: https://issues.apache.org/jira/browse/HIVE-6411
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6411.1.patch.txt, HIVE-6411.10.patch.txt, 
 HIVE-6411.11.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt, 
 HIVE-6411.4.patch.txt, HIVE-6411.5.patch.txt, HIVE-6411.6.patch.txt, 
 HIVE-6411.7.patch.txt, HIVE-6411.8.patch.txt, HIVE-6411.9.patch.txt


 HIVE-2599 introduced using a custom object for the row key. But it forces key 
 objects to extend HBaseCompositeKey, which is in turn an extension of 
 LazyStruct. If the user provides a proper Object and OI, we can replace the 
 internal key and keyOI with those. 
 The initial implementation is based on a factory interface.
 {code}
 public interface HBaseKeyFactory {
   void init(SerDeParameters parameters, Properties properties) throws SerDeException;
   ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
   LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7011) HiveInputFormat's split generation isn't thread safe

2014-05-15 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7011:
-

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for the review [~vikram.dixit]!

 HiveInputFormat's split generation isn't thread safe
 

 Key: HIVE-7011
 URL: https://issues.apache.org/jira/browse/HIVE-7011
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 0.13.0, 0.14.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.14.0

 Attachments: HIVE-7011.1.patch


 Tez will do split generation in parallel, so we need to protect the 
 input-format cache against concurrent access.
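 A hedged sketch of the kind of guard meant here (the field mirrors 
 HiveInputFormat's static instance cache; the committed patch may synchronize 
 differently):
 {code}
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 import org.apache.hadoop.mapred.InputFormat;

 // A concurrent map keeps parallel getSplits() calls safe without
 // serializing the whole split-generation path.
 private static final Map<Class<?>, InputFormat<?, ?>> inputFormats =
     new ConcurrentHashMap<Class<?>, InputFormat<?, ?>>();
 {code}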



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 21289: HIVE-7033 : grant statements should check if the role exists

2014-05-15 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21289/#review42625
---



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
https://reviews.apache.org/r/21289/#comment76447

This should be done within a transaction. Otherwise, this may result in a TOCTOU (time-of-check-to-time-of-use) bug.


- Ashutosh Chauhan


On May 9, 2014, 11:14 p.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/21289/
 ---
 
 (Updated May 9, 2014, 11:14 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Bugs: HIVE-7033
 https://issues.apache.org/jira/browse/HIVE-7033
 
 
 Repository: hive-git
 
 
 Description
 ---
 
  The following grant statement, which grants to a role that does not exist, 
  succeeds, but it should result in an error.
  
   grant all on t1 to role nosuchrole;
  
  The patch also fixes the handling of role names in some cases to be 
  case-insensitive.
 
 
 Diffs
 -
 
   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 4b4f4f2 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrincipal.java
  62b8994 
   ql/src/test/queries/clientnegative/authorization_role_grant_nosuchrole.q 
 PRE-CREATION 
   ql/src/test/queries/clientnegative/authorization_table_grant_nosuchrole.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/authorization_1_sql_std.q 79ae17a 
   ql/src/test/queries/clientpositive/authorization_role_grant1.q f89d0dc 
   ql/src/test/queries/clientpositive/authorization_role_grant2.q 984d7ed 
   
 ql/src/test/results/clientnegative/authorization_role_grant_nosuchrole.q.out 
 PRE-CREATION 
   
 ql/src/test/results/clientnegative/authorization_table_grant_nosuchrole.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/authorization_1_sql_std.q.out 718ff31 
   ql/src/test/results/clientpositive/authorization_role_grant1.q.out 3c846eb 
   ql/src/test/results/clientpositive/authorization_role_grant2.q.out 1e8f88a 
 
 Diff: https://reviews.apache.org/r/21289/diff/
 
 
 Testing
 ---
 
 New tests included
 
 
 Thanks,
 
 Thejas Nair
 




[jira] [Updated] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7035:
-

Attachment: HIVE-7035.patch

 Templeton returns 500 for user errors - when job cannot be found
 

 Key: HIVE-7035
 URL: https://issues.apache.org/jira/browse/HIVE-7035
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-7035.patch


 curl -i 
 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman'
 should return an HTTP 4xx status code when no such job exists; it currently 
 returns 500.
 {noformat}
 {"error":"org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_201304291205_0015' doesn't exist in RM.
 	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)
 	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)
 	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)
 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
 	at java.security.AccessController.doPrivileged(Native Method)
 	at javax.security.auth.Subject.doAs(Subject.java:415)
 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)"}
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6952) Hive 0.13 HiveOutputFormat breaks backwards compatibility

2014-05-15 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6952:
---

Fix Version/s: 0.13.1

 Hive 0.13 HiveOutputFormat breaks backwards compatibility
 -

 Key: HIVE-6952
 URL: https://issues.apache.org/jira/browse/HIVE-6952
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Costin Leau
Assignee: Ashutosh Chauhan
Priority: Blocker
 Fix For: 0.14.0, 0.13.1

 Attachments: HIVE-6952.patch, HIVE-6952_branch-13.patch


 Hive 0.13 changed the signature of HiveOutputFormat (through commit r1527149), 
 breaking backwards compatibility with previous releases; the return type of 
 getHiveRecordWriter was changed from RecordWriter to FSRecordWriter.
 FSRecordWriter introduces one new method on top of RecordWriter; however, it 
 does not extend the previous interface, and it lives in a completely new 
 package.
 Thus code running fine on Hive 0.12 breaks on Hive 0.13, and after the 
 upgrade, code running on Hive 0.13 will break on anything lower than that.
 This could easily have been avoided by extending the existing interface, or by 
 introducing a new one that RecordWriter could have extended going forward. By 
 changing the signature, the existing contract (and compatibility) has been 
 voided.
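 A minimal sketch of the compatible design the last paragraph asks for (the 
 added method's name and signature here are hypothetical):
 {code}
 // The new interface extends the old contract, so every existing
 // RecordWriter implementation keeps compiling and old callers are
 // unaffected; only code that needs the new capability must change.
 public interface FSRecordWriter extends RecordWriter {
   void flushRecords();  // hypothetical new method
 }
 {code}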



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6290) Add support for hbase filters for composite keys

2014-05-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997550#comment-13997550
 ] 

Xuefu Zhang commented on HIVE-6290:
---

User doc should go with HIVE-6411 also.

 Add support for hbase filters for composite keys
 

 Key: HIVE-6290
 URL: https://issues.apache.org/jira/browse/HIVE-6290
 Project: Hive
  Issue Type: Sub-task
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
 Fix For: 0.14.0

 Attachments: HIVE-6290.1.patch.txt, HIVE-6290.2.patch.txt, 
 HIVE-6290.3.patch.txt


 Add support for filters to be provided via the composite key class



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7031) Utilities.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS

2014-05-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993822#comment-13993822
 ] 

Thejas M Nair commented on HIVE-7031:
-

+1
testCliDriver_schemeAuthority2 is a flaky test; it passed when I ran it locally. 
The other test failures are unrelated.


 Utilities.createEmptyFile uses File.Separator instead of Path.Separator to 
 create an empty file in HDFS
 ---

 Key: HIVE-7031
 URL: https://issues.apache.org/jira/browse/HIVE-7031
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.14.0

 Attachments: HIVE-7031.1.patch


 This leads to inconsistent HDFS naming for empty partitions/tables, where a 
 file might be named as 
 hdfs://headnode0:9000/hive/scratch/hive_2014-04-07_22-39-52_649_4046112898053848089-1/-mr-10010\0 
 on the Windows operating system.
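 For reference, the distinction the fix relies on (java.io.File.separator is 
 platform-dependent, while org.apache.hadoop.fs.Path.SEPARATOR is always "/"):
 {code}
 import java.io.File;
 import org.apache.hadoop.fs.Path;

 static String emptyFileName(String dirName) {
   // Wrong on Windows: dirName + File.separator + "0" -> "...\-mr-10010\0"
   // Right everywhere: Path.SEPARATOR is "/" regardless of platform.
   return dirName + Path.SEPARATOR + "0";
 }
 {code}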



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5764) Stopping Metastore and HiveServer2 from command line

2014-05-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-5764:


Assignee: Xiaobing Zhou

 Stopping Metastore and HiveServer2 from command line
 

 Key: HIVE-5764
 URL: https://issues.apache.org/jira/browse/HIVE-5764
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Metastore
Reporter: Vaibhav Gumashta
Assignee: Xiaobing Zhou
 Fix For: 0.14.0


 Currently a user needs to kill the process. Ideally there should be something 
 like:
 hive --service metastore stop
 hive --service hiveserver2 stop



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-6693) CASE with INT and BIGINT fail

2014-05-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-6693.
-

Resolution: Duplicate

 CASE with INT and BIGINT fail
 -

 Key: HIVE-6693
 URL: https://issues.apache.org/jira/browse/HIVE-6693
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.12.0
Reporter: David Gayou

 CREATE TABLE testCase (n BIGINT);
 select case when (n > 3) then n else 0 end from testCase;
 fails with the error: 
 [Error 10016]: Line 1:36 Argument type mismatch '0': The expression after 
 ELSE should have the same type as those after THEN: bigint is expected but 
 int is found'.
 bigint and int should be more compatible; at the least, int should implicitly 
 cast to bigint. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7012) Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer

2014-05-15 Thread Sun Rui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995846#comment-13995846
 ] 

Sun Rui commented on HIVE-7012:
---

For the issue about distinct, I will investigate it later and if I can find a 
real test case, I will submit a separate jira.

 Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
 

 Key: HIVE-7012
 URL: https://issues.apache.org/jira/browse/HIVE-7012
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Sun Rui
Assignee: Navis
 Attachments: HIVE-7012.1.patch.txt, HIVE-7012.2.patch.txt


 With HIVE 0.13.0, run the following test case:
 {code:sql}
 create table src(key bigint, value string);
 select  
count(distinct key) as col0
 from src
 order by col0;
 {code}
 The following exception will be thrown:
 {noformat}
 java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 9 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:173)
   ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field _col0 from 
 [0:reducesinkkey0]
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:150)
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:79)
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:288)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:166)
   ... 14 more
 {noformat}
 This issue is related to HIVE-6455. When hive.optimize.reducededuplication is 
 set to false, the issue goes away.
 Logical plan when hive.optimize.reducededuplication=false:
 {noformat}
 src 
   TableScan (TS_0)
 alias: src
 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
 Select Operator (SEL_1)
   expressions: key (type: bigint)
   outputColumnNames: key
   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: 
 NONE
   Group By Operator (GBY_2)
 aggregations: count(DISTINCT key)
 keys: key (type: bigint)
 mode: hash
 outputColumnNames: _col0, _col1
 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: 
 NONE
 Reduce Output Operator (RS_3)
   distinctColumnIndices:
   key expressions: _col0 (type: bigint)
   DistributionKeys: 0
   sort order: +
   OutputKeyColumnNames: _col0
   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
 stats: NONE
   Group By Operator (GBY_4)
 aggregations: count(DISTINCT KEY._col0:0._col0)
 mode: mergepartial
 outputColumnNames: _col0
 Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE 
 Column stats: NONE
 Select Operator (SEL_5)
   expressions: _col0 (type: bigint)
   outputColumnNames: _col0
   Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE 
 Column stats: NONE
   Reduce Output Operator (RS_6)
 key expressions: _col0 (type: bigint)
  

Re: Hive and MR2

2014-05-15 Thread Azuryy
Any inputs?


Sent from my iPhone5s

 On May 7, 2014, at 18:11, Azuryy Yu azury...@gmail.com wrote:
 
 Hi,
 I am using hive-0.13.0 and hadoop-2.4.0,
 
 why I must set 'mapreduce.jobtracker.address' in yarn-site.xml? otherwise, 
 there are exceptions and job failed.
 
 And, 'mapreduce.jobtracker.address' can be set to any value.
 
 The following messages are gened without set 'mapreduce.jobtracker.address'.
 
 Job output on the console:
 Execution log at: 
 /tmp/test/test_20140507180505_bcd4d89f-017c-4cf4-81a3-5fa619de0ad0.log
 Job running in-process (local Hadoop)
 Hadoop job information for null: number of mappers: 1; number of reducers: 1
 2014-05-07 18:06:25,782 null map = 0%,  reduce = 0%
 2014-05-07 18:06:33,699 null map = 100%,  reduce = 0%
 2014-05-07 18:06:34,774 null map = 0%,  reduce = 0%
 2014-05-07 18:06:49,222 null map = 100%,  reduce = 100%
 Ended Job = job_1399453944131_0006 with errors
 Error during job, obtaining debugging information...
 
 Container error:
 2014-05-07 18:06:33,634 INFO [main] org.apache.hadoop.hive.ql.exec.Utilities: 
 No plan file found: 
 file:/tmp/test/hive_2014-05-07_18-06-08_349_1526907284076641211-1/-mr-10001/0a1c9ebe-cdb0-4adc-9e93-8f176019f19a/map.xml
 2014-05-07 18:06:33,635 WARN [main] org.apache.hadoop.mapred.YarnChild: 
 Exception running child : java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
 at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437)
 at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
 at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:168)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
 
 
 
 


[jira] [Created] (HIVE-7064) Remove Noop PTFs from FunctionRegistry and special handling within PTF translation

2014-05-15 Thread Harish Butani (JIRA)
Harish Butani created HIVE-7064:
---

 Summary: Remove Noop PTFs from FunctionRegistry and special 
handling within PTF translation
 Key: HIVE-7064
 URL: https://issues.apache.org/jira/browse/HIVE-7064
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani


It is time to remove the special handling of Noop PTFs from the translation 
code. They should not be exposed as out-of-the-box (OOB) functions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Apache Hive 0.13.1

2014-05-15 Thread Sushanth Sowmyan
Hi folks,

As an update, HIVE-6945 has some 0.13.1-specific test fixes appended
which make its tests pass, the test that was failing with HIVE-6826 is
now succeeding (flaky test), and Thejas has confirmed with me that the
issue with HIVE-6846 is a test problem, not a product problem,
relating to an incorrect expectation in the test.

With those resolved, there are no more blockers, and no additional
jiras that have been requested to be part of this release, so I'll go
ahead and spin out RC0 now, and will also commit all those patches to
the 0.13 branch. :)



On Wed, May 7, 2014 at 7:22 PM, Sushanth Sowmyan khorg...@gmail.com wrote:
 After much experimentation with git bisect (which is very powerful),
 I've narrowed down the test failures reported yesterday. The failures
 are appearing from the following:

 HIVE-6945:
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullformatCTAS
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_alter
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_tblproperties
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unset_table_view_property
 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_unset_table_property

 HIVE-6846:
 org.apache.hive.service.cli.TestScratchDir.testLocalScratchDirs

 HIVE-6826:
 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6


 Of the above, the second jira was already in 0.13.0. I'll comment on
 those jiras, asking the committers involved in those bugs to help
 debug the issue. If anyone is interested in the git bisect logs for
 these, they're available at
 http://people.apache.org/~khorgath/releases/0.13.1_RC0/test_failures/


 On Tue, May 6, 2014 at 6:41 PM, Sushanth Sowmyan khorg...@gmail.com wrote:
 Also, I wanted to throw in one more bit for those of you that are
 interested in tinkering along :

 http://people.apache.org/~khorgath/releases/0.13.1_RC0/relprep.pl
 http://people.apache.org/~khorgath/releases/0.13.1_RC0/requested_jiras

 This is the script and config file I'm using to generate this release.

 It's very much a hack right now, and I hope to improve it to
 streamline releases in the future, but here is how it can be used
 right now:

 a) Put it in a hive git repo (with no uncommitted changes - this
 script will check out a new branch and commit things to that branch,
 so you want to make sure you have a clean repo)
 b) Put the file requested_jiras in that dir as well.
 c) Run the script from there.

 It checks the differences against the branch being released
 (branch-0.13 is currently hardcoded as a global), looks at all the
 commit logs in trunk that correspond to the jiras requested in the
 requested_jiras file, sorts them in the order they were committed,
 then checks out a new branch called relprep-branch-0.13-timestamp
 and attempts to cherry-pick those commits in.

 For some patches, this will not work, so there is an override
 mechanism provided by entries in the requested_jiras file, as can be
 observed in the file I mention above.

 At the end of it, you'll have your 0.13.1 repo reproduction to test
 against if you so desire.

 Known Bugs :

 a) I use system() or die ...;, which is faulty in that the
 die code will never be reached when a command fails (see the sketch
 after this list). I need to fix this, but all the system calls were
 working for me, and I'd much rather focus on the release now and
 improve this script later. This is a TODO
 b) Some patches (those generated with --no-prefix) don't work with
 older versions of git. You'll need a 1.8.x git for them, or you have
 to generate git patches without --no-prefix.
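 
 For reference, a minimal sketch of the fix for (a), with $cmd as a
 placeholder rather than a name taken from relprep.pl: in Perl,
 system() returns the child's exit status (0 on success), so the
 success check has to compare against zero instead of relying on
 truthiness.
 
   # faulty: system() returns 0 on success, so "or die" would fire on
   # success and never on a failing command
   system($cmd) or die "command failed: $cmd";
 
   # fixed: test the exit status explicitly
   system($cmd) == 0
       or die "command failed (status $?): $cmd";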



 On Tue, May 6, 2014 at 6:21 PM, Sushanth Sowmyan khorg...@gmail.com wrote:
 Hi Folks,

 After a run of the ptest framework across the 0.13.1 codebase, we have
 a couple of test failures that I'm trying to track down and resolve.

 If any of you are interested in looking at it on your own in the
 meanwhile, the conglomerate patch of all the patches I'm forward
 porting into 0.13.1 is over at
 http://people.apache.org/~khorgath/releases/0.13.1_RC0/0.13.1.gdiff.patch

 The current tests that are failing are as follows:

 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullformatCTAS
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_alter
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_tblproperties
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unset_table_view_property
 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_unset_table_property
 org.apache.hive.service.cli.TestScratchDir.testLocalScratchDirs

 I'll update and follow up with patch devs as and when I find out the
 source for these errors.

 Thanks,
 -Sushanth

 On Mon, May 5, 2014 at 6:26 PM, Sushanth Sowmyan khorg...@gmail.com wrote:
 Hi Folks,

 It's past 6pm PDT on May 5th 2014, so I'm beginning the process to
 generate the 0.13.1 RC0.

 I've received backport patches for 

[jira] [Commented] (HIVE-7026) Support newly added role related APIs for v1 authorizer

2014-05-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993021#comment-13993021
 ] 

Hive QA commented on HIVE-7026:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12643702/HIVE-7026.1.patch.txt

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 5428 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_complex_types
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_keyword_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_roles
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_3
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_4
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_7
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_part
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_public_create
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_public_drop
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_cycles1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_cycles2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_grant
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_show_role_principals_v1
org.apache.hadoop.hive.jdbc.TestJdbcDriver.testShowGrant
org.apache.hive.jdbc.TestJdbcDriver2.testShowGrant
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/146/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/146/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12643702

 Support newly added role related APIs for v1 authorizer
 ---

 Key: HIVE-7026
 URL: https://issues.apache.org/jira/browse/HIVE-7026
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-7026.1.patch.txt, HIVE-7026.2.patch.txt


 Support SHOW_CURRENT_ROLE and SHOW_ROLE_PRINCIPALS for v1 authorizer. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7058) Cleanup HiveHBase*InputFormat

2014-05-15 Thread Nick Dimiduk (JIRA)
Nick Dimiduk created HIVE-7058:
--

 Summary: Cleanup HiveHBase*InputFormat
 Key: HIVE-7058
 URL: https://issues.apache.org/jira/browse/HIVE-7058
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Nick Dimiduk


Once the HBase mapred API has support for providing a Scan instance, we should 
clean up the code around HBase InputFormats to make use of it and share common 
predicate pushdown logic.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7031) Utilities.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS

2014-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7031:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Hari!

 Utilities.createEmptyFile uses File.Separator instead of Path.Separator to 
 create an empty file in HDFS
 ---

 Key: HIVE-7031
 URL: https://issues.apache.org/jira/browse/HIVE-7031
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.14.0

 Attachments: HIVE-7031.1.patch


 This leads to inconsistent HDFS naming for empty partitions/tables, where a 
 file might be named as 
 hdfs://headnode0:9000/hive/scratch/hive_2014-04-07_22-39-52_649_4046112898053848089-1/-mr-10010\0 
 on the Windows operating system.
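 A minimal sketch of the distinction at issue (illustration only, with a 
 shortened path; not the actual patch): java.io.File.separator is 
 platform-specific, while org.apache.hadoop.fs.Path.SEPARATOR is always "/", 
 which is what HDFS paths require.
 {code}
import java.io.File;
import org.apache.hadoop.fs.Path;

public class SeparatorDemo {
  public static void main(String[] args) {
    String dir = "hdfs://headnode0:9000/hive/scratch/-mr-10010";
    // On Windows, File.separator is the backslash, which yields the
    // malformed name seen above, ending in "\0"
    String broken = dir + File.separator + "0";
    // Path.SEPARATOR is always "/", the separator HDFS expects
    String fixed = dir + Path.SEPARATOR + "0";
    System.out.println(broken);
    System.out.println(fixed);
  }
}
 {code}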



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7065:
-

Status: Patch Available  (was: Open)

 Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
 -

 Key: HIVE-7065
 URL: https://issues.apache.org/jira/browse/HIVE-7065
 Project: Hive
  Issue Type: Bug
  Components: Tez, WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-7065.patch


 WebHCat config has templeton.hive.properties to specify Hive config 
 properties that need to be passed to the Hive client on the node executing a 
 job submitted through WebHCat (a hive query, for example).
 This should include hive.execution.engine.
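 For illustration only, such a webhcat-site.xml entry might look like the 
 following (the property name comes from the description above; the value 
 shown is an assumption):
 {code}
<property>
  <name>templeton.hive.properties</name>
  <!-- comma-separated key=value settings handed to the Hive client;
       adding hive.execution.engine selects Tez -->
  <value>hive.metastore.uris=thrift://metastore-host:9083,hive.execution.engine=tez</value>
</property>
 {code}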



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5631) Index creation on a skew table fails

2014-05-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5631:


Fix Version/s: (was: 0.13.0)
   0.14.0

 Index creation on a skew table fails
 

 Key: HIVE-5631
 URL: https://issues.apache.org/jira/browse/HIVE-5631
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: 0.14.0

 Attachments: HIVE-5631.1.patch.txt, HIVE-5631.2.patch.txt, 
 HIVE-5631.3.patch.txt


 REPRO STEPS:
 create database skewtest;
 use skewtest;
 create table skew (id bigint, acct string) skewed by (acct) on ('CC','CH');
 create index skew_indx on table skew (id) as 
 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED 
 REBUILD;
 Last DDL fails with following error.
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. 
 InvalidObjectException(message:Invalid skew column [acct])
 When creating a table, Hive has sanity tests to make sure the columns have 
 proper names and the skewed columns are a subset of the table columns. Here we 
 fail because the index table has skewed column info. The index table's skewed 
 columns include {acct} and its columns are {id, _bucketname, _offsets}. As 
 the skewed column {acct} is not part of the table columns, Hive throws the 
 exception.
 The reason the index table got skewed column info even though its definition 
 has no such info: when creating the index table, a deep copy of the base 
 table's StorageDescriptor (SD) (in this case, that of 'skew') is made. In that 
 copied SD, index-specific parameters are set and unrelated parameters are 
 reset. The skewed column info is not reset (there are a few other params that 
 are not reset either). That's why the index table contains the skewed column 
 info.
 Fix: instead of deep copying the base table's StorageDescriptor, create a new 
 one from the gathered info. This keeps the index table from inheriting 
 unnecessary SD properties from the base table.
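 A minimal sketch of that approach using the Thrift metastore API (the 
 specific copied fields are an assumption for illustration, not the actual 
 patch):
 {code}
import java.util.ArrayList;
import org.apache.hadoop.hive.metastore.api.StorageDescriptor;

public class IndexSdSketch {
  // Build a fresh SD for the index table instead of deep-copying the
  // base table's SD, so skewedInfo is never inherited.
  static StorageDescriptor sdForIndexTable(StorageDescriptor base) {
    StorageDescriptor sd = new StorageDescriptor();
    // copy only what the index table actually needs
    sd.setInputFormat(base.getInputFormat());
    sd.setOutputFormat(base.getOutputFormat());
    sd.setSerdeInfo(base.getSerdeInfo());
    sd.setCols(new ArrayList<>());  // index columns are filled in later
    // skewedInfo is deliberately left unset
    return sd;
  }
}
 {code}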



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found

2014-05-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993011#comment-13993011
 ] 

Thejas M Nair commented on HIVE-7035:
-

+1

 Templeton returns 500 for user errors - when job cannot be found
 

 Key: HIVE-7035
 URL: https://issues.apache.org/jira/browse/HIVE-7035
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-7035.patch


 curl -i 
 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman'
  should return HTTP Status code 4xx when no such job exists; it currently 
 returns 500.
 {noformat}
 {error:org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: 
 Application with id 'application_201304291205_0015' doesn't exist in 
 RM.\r\n\tat org.apache.hadoop.yarn.server.resourcemanager
 .ClientRMService.getApplicationReport(ClientRMService.java:247)\r\n\tat 
 org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocol
 PBServiceImpl.java:120)\r\n\tat 
 org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\r\n\tat
  org.apache.hado
 op.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\r\n\tat
  org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\r\n\tat 
 org.apache.hadoop.ipc.Server$Handler$1.run(Serve
 r.java:2053)\r\n\tat 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\r\n\tat 
 java.security.AccessController.doPrivileged(Native Method)\r\n\tat 
 javax.security.auth.Subject.doAs(Subject.ja
 va:415)\r\n\tat 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\r\n\tat
  org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\r\n}
 {noformat}
 NO PRECOMMIT TESTS
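 A minimal sketch of the kind of mapping the fix implies (hypothetical class 
 name, assuming a JAX-RS ExceptionMapper of the kind Templeton already uses): 
 translate the job-not-found failure into a 4xx instead of a 500.
 {code}
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import javax.ws.rs.ext.ExceptionMapper;
import javax.ws.rs.ext.Provider;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;

// Hypothetical: report "no such application" as 404 rather than 500.
@Provider
public class JobNotFoundMapper
    implements ExceptionMapper<ApplicationNotFoundException> {
  @Override
  public Response toResponse(ApplicationNotFoundException e) {
    return Response.status(Response.Status.NOT_FOUND)      // a 4xx code
        .entity("{\"error\":\"" + e.getMessage() + "\"}")  // JSON body
        .type(MediaType.APPLICATION_JSON)
        .build();
  }
}
 {code}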



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7030) Remove hive.hadoop.classpath from hiveserver2.cmd

2014-05-15 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-7030:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 Remove hive.hadoop.classpath from hiveserver2.cmd
 -

 Key: HIVE-7030
 URL: https://issues.apache.org/jira/browse/HIVE-7030
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.14.0
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.14.0

 Attachments: HIVE-7030.1.patch


 This parameter is not used anywhere and should be removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead

2014-05-15 Thread Gunther Hagleitner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18936/#review42539
---



ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java
https://reviews.apache.org/r/18936/#comment76332

This is nice, but it should have documentation for the class and public methods



ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java
https://reviews.apache.org/r/18936/#comment76333

There has to be a more portable way to create a temp file.



ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
https://reviews.apache.org/r/18936/#comment76334

Can you make this a jira or drop if it's not important enough?



ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
https://reviews.apache.org/r/18936/#comment76335

coding standards



ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
https://reviews.apache.org/r/18936/#comment76337

why do you need this? this seems to do the same thing as tag == -1?



ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
https://reviews.apache.org/r/18936/#comment76338

this shouldn't exist on the operator, but on the ReduceSinkDesc



ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
https://reviews.apache.org/r/18936/#comment76340

needs asf header



ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
https://reviews.apache.org/r/18936/#comment76341

can you please use curlies in this file. coding standard again.



ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
https://reviews.apache.org/r/18936/#comment76342

same as before. todos should be jiras or removed if not important



ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
https://reviews.apache.org/r/18936/#comment76336

if debug enabled?


- Gunther Hagleitner


On May 1, 2014, 2:29 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18936/
 ---
 
 (Updated May 1, 2014, 2:29 a.m.)
 
 
 Review request for hive, Gopal V and Gunther Hagleitner.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 See JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 604bea7 
   conf/hive-default.xml.template 2552560 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 5fe35a5 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java
  142bfd8 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bf9d4c1 
   ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 
 f5d4670 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b93ea7a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 175d3ab 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
  8854b19 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
 9df425b 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
 64f0be2 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java
  008a8db 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
  988959f 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
  55b7415 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java e392592 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
 eef7656 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java
  d4be78d 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
 3077d75 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 
 f7b499b 
   ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 157d072 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java
  PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
  65e3779 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
  093da55 
   

[jira] [Updated] (HIVE-6999) Add streaming mode to PTFs

2014-05-15 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6999:


Attachment: HIVE-6999.3.patch

fix show_functions.q.out diff

 Add streaming mode to PTFs
 --

 Key: HIVE-6999
 URL: https://issues.apache.org/jira/browse/HIVE-6999
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.11.0, 0.12.0, 0.13.0
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6999.1.patch, HIVE-6999.2.patch, HIVE-6999.3.patch


 There are a set of use cases where the Table Function can operate on a 
 Partition row by row, or on a subset (window) of rows, as it is being streamed 
 to it.
 - Windowing has a couple of use cases of this: processing of Rank functions, 
 processing of Window Aggregations.
 - But this is a generic concept: any analysis that operates on an Ordered 
 partition may be able to operate in Streaming mode.
 This patch introduces streaming mode in PTFs and provides the mechanics to 
 handle PTF chains that contain both modes of PTFs.
 Subsequent patches will introduce Streaming mode for Windowing.
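 As a rough illustration of the row-by-row idea (hypothetical interface and 
 names, not the patch's actual API): a streaming function can emit output as 
 rows arrive, keeping O(1) state instead of buffering the whole partition.
 {code}
import java.util.Collections;
import java.util.List;

// Hypothetical streaming contract: one row in, zero or more rows out.
interface StreamingPTF<ROW, OUT> {
  void startPartition();
  List<OUT> processRow(ROW row);  // emit output as each row arrives
  List<OUT> finishPartition();    // flush whatever is still buffered
}

// Rank needs only the previous row's ordering key, so it can stream.
class StreamingRank implements StreamingPTF<Long, Integer> {
  private int rowNum;
  private int rank;
  private Long prevKey;

  public void startPartition() { rowNum = 0; rank = 0; prevKey = null; }

  public List<Integer> processRow(Long orderKey) {
    rowNum++;
    if (prevKey == null || !prevKey.equals(orderKey)) {
      rank = rowNum;              // ties keep the earlier rank
    }
    prevKey = orderKey;
    return Collections.singletonList(rank);
  }

  public List<Integer> finishPartition() { return Collections.emptyList(); }
}
 {code}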



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work started] (HIVE-7066) hive-exec jar is missing avro-mapred

2014-05-15 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-7066 started by David Chen.

 hive-exec jar is missing avro-mapred
 

 Key: HIVE-7066
 URL: https://issues.apache.org/jira/browse/HIVE-7066
 Project: Hive
  Issue Type: Bug
Reporter: David Chen
Assignee: David Chen

 Running a simple query that reads an Avro table caused the following 
 exception to be thrown on the cluster side:
 {code}
 java.lang.RuntimeException: 
 org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:394)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334)
   ... 13 more
 Caused by: java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45)
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:148)
   at 
 

[jira] [Updated] (HIVE-5908) Use map-join hint to cache intermediate result

2014-05-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5908:


Fix Version/s: (was: 0.13.0)
   0.14.0

 Use map-join hint to cache intermediate result
 --

 Key: HIVE-5908
 URL: https://issues.apache.org/jira/browse/HIVE-5908
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: daxingyu
Priority: Minor
  Labels: features
 Fix For: 0.14.0

   Original Estimate: 72h
  Remaining Estimate: 72h

 There are very complicated queries in our project, and some of their 
 intermediate results can be very small. But Hive will treat these results as 
 part of a mapreduce job, which is very costly.
 So I propose to use the map-join hint to cache these small results and speed 
 up the Hive job executions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 21471: HIVE-7066: hive-exec jar is missing avro-mapred

2014-05-15 Thread David Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21471/
---

Review request for hive.


Bugs: HIVE-7066
https://issues.apache.org/jira/browse/HIVE-7066


Repository: hive-git


Description
---

Restores the Avro core jar in the hive-exec jar. The hive-exec jar only 
contained avro-mapred but not core Avro, which caused the AvroSerDe to break.
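
For context, the kind of one-line change this implies might look like the
following in the maven-shade-plugin section of ql/pom.xml (an assumption for
illustration only; the diff link below has the real change):

  <artifactSet>
    <includes>
      <!-- core Avro: the piece that was missing from the shaded jar -->
      <include>org.apache.avro:avro</include>
      <include>org.apache.avro:avro-mapred</include>
    </includes>
  </artifactSet>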


Diffs
-

  ql/pom.xml 71daa26 

Diff: https://reviews.apache.org/r/21471/diff/


Testing
---

Confirmed that core Avro is now included in the hive-exec jar. Successfully ran 
sample query against table registered with the AvroSerDe.


Thanks,

David Chen



[jira] [Updated] (HIVE-7036) get_json_object bug when extract list of list with index

2014-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7036:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 get_json_object bug when extract list of list with index
 

 Key: HIVE-7036
 URL: https://issues.apache.org/jira/browse/HIVE-7036
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.12.0, 0.13.0
 Environment: all
Reporter: Ming Ma
Assignee: Navis
Priority: Minor
  Labels: udf
 Fix For: 0.14.0

 Attachments: HIVE-7036.1.patch.txt


 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFJson.java#L250
 This line should be outside the for-loop.
 For example, with
 json = '{h:[1, [2, 3], {i: 0}, [{p: 11}, {p: 12}, {pp: 13}]}'
 get_json_object(json, '$.h[*][0]') should return the first node (if it 
 exists) of every child of '$.h',
 which specifically should be
 [2,{p:11}]
 but hive returns only
 2
 because when hive picks the node '2' out, tmp_jsonList changes to a list 
 containing only the one node '2':
 [2]
 It is then assigned to the variable jsonList inside the loop, so in the next 
 iteration the value of i would be 2, which is greater than the size (now 
 always 1) of jsonList, and the loop breaks out.
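 A minimal standalone sketch of that control-flow bug (simplified; not the 
 actual UDFJson code):
 {code}
import java.util.ArrayList;
import java.util.List;

public class LoopBugSketch {
  // Simplified: pick element [0] of every child node.
  static List<Object> firstOfEach(List<List<Object>> children) {
    List<Object> results = new ArrayList<>();
    for (int i = 0; i < children.size(); i++) {
      List<Object> child = children.get(i);
      if (!child.isEmpty()) {
        results.add(child.get(0));
      }
      // BUG pattern described above: assigning the result list back to
      // the list being iterated *here* shrinks its size to 1, so the
      // loop exits after the first match. The assignment must happen
      // after the loop instead.
    }
    return results;  // [2, {p:11}] for the example above
  }
}
 {code}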



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6908) TestThriftBinaryCLIService.testExecuteStatementAsync has intermittent failures

2014-05-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995634#comment-13995634
 ] 

Ashutosh Chauhan commented on HIVE-6908:


+1

 TestThriftBinaryCLIService.testExecuteStatementAsync has intermittent failures
 --

 Key: HIVE-6908
 URL: https://issues.apache.org/jira/browse/HIVE-6908
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6908.patch


 This has failed sometimes in the pre-commit tests.
 ThriftCLIServiceTest.testExecuteStatementAsync runs two statements.  They are 
 given 100 second timeout total, not sure if its by intention.  As the first 
 is a select query, it will take a majority of the time.  The second statement 
 (create table) should be quicker, but it fails sometimes because timeout is 
 already mostly used up.
 The timeout should probably be reset after the first statement.  If the 
 operation finishes before the timeout, it wont have any effect as it'll break 
 out.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7066) hive-exec jar is missing avro-mapred

2014-05-15 Thread David Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998274#comment-13998274
 ] 

David Chen commented on HIVE-7066:
--

I have posted a patch for a fix. I have tested this on trunk by confirming that 
Avro core is in the hive-exec jar and by successfully running a simple Hive 
query against a table registered with the AvroSerDe. The fix was a simple 
one-line change. It looks like this issue was caused by the Ant-to-Maven 
switch: the Avro core jar was inadvertently left out when creating the 
hive-exec jar.

I am not able to create an RB right now because RB is giving me a 502 error 
when I try to create a new review request, both using {{rbt post}} and manually 
via the RB web UI. I will try to create an RB later.

 hive-exec jar is missing avro-mapred
 

 Key: HIVE-7066
 URL: https://issues.apache.org/jira/browse/HIVE-7066
 Project: Hive
  Issue Type: Bug
Reporter: David Chen
Assignee: David Chen
 Attachments: HIVE-7066.1.patch


 Running a simple query that reads an Avro table caused the following 
 exception to be thrown on the cluster side:
 {code}
 java.lang.RuntimeException: 
 org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:394)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334)
   ... 13 more
 Caused by: java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
   at 
 

Re: [VOTE] Apache Hive 0.13.1 Release Candidate 1

2014-05-15 Thread Eugene Koifman
After upgrading the Hadoop version, TestHive_7 is the only remaining issue, as
explained below.
The RC looks good.


On Tue, May 13, 2014 at 8:14 PM, Eugene Koifman ekoif...@hortonworks.comwrote:

 TestHive_7 is explained by https://issues.apache.org/jira/browse/HIVE-6521,
 which is in trunk but not 13.1


 On Tue, May 13, 2014 at 6:50 PM, Eugene Koifman 
 ekoif...@hortonworks.comwrote:

 I downloaded src tar, built it and ran webhcat e2e tests.
 I see 2 failures (which I don't see on trunk)

 TestHive_7 fails with
 got percentComplete map 100% reduce 0%,  expected  map 100% reduce 100%

 TestHeartbeat_1 fails to even launch the job.  This looks like the root
 cause

 ERROR | 13 May 2014 18:24:00,394 |
 org.apache.hive.hcatalog.templeton.CatchallExceptionMapper |
 java.lang.NullPointerException
 at
 org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:312)
 at
 org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:479)
 at
 org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:170)
 at
 org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:153)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at
 org.apache.hive.hcatalog.templeton.LauncherDelegator$1.run(LauncherDelegator.java:107)
 at
 org.apache.hive.hcatalog.templeton.LauncherDelegator$1.run(LauncherDelegator.java:103)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
 at
 org.apache.hive.hcatalog.templeton.LauncherDelegator.queueAsUser(LauncherDelegator.java:103)
  at
 org.apache.hive.hcatalog.templeton.LauncherDelegator.enqueueController(LauncherDelegator.java:81)
 at
 org.apache.hive.hcatalog.templeton.JarDelegator.run(JarDelegator.java:55)
 at
 org.apache.hive.hcatalog.templeton.Server.mapReduceJar(Server.java:711)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
 com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
 at
 com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
 at
 com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
 at
 com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
 at
 com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
 at
 com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
 at
 com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
 at
 com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
 at
 com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1480)
 at
 com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1411)
 at
 com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360)
 at
 com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350)
 at
 com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
 at
 com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538)
 at
 com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at
 org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
 at
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1360)
 at
 org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:392)
 at
 org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:87)
 at
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1331)
 at
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:477)
 at
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031)
 at
 

[jira] [Assigned] (HIVE-5733) Publish hive-exec artifact without all the dependencies

2014-05-15 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu reassigned HIVE-5733:
-

Assignee: Amareshwari Sriramadasu

 Publish hive-exec artifact without all the dependencies
 ---

 Key: HIVE-5733
 URL: https://issues.apache.org/jira/browse/HIVE-5733
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Jarek Jarcec Cecho
Assignee: Amareshwari Sriramadasu

 Currently the artifact {{hive-exec}} that is available in 
 [maven|http://search.maven.org/remotecontent?filepath=org/apache/hive/hive-exec/0.12.0/hive-exec-0.12.0.jar]
 shades all the dependencies (= the jar contains all of Hive's dependencies). 
 As other projects that depend on Hive might use slightly different versions 
 of those dependencies, it can easily happen that Hive's shaded copy is picked 
 up instead, which leads to very time-consuming debugging of what is happening 
 (for example SQOOP-1198).
 Would it be feasible to publish a {{hive-exec}} jar built without shading any 
 dependency? For example, 
 [avro-tools|http://search.maven.org/#artifactdetails%7Corg.apache.avro%7Cavro-tools%7C1.7.5%7Cjar]
 has a classifier nodeps that represents the artifact without any 
 dependencies.
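 For illustration, this is how a consumer would pick up such a classified 
 artifact in Maven (hypothetical coordinates; a nodeps classifier for 
 hive-exec does not exist yet):
 {code}
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>0.12.0</version>
  <!-- hypothetical classifier, in the style of avro-tools' nodeps -->
  <classifier>nodeps</classifier>
</dependency>
 {code}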



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Hive Error Log -Thanks for your help!

2014-05-15 Thread 李坤霖

Hi~
When I run a Hive statement (select * from lab.ec_web_log limit 100), I get an 
error.
What should I do to fix it?
Thanks for your help!

Lab.ec_web_log create statement:
CREATE external TABLE lab.ec_web_log (
host STRING, ipaddress STRING, identd STRING, user STRING,finishtime STRING,
requestline STRING, returncode INT, size INT, getstr STRING, retstatus INT, 
v_P03_1 STRING, v_P04 STRING,
v_P06 STRING, v_P08 STRING, v_P09 STRING, v_P10 STRING, v_P11 STRING, v_P12 
STRING, v_P13 STRING, v_P14 STRING, v_P15 STRING, v_P16 STRING, v_P17 STRING, 
v_P18 STRING, v_P19 STRING, v_P20 STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe'
WITH SERDEPROPERTIES (
'serialization.format'='org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol',
'quote.delim'='(|\\[|\\])',
'field.delim'=' ',
'serialization.null.format'='-')
STORED AS TEXTFILE
LOCATION '/user/audil/weblog/';

Web log format:
xxx..com xxx.xxx.xxx.xxx - - [04/May/2014:23:59:59 +0800] 1 1248214 GET 
/buy/index.php?action=product_detailprod_no=P200382387prod_sort_uid=3304 
HTTP/1.1 200 30975 202.39.48.37 - Mozilla/5.0 (Windows NT 6.1; WOW64) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36 -

Error List:
2014-05-14 13:55:07,751 WARN  snappy.LoadSnappy (LoadSnappy.java:clinit(36)) 
- Snappy native library is available
2014-05-14 15:01:24,303 WARN  mapred.JobClient 
(JobClient.java:copyAndConfigureFiles(746)) - Use GenericOptionsParser for 
parsing the arguments. Applications should implement Tool for the same.
2014-05-14 15:42:09,652 ERROR exec.Task (SessionState.java:printError(410)) - 
Ended Job = job_201404092012_0138 with errors
2014-05-14 15:42:09,655 ERROR exec.Task (SessionState.java:printError(410)) - 
Error during job, obtaining debugging information...
2014-05-14 15:42:09,656 ERROR exec.Task (SessionState.java:printError(410)) - 
Job Tracking URL: 
http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201404092012_0138
2014-05-14 15:42:09,659 ERROR exec.Task (SessionState.java:printError(410)) - 
Examining task ID: task_201404092012_0138_m_02 (and more) from job 
job_201404092012_0138
2014-05-14 15:42:09,878 ERROR exec.Task (SessionState.java:printError(410)) -
Task with the most failures(4):
-
Task ID:
  task_201404092012_0138_m_00

URL:
  
http://hdp001-jt:50030/taskdetails.jsp?jobid=job_201404092012_0138tipid=task_201404092012_0138_m_00
-
Diagnostic Messages for this Task:
Task attempt_201404092012_0138_m_00_3 failed to report status for 600 
seconds. Killing!

2014-05-14 15:42:09,900 ERROR ql.Driver (SessionState.java:printError(410)) - 
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask
2014-05-14 15:56:30,759 ERROR ql.Driver (SessionState.java:printError(410)) - 
FAILED: ParseException line 1:0 cannot recognize input near 'conf' '.' 'set'

org.apache.hadoop.hive.ql.parse.ParseException: line 1:0 cannot recognize input 
near 'conf' '.' 'set'

at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:193)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:418)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)



[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier

2014-05-15 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7056:
-

Description: 
on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is 
looking for \*hcatalog-core-\*.jar etc.  In Pig 12.1 it's looking for 
hcatalog-core-\*.jar, which doesn't work with Hive 0.13.

The TestPig_11 job fails with
{noformat}
2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception 
during parsing: Error during parsing. Could not resolve 
org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Failed to parse: Pig script failed to parse: 
file hcatloadstore.pig, line 19, column 34 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:478)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: 
file hcatloadstore.pig, line 19, column 34 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
at 
org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
at 
org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
at 
org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
at 
org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
at 
org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at 
org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at 
org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 16 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: 
Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, 
java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
... 24 more
{noformat}

the key to this is 
{noformat}
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-*.jar:
 No such file or directory
ls: 

[jira] [Created] (HIVE-7067) Min() and Max() on Timestamp and Date columns for ORC returns wrong results

2014-05-15 Thread Prasanth J (JIRA)
Prasanth J created HIVE-7067:


 Summary: Min() and Max() on Timestamp and Date columns for ORC 
returns wrong results
 Key: HIVE-7067
 URL: https://issues.apache.org/jira/browse/HIVE-7067
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth J
Assignee: Prasanth J


min() and max() of timestamp and date columns of an ORC table return wrong 
results. The reason is that when ORC creates object inspectors for date and 
timestamp, it uses Java primitive objects as opposed to writable objects. When 
get() is performed on a Java primitive object, a reference to the underlying 
object is returned, whereas when get() is performed on a writable object, a 
copy of the underlying object is returned.

The fix is to change the object inspector creation to return writable objects 
for timestamp and date.
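
A minimal standalone sketch of why the reference/copy distinction matters for 
min()/max() (simplified; not the actual ORC or UDAF code): if the aggregator 
stores the reference it was handed and the reader then reuses that object for 
the next row, the stored "min" silently changes.
{code}
import java.sql.Timestamp;

public class ReuseBugSketch {
  public static void main(String[] args) {
    // The reader reuses one object for every row, which is what a
    // java-primitive object inspector effectively exposes:
    Timestamp current = new Timestamp(0);

    Timestamp min = null;
    for (long millis : new long[] {5_000L, 1_000L, 9_000L}) {
      current.setTime(millis);  // same object, new value
      if (min == null || current.compareTo(min) < 0) {
        min = current;          // BUG: stores a reference, not a copy
      }
    }
    // Prints 9000 (the last row's value), not the true minimum 1000.
    System.out.println(min.getTime());
    // With a writable whose get() returns a copy (or by copying here,
    // e.g. new Timestamp(current.getTime())), min would stay 1000.
  }
}
{code}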



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table

2014-05-15 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997920#comment-13997920
 ] 

Sushanth Sowmyan commented on HIVE-6473:


Patch looks good to me. I'll try to kick off some tests on this myself.

One more thing though - you remove 
hbase-handler/src/test/queries/positive/hbase_bulk.m in this patch, but you do 
not remove the corresponding 
hbase-handler/src/test/results/positive/hbase_bulk.m.out file. Could you add 
that removal as well?

I'm +1 on it otherwise though, and will commit once we have a test run.

 Allow writing HFiles via HBaseStorageHandler table
 --

 Key: HIVE-6473
 URL: https://issues.apache.org/jira/browse/HIVE-6473
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
 Attachments: HIVE-6473.0.patch.txt, HIVE-6473.1.patch, 
 HIVE-6473.1.patch.txt


 Generating HFiles for bulkload into HBase could be more convenient. Right now 
 we require the user to register a new table with the appropriate output 
 format. This patch allows the exact same functionality, but through an 
 existing table managed by the HBaseStorageHandler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7067) Min() and Max() on Timestamp and Date columns for ORC returns wrong results

2014-05-15 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7067:
-

Priority: Critical  (was: Major)

 Min() and Max() on Timestamp and Date columns for ORC returns wrong results
 ---

 Key: HIVE-7067
 URL: https://issues.apache.org/jira/browse/HIVE-7067
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth J
Assignee: Prasanth J
Priority: Critical
 Attachments: HIVE-7067.1.patch


 min() and max() of timestamp and date columns of an ORC table return wrong 
 results. The reason is that when ORC creates object inspectors for date and 
 timestamp, it uses Java primitive objects as opposed to writable objects. 
 When get() is performed on a Java primitive object, a reference to the 
 underlying object is returned, whereas when get() is performed on a writable 
 object, a copy of the underlying object is returned.
 The fix is to change the object inspector creation to return writable objects 
 for timestamp and date.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 21289: HIVE-7033 : grant statements should check if the role exists

2014-05-15 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21289/
---

(Updated May 9, 2014, 11:14 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

HIVE-7033.2.patch - updating comment in .q file


Bugs: HIVE-7033
https://issues.apache.org/jira/browse/HIVE-7033


Repository: hive-git


Description
---

The following grant statement that grants to a role that does not exist 
succeeds, but it should result in an error.

 grant all on t1 to role nosuchrole;

Patch also fixes the handling of role names in some cases to be case 
insensitive.


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 4b4f4f2 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HivePrincipal.java
 62b8994 
  ql/src/test/queries/clientnegative/authorization_role_grant_nosuchrole.q 
PRE-CREATION 
  ql/src/test/queries/clientnegative/authorization_table_grant_nosuchrole.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/authorization_1_sql_std.q 79ae17a 
  ql/src/test/queries/clientpositive/authorization_role_grant1.q f89d0dc 
  ql/src/test/queries/clientpositive/authorization_role_grant2.q 984d7ed 
  ql/src/test/results/clientnegative/authorization_role_grant_nosuchrole.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/authorization_table_grant_nosuchrole.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/authorization_1_sql_std.q.out 718ff31 
  ql/src/test/results/clientpositive/authorization_role_grant1.q.out 3c846eb 
  ql/src/test/results/clientpositive/authorization_role_grant2.q.out 1e8f88a 

Diff: https://reviews.apache.org/r/21289/diff/


Testing
---

New tests included


Thanks,

Thejas Nair



[jira] [Updated] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator

2014-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6901:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Xuefu!

 Explain plan doesn't show operator tree for the fetch operator
 --

 Key: HIVE-6901
 URL: https://issues.apache.org/jira/browse/HIVE-6901
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-6109.10.patch, HIVE-6901.1.patch, 
 HIVE-6901.2.patch, HIVE-6901.3.patch, HIVE-6901.4.patch, HIVE-6901.5.patch, 
 HIVE-6901.6.patch, HIVE-6901.7.patch, HIVE-6901.8.patch, HIVE-6901.9.patch, 
 HIVE-6901.patch


 Explaining a simple select query that involves a MR phase doesn't show the 
 processor tree for the fetch operator.
 {code}
 hive> explain select d from test;
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Map Operator Tree:
 ...
   Stage: Stage-0
 Fetch Operator
   limit: -1
 {code}
 It would be nice if the operator tree is shown even if there is only one node.
 Please note that in local execution, the operator tree is complete:
 {code}
 hive> explain select * from test;
 OK
 STAGE DEPENDENCIES:
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-0
 Fetch Operator
   limit: -1
   Processor Tree:
 TableScan
   alias: test
   Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column 
 stats: NONE
   Select Operator
 expressions: d (type: int)
 outputColumnNames: _col0
 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE 
 Column stats: NONE
 ListSink
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7031) Utiltites.createEmptyFile uses File.Separator instead of Path.Separator to create an empty file in HDFS

2014-05-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993787#comment-13993787
 ] 

Ashutosh Chauhan commented on HIVE-7031:


+1

 Utiltites.createEmptyFile uses File.Separator instead of Path.Separator to 
 create an empty file in HDFS
 ---

 Key: HIVE-7031
 URL: https://issues.apache.org/jira/browse/HIVE-7031
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.14.0

 Attachments: HIVE-7031.1.patch


 This leads to inconsistent HDFS naming for empty partitions/tables, where a 
 file might be named as 
 hdfs://headnode0:9000/hive/scratch/hive_2014-04-07_22-39-52_649_4046112898053848089-1/-mr-10010\0 
 on the Windows operating system
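 For illustration, a small Java sketch of the underlying pitfall (a 
 hypothetical scratch-dir name is assumed; this is not the patch itself). 
 java.io.File.separator is platform-dependent, while 
 org.apache.hadoop.fs.Path.SEPARATOR is always "/":
 {code}
 import org.apache.hadoop.fs.Path;

 public class SeparatorDemo {
   public static void main(String[] args) {
     String scratchDir = "hdfs://headnode0:9000/hive/scratch/job-1"; // hypothetical
     // Platform-dependent: on Windows, File.separator is "\", which is not a
     // valid HDFS path separator and yields names like ".../-mr-10010\...".
     String bad = scratchDir + java.io.File.separator + "-mr-10010";
     // Portable: Path.SEPARATOR is "/" on every platform.
     String good = scratchDir + Path.SEPARATOR + "-mr-10010";
     System.out.println(bad);
     System.out.println(good);
   }
 }
 {code}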



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6438) Sort query result for test, removing order by clause

2014-05-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992600#comment-13992600
 ] 

Hive QA commented on HIVE-6438:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12643697/HIVE-6438.4.patch.txt

{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 5428 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join34
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join35
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters_overlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_map_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonblock_op_deduplicate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union24
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/143/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/143/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12643697

 Sort query result for test, removing order by clause 
 -

 Key: HIVE-6438
 URL: https://issues.apache.org/jira/browse/HIVE-6438
 Project: Hive
  Issue Type: Improvement
  Components: Testing Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6438.1.patch.txt, HIVE-6438.2.patch.txt, 
 HIVE-6438.3.patch.txt, HIVE-6438.4.patch.txt


 For acquiring consistent output across various hadoop versions, most queries 
 have an order-by clause. If we support a test declaration similar to 
 SORT_BEFORE_DIFF, which sorts output per query, we can remove the order-by 
 clauses and reduce the test time.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 21168: HIVE-6999: Add streaming mode to PTFs

2014-05-15 Thread Harish Butani


 On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java, line 58
  https://reviews.apache.org/r/21168/diff/1/?file=576144#file576144line58
 
  Can you add a comment on why we need to keep track of the first row 
  processed in Map?

done


 On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java, line 319
  https://reviews.apache.org/r/21168/diff/1/?file=576144#file576144line319
 
  Better name : outputPartRowsItr?

done


 On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java, 
  line 96
  https://reviews.apache.org/r/21168/diff/1/?file=576148#file576148line96
 
  Comment made sense. Since like those fields are not present in class 
  any more. Shall we just get rid of this?

this is needed; the field is treated as transient based on BeanInfo (get/set 
methods) in the class


 On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java, 
  line 218
  https://reviews.apache.org/r/21168/diff/1/?file=576148#file576148line218
 
  Better name: canAcceptInputAsStream?

done


 On May 9, 2014, 5:48 p.m., Ashutosh Chauhan wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java, line 449
  https://reviews.apache.org/r/21168/diff/1/?file=576143#file576143line449
 
  Instead of adding noop functions statically, we should put these 
  functions in the test/ package and then do add jar for testing. Multiple 
  reasons:
  * Better to isolate test code from production code.
  * It will also exercise add jar functionality for PTF functions for 
  which I am not sure we have coverage.
  * These functions also show up in the default list of inbuilt functions. It 
  may confuse users to wonder what good these functions are for. 
  show_functions.q failed because of this.

Agree with your comments on Noop. This was done because for testing we need a 
PTF, and Noop has some special short-circuit path for Partition handling.
But can we do this as a separate Jira? Removing references to Noop in the 
translation code is non-trivial work.


- Harish


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21168/#review42578
---


On May 14, 2014, 9:21 p.m., Harish Butani wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/21168/
 ---
 
 (Updated May 14, 2014, 9:21 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Bugs: HIVE-6999
 https://issues.apache.org/jira/browse/HIVE-6999
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 There is a set of use cases where the Table Function can operate on a 
 Partition row by row, or on a subset (window) of rows, as it is being 
 streamed to it.
 Windowing has a couple of use cases of this: processing of Rank functions, 
 processing of Window Aggregations.
 But this is a generic concept: any analysis that operates on an Ordered 
 partition may be able to operate in Streaming mode.
 This patch introduces streaming mode in PTFs and provides the mechanics to 
 handle PTF chains that contain both modes of PTFs.
 Subsequent patches will introduce Streaming mode for Windowing.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 3bb8fa9 
   ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java 4d314b7 
   ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java 34aebf0 
   ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/NoopStreaming.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/NoopWithMapStreaming.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java 
 1087bbf 
   ql/src/test/queries/clientpositive/ptf_streaming.q PRE-CREATION 
   ql/src/test/results/clientpositive/ptf_streaming.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/21168/diff/
 
 
 Testing
 ---
 
 added new tests
 
 
 Thanks,
 
 Harish Butani
 




[jira] [Updated] (HIVE-6601) alter database commands should support schema synonym keyword

2014-05-15 Thread Abdelrahman Shettia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abdelrahman Shettia updated HIVE-6601:
--

Assignee: Abdelrahman Shettia

 alter database commands should support schema synonym keyword
 -

 Key: HIVE-6601
 URL: https://issues.apache.org/jira/browse/HIVE-6601
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Abdelrahman Shettia

 It should be possible to use "alter schema" as an alternative to "alter 
 database", but the syntax is not currently supported.
 {code}
 alter schema db1 set owner user x;  
 NoViableAltException(215@[])
 FAILED: ParseException line 1:6 cannot recognize input near 'schema' 'db1' 
 'set' in alter statement
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7040) TCP KeepAlive for HiveServer2

2014-05-15 Thread JIRA
Nicolas Thiébaud created HIVE-7040:
--

 Summary: TCP KeepAlive for HiveServer2
 Key: HIVE-7040
 URL: https://issues.apache.org/jira/browse/HIVE-7040
 Project: Hive
  Issue Type: Improvement
  Components: Server Infrastructure
Reporter: Nicolas Thiébaud
 Attachments: HIVE-7040.patch

Implement TCP KeepAlive for HiveServer2 to avoid half-open connections.

A setting could be added

{code}
<property>
  <name>hive.server2.tcp.keepalive</name>
  <value>true</value>
  <description>Whether to enable TCP keepalive for Hive Server 2</description>
</property>
{code}
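
For illustration only, a minimal sketch of what honoring such a flag could 
look like on a server accept loop (hypothetical code; HiveServer2's actual 
Thrift transport setup is more involved):

{code}
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveAcceptLoop {
  public static void main(String[] args) throws Exception {
    boolean tcpKeepAlive = true; // would come from hive.server2.tcp.keepalive
    try (ServerSocket server = new ServerSocket(10000)) {
      while (true) {
        Socket client = server.accept();
        if (tcpKeepAlive) {
          // SO_KEEPALIVE lets the OS probe idle peers and eventually reap
          // half-open connections whose remote end silently disappeared.
          client.setKeepAlive(true);
        }
        // ... hand the socket off to a worker thread ...
      }
    }
  }
}
{code}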



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7041) DoubleWritable/ByteWritable should extend their hadoop counterparts

2014-05-15 Thread Jason Dere (JIRA)
Jason Dere created HIVE-7041:


 Summary: DoubleWritable/ByteWritable should extend their hadoop 
counterparts
 Key: HIVE-7041
 URL: https://issues.apache.org/jira/browse/HIVE-7041
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-7041.1.patch

Hive has its own implementations of ByteWritable/DoubleWritable/ShortWritable. 
We cannot replace usage of these classes since doing so would break 3rd-party 
UDFs/SerDes; however, we can at least extend from the Hadoop version of these 
classes where possible to avoid duplicate code.

When Hive finally moves to version 1.0 we might want to consider removing use 
of these Hive-specific writables and switching over to using the Hadoop version 
of these classes.

ShortWritable didn't exist in Hadoop until 2.x so it looks like we can't do it 
with this class until 0.20/1.x support is dropped from Hive.
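
A sketch of the shape such a change could take for DoubleWritable (assumed 
for illustration; see the attached patch for the actual change):

{code}
package org.apache.hadoop.hive.serde2.io;

// Hive's class keeps its name and package so existing 3rd-party UDFs/SerDes
// still compile, but the read/write/compare logic now comes from Hadoop.
public class DoubleWritable extends org.apache.hadoop.io.DoubleWritable {
  public DoubleWritable() {
    super();
  }

  public DoubleWritable(double value) {
    super(value);
  }
}
{code}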



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6692) Location for new table or partition should be a write entity

2014-05-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995664#comment-13995664
 ] 

Thejas M Nair commented on HIVE-6692:
-

[~navis] FYI, I will be working on making changes for SQL std auth to work 
with this soon (in a week or two). And then we can make the change in this 
jira without breaking it.



 Location for new table or partition should be a write entity
 

 Key: HIVE-6692
 URL: https://issues.apache.org/jira/browse/HIVE-6692
 Project: Hive
  Issue Type: Task
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6692.1.patch.txt


 Locations for create table and alter table add partition should be write 
 entities.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Apache Hive 0.13.1

2014-05-15 Thread Sushanth Sowmyan
Hi Folks,

One final hiccup before actually releasing the RC: I currently do not
seem to be part of the hive unix group on apache, which prevents me
from publishing maven artifacts or adding myself to the KEYS list. So
as not to be blocked on the release process, I've generated the
tarballs and signatures for Apache Hive 0.13.1 Release Candidate 0
here:

https://people.apache.org/~khorgath/releases/0.13.1_RC0/artifacts/

Maven artifacts are not yet available, so you'll need to generate them
locally from the source tarball above.

Source tag for RC0 is at
https://svn.apache.org/repos/asf/hive/tags/release-0.13.1-rc0/

I also put up my public key over at
https://people.apache.org/~khorgath/releases/0.13.1_RC0/artifacts/khorgath.public_key
in the meanwhile for verification purposes.

Voting has not yet begun because I have not yet released the maven
artifacts and the KEYS file has not yet been updated, so I have not
formally sent an official voting mail - I will do so as soon as
I'm able. Please consider this an early preview for testing; I do not
expect to change these files for RC0 itself.

Thanks!
-Sushanth

On Thu, May 8, 2014 at 2:26 PM, Sushanth Sowmyan khorg...@gmail.com wrote:
 Hi folks,

 As an update, HIVE-6945 has some 0.13.1-specific test fixes appended
 which make its tests pass, the test that was failing with HIVE-6826 is
 now succeeding (flaky test), and Thejas has confirmed with me that the
 issue with HIVE-6846 is a test problem, not a product problem,
 relating to an incorrect expectation in the test.

 With those resolved, there are no more blockers, and no additional
 jiras that have been requested to be part of this release, so I'll go
 ahead and spin out RC0 now, and will also commit all those patches to
 the 0.13 branch. :)



 On Wed, May 7, 2014 at 7:22 PM, Sushanth Sowmyan khorg...@gmail.com wrote:
 After much experimentation with git bisect (which is very powerful),
 I've narrowed down the test failures reported yesterday. The failures
 are appearing from the following:

 HIVE-6945:
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullformatCTAS
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_alter
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_tblproperties
 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unset_table_view_property
 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_unset_table_property

 HIVE-6846:
 org.apache.hive.service.cli.TestScratchDir.testLocalScratchDirs

 HIVE-6826:
 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6


 Of the above, the second jira was already in 0.13.0. I'll comment on
 those jiras, asking the committers involved in those bugs to help
 debug the issue. If anyone is interested in the git bisect logs
 for these, they're available at
 http://people.apache.org/~khorgath/releases/0.13.1_RC0/test_failures/


 On Tue, May 6, 2014 at 6:41 PM, Sushanth Sowmyan khorg...@gmail.com wrote:
 Also, I wanted to throw in one more bit for those of you that are
 interested in tinkering along :

 http://people.apache.org/~khorgath/releases/0.13.1_RC0/relprep.pl
 http://people.apache.org/~khorgath/releases/0.13.1_RC0/requested_jiras

 This is the script and config file I'm using to generate this release.

 It's very much a hack right now, and I hope to improve it to
 streamline releases in the future, but how it can be used right now is
 this way:

 a) Put it in a hive git repo with no uncommitted changes - this
 script will check out a new branch and commit things to that branch,
 so you want to make sure you have a clean repo.
 b) Put the file requested_jiras in that dir as well.
 c) Run the script from there.

 It checks the differences between the branch being released
 (branch-0.13 is currently hardcoded as a global), looks at all
 the commit logs in trunk that correspond to the jiras requested in the
 requested_jiras file, sorts them in the order they were committed,
 then checks out a new branch called relprep-branch-0.13-timestamp,
 and attempts to cherry-pick those commits in.

 For some patches, this will not work, so there is an override
 mechanism provided by entries in the requested_jiras file, as can be
 observed in the file I mention above.

 At the end of it, you'll have your 0.13.1 repo reproduction to test
 against if you so desire.

 Known Bugs :

 a) I use system(...) or die ...;, which is faulty in that the
 die code will never be reached. I need to fix this, but all the
 system calls were working for me, and I'd much rather focus on the
 release now and improve this script later. This is a TODO.
 b) Some patches (those generated with --no-prefix) don't work with
 older versions of git. You'll need a 1.8.x git for them, or you have
 to generate git patches without --no-prefix.



 On Tue, May 6, 2014 at 6:21 PM, Sushanth Sowmyan khorg...@gmail.com 

[jira] [Updated] (HIVE-6999) Add streaming mode to PTFs

2014-05-15 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6999:


Status: Open  (was: Patch Available)

 Add streaming mode to PTFs
 --

 Key: HIVE-6999
 URL: https://issues.apache.org/jira/browse/HIVE-6999
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0, 0.12.0, 0.11.0
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6999.1.patch, HIVE-6999.2.patch, HIVE-6999.3.patch


 There is a set of use cases where the Table Function can operate on a 
 Partition row by row, or on a subset (window) of rows, as it is being 
 streamed to it.
 - Windowing has a couple of use cases of this: processing of Rank functions, 
 processing of Window Aggregations.
 - But this is a generic concept: any analysis that operates on an Ordered 
 partition may be able to operate in Streaming mode.
 This patch introduces streaming mode in PTFs and provides the mechanics to 
 handle PTF chains that contain both modes of PTFs.
 Subsequent patches will introduce Streaming mode for Windowing.
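 For readers unfamiliar with the feature, a rough sketch of what a 
 streaming-capable table function contract might look like (purely 
 illustrative; the real API lives in TableFunctionEvaluator and differs in 
 detail):
 {code}
 // Hypothetical interface for illustration only.
 interface StreamingTableFunction<I, O> {
   // Whether this PTF can consume rows as they arrive instead of
   // requiring the whole partition to be materialized first.
   boolean canAcceptInputAsStream();

   void startPartition();

   // Process one row; may emit nothing until enough rows have been
   // seen (e.g. a window's worth), hence the possibly-empty result.
   java.util.List<O> processRow(I row);

   // Flush whatever is still buffered at partition end.
   java.util.List<O> finishPartition();
 }
 {code}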



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 21168: HIVE-6999: Add streaming mode to PTFs

2014-05-15 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21168/
---

(Updated May 14, 2014, 9:21 p.m.)


Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-6999
https://issues.apache.org/jira/browse/HIVE-6999


Repository: hive-git


Description
---

There is a set of use cases where the Table Function can operate on a 
Partition row by row, or on a subset (window) of rows, as it is being streamed 
to it.
Windowing has a couple of use cases of this: processing of Rank functions, 
processing of Window Aggregations.
But this is a generic concept: any analysis that operates on an Ordered 
partition may be able to operate in Streaming mode.
This patch introduces streaming mode in PTFs and provides the mechanics to 
handle PTF chains that contain both modes of PTFs.
Subsequent patches will introduce Streaming mode for Windowing.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 3bb8fa9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java 4d314b7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java 34aebf0 
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/NoopStreaming.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/NoopWithMapStreaming.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java 
1087bbf 
  ql/src/test/queries/clientpositive/ptf_streaming.q PRE-CREATION 
  ql/src/test/results/clientpositive/ptf_streaming.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/21168/diff/


Testing
---

added new tests


Thanks,

Harish Butani



[jira] [Commented] (HIVE-6999) Add streaming mode to PTFs

2014-05-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998255#comment-13998255
 ] 

Ashutosh Chauhan commented on HIVE-6999:


+1

 Add streaming mode to PTFs
 --

 Key: HIVE-6999
 URL: https://issues.apache.org/jira/browse/HIVE-6999
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.11.0, 0.12.0, 0.13.0
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6999.1.patch, HIVE-6999.2.patch


 There is a set of use cases where the Table Function can operate on a 
 Partition row by row, or on a subset (window) of rows, as it is being 
 streamed to it.
 - Windowing has a couple of use cases of this: processing of Rank functions, 
 processing of Window Aggregations.
 - But this is a generic concept: any analysis that operates on an Ordered 
 partition may be able to operate in Streaming mode.
 This patch introduces streaming mode in PTFs and provides the mechanics to 
 handle PTF chains that contain both modes of PTFs.
 Subsequent patches will introduce Streaming mode for Windowing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5538) Turn on vectorization by default.

2014-05-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-5538:
---

Status: Patch Available  (was: Open)

 Turn on vectorization by default.
 -

 Key: HIVE-5538
 URL: https://issues.apache.org/jira/browse/HIVE-5538
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch, HIVE-5538.3.patch, 
 HIVE-5538.4.patch, HIVE-5538.5.patch


   Vectorization should be turned on by default, so that users don't have to 
 specifically enable vectorization. 
   Vectorization code validates and ensures that a query falls back to row 
 mode if it is not supported on the vectorized code path. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work started] (HIVE-7066) hive-exec jar is missing avro-mapred

2014-05-15 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-7066 started by David Chen.

 hive-exec jar is missing avro-mapred
 

 Key: HIVE-7066
 URL: https://issues.apache.org/jira/browse/HIVE-7066
 Project: Hive
  Issue Type: Bug
Reporter: David Chen
Assignee: David Chen

 Running a simple query that reads an Avro table caused the following 
 exception to be thrown on the cluster side:
 {code}
 java.lang.RuntimeException: 
 org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:394)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334)
   ... 13 more
 Caused by: java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45)
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:148)
   at 
 

[jira] [Updated] (HIVE-7066) hive-exec jar is missing avro-mapred

2014-05-15 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-7066:
-

Attachment: HIVE-7066.1.patch

 hive-exec jar is missing avro-mapred
 

 Key: HIVE-7066
 URL: https://issues.apache.org/jira/browse/HIVE-7066
 Project: Hive
  Issue Type: Bug
Reporter: David Chen
Assignee: David Chen
 Attachments: HIVE-7066.1.patch


 Running a simple query that reads an Avro table caused the following 
 exception to be thrown on the cluster side:
 {code}
 java.lang.RuntimeException: 
 org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:394)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334)
   ... 13 more
 Caused by: java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45)
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476)
   at 
 

[jira] [Commented] (HIVE-7066) hive-exec jar is missing avro-mapred

2014-05-15 Thread David Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998278#comment-13998278
 ] 

David Chen commented on HIVE-7066:
--

RB: https://reviews.apache.org/r/21471

 hive-exec jar is missing avro-mapred
 

 Key: HIVE-7066
 URL: https://issues.apache.org/jira/browse/HIVE-7066
 Project: Hive
  Issue Type: Bug
Reporter: David Chen
Assignee: David Chen
 Attachments: HIVE-7066.1.patch


 Running a simple query that reads an Avro table caused the following 
 exception to be thrown on the cluster side:
 {code}
 java.lang.RuntimeException: 
 org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:394)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334)
   ... 13 more
 Caused by: java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45)
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56)
   at 
 

[jira] [Updated] (HIVE-7067) Min() and Max() on Timestamp and Date columns for ORC returns wrong results

2014-05-15 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7067:
-

Attachment: HIVE-7067.1.patch

 Min() and Max() on Timestamp and Date columns for ORC returns wrong results
 ---

 Key: HIVE-7067
 URL: https://issues.apache.org/jira/browse/HIVE-7067
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-7067.1.patch


 min() and max() of timestamp and date columns of ORC table returns wrong 
 results. The reason for that is when ORC creates object inspectors for date 
 and timestamp it uses JAVA primitive objects as opposed to WRITABLE objects. 
 When get() is performed on java primitive objects, a reference to the 
 underlying object is returned whereas when get() is performed on writable 
 objects, a copy of the underlying object is returned. 
 Fix is to change the object inspector creation to return writable objects for 
 timestamp and date.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking

2014-05-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6207:


Fix Version/s: (was: 0.13.0)
   0.14.0

 Integrate HCatalog with locking
 ---

 Key: HIVE-6207
 URL: https://issues.apache.org/jira/browse/HIVE-6207
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.14.0


 HCatalog currently ignores any locks created by Hive users.  It should 
 respect the locks Hive creates as well as create locks itself when locking is 
 configured.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead

2014-05-15 Thread Sergey Shelukhin


 On May 9, 2014, 1:58 a.m., Gunther Hagleitner wrote:
  serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java, line 147
  https://reviews.apache.org/r/18936/diff/13/?file=572150#file572150line147
 
  randomaccess doesn't extend output?

no


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18936/#review42555
---


On May 1, 2014, 2:29 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18936/
 ---
 
 (Updated May 1, 2014, 2:29 a.m.)
 
 
 Review request for hive, Gopal V and Gunther Hagleitner.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 See JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 604bea7 
   conf/hive-default.xml.template 2552560 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 5fe35a5 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java
  142bfd8 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bf9d4c1 
   ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 
 f5d4670 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java b93ea7a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 175d3ab 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
  8854b19 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
 9df425b 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
 64f0be2 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java
  008a8db 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
  988959f 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
  55b7415 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java e392592 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
 eef7656 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java
  d4be78d 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
 3077d75 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 
 f7b499b 
   ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 157d072 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java
  PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
  65e3779 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
  093da55 
   ql/src/test/queries/clientpositive/mapjoin_decimal.q b65a7be 
   ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6 
   ql/src/test/queries/clientpositive/tez_union.q f80d94c 
   ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out 8350670 
   ql/src/test/results/clientpositive/tez/mapjoin_decimal.q.out 3c55b5c 
   ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 284cc03 
   serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29 
   serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION 
   
 serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java
  9079b9d 
   
 serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/OutputByteBuffer.java
  1b09d41 
   serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 
 5870884 
   
 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java
  bab505e 
   serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 
 6f344bb 
   serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd 
   serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java 
 a99c7b4 
   serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java 
 435d6c6 
   serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 
 82c1263 
   
 serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java 
 b188c3f 
   
 serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java 
 caf3517 
   
 

[jira] [Updated] (HIVE-6862) add DB schema DDL and upgrade 12to13 scripts for MS SQL Server

2014-05-15 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6862:
---

Fix Version/s: 0.13.1

 add DB schema DDL and upgrade 12to13 scripts for MS SQL Server
 --

 Key: HIVE-6862
 URL: https://issues.apache.org/jira/browse/HIVE-6862
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Fix For: 0.14.0, 0.13.1

 Attachments: HIVE-6862.2.patch, HIVE-6862.3.patch, HIVE-6862.patch


 need to add a unified 0.13 script and a separate script for ACID support
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6919) hive sql std auth select query fails on partitioned tables

2014-05-15 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6919:
---

Fix Version/s: 0.13.1

 hive sql std auth select query fails on partitioned tables
 --

 Key: HIVE-6919
 URL: https://issues.apache.org/jira/browse/HIVE-6919
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Critical
 Fix For: 0.14.0, 0.13.1

 Attachments: HIVE-6919.1.patch


 {code}
 analyze table studentparttab30k partition (ds) compute statistics;
 Error: Error while compiling statement: FAILED: HiveAccessControlException 
 Permission denied. Principal [name=hadoopqa, type=USER] does not have 
 following privileges on Object [type=PARTITION, name=null] : [SELECT] 
 (state=42000,code=4)
 {code}
 Sql std auth is supposed to ignore partition level objects for privilege 
 checks, but that is not working as intended.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6893) out of sequence error in HiveMetastore server

2014-05-15 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6893:


Fix Version/s: (was: 0.13.0)
   0.14.0

 out of sequence error in HiveMetastore server
 -

 Key: HIVE-6893
 URL: https://issues.apache.org/jira/browse/HIVE-6893
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Romain Rigaux
Assignee: Naveen Gangam
 Fix For: 0.14.0

 Attachments: HIVE-6893.1.patch


 Calls listing databases or tables fail. It seems to be a concurrency problem.
 {code}
 2014-03-06 05:34:00,785 ERROR hive.log: 
 org.apache.thrift.TApplicationException: get_databases failed: out of 
 sequence response
 at 
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:472)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:459)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:648)
 at 
 org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:66)
 at 
 org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:278)
 at sun.reflect.GeneratedMethodAccessor323.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:62)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
 at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:582)
 at 
 org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:57)
 at com.sun.proxy.$Proxy9.getSchemas(Unknown Source)
 at 
 org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:192)
 at 
 org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:263)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1433)
 at 
 org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1418)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at 
 org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:724)
 {code}
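 An "out of sequence response" from Thrift typically means two threads 
 interleaved request/response pairs on a single client, since a Thrift client 
 is not thread-safe. A hedged sketch of the standard mitigation (illustrative 
 only; ThriftHiveMetastoreClientStub is a hypothetical stand-in for the 
 generated client, and this is not the attached patch):
 {code}
 import java.util.List;

 // Hypothetical stand-in for the generated Thrift client interface.
 interface ThriftHiveMetastoreClientStub {
   List<String> get_databases(String pattern) throws Exception;
 }

 // Concurrent callers on one Thrift client interleave request/response
 // frames, so the reader sees mismatched sequence ids. Serializing
 // access per client avoids that.
 public class SynchronizedMetastoreClient {
   private final ThriftHiveMetastoreClientStub client;

   public SynchronizedMetastoreClient(ThriftHiveMetastoreClientStub client) {
     this.client = client;
   }

   public synchronized List<String> getDatabases(String pattern) throws Exception {
     return client.get_databases(pattern); // one call in flight at a time
   }
 }
 {code}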



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7037) Add additional tests for transform clauses with Tez

2014-05-15 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-7037:
-

Status: Patch Available  (was: Open)

 Add additional tests for transform clauses with Tez
 ---

 Key: HIVE-7037
 URL: https://issues.apache.org/jira/browse/HIVE-7037
 Project: Hive
  Issue Type: Bug
  Components: Tez
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-7037.1.patch


 Enabling some q tests for Tez wrt ScriptOperator/Stream/Transform.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19984: Beeline should accept -i option to Initializing a SQL file

2014-05-15 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19984/#review42621
---


Can you also include a unit test for this ? It can go into 
TestBeeLineWithArgs.java


- Thejas Nair


On May 7, 2014, 4:10 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19984/
 ---
 
 (Updated May 7, 2014, 4:10 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6561
 https://issues.apache.org/jira/browse/HIVE-6561
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Hive CLI has an -i option. From Hive CLI help:
 {code}
 ...
  -i <filename>                    Initialization SQL file
 ...
 {code}
 
 However, Beeline has no such option:
 {code}
 xzhang@xzlt:~/apa/hive3$ 
 ./packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/bin/beeline
  -u jdbc:hive2:// -i hive.rc
 ...
 Connected to: Apache Hive (version 0.14.0-SNAPSHOT)
 Driver: Hive JDBC (version 0.14.0-SNAPSHOT)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 -i (No such file or directory)
 Property url is required
 Beeline version 0.14.0-SNAPSHOT by Apache Hive
 ...
 {code}
 
 
 Diffs
 -
 
   beeline/src/java/org/apache/hive/beeline/BeeLine.java 5773109 
   beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java 44cabdf 
   beeline/src/java/org/apache/hive/beeline/Commands.java 493f963 
   beeline/src/main/resources/BeeLine.properties 697c29a 
 
 Diff: https://reviews.apache.org/r/19984/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Navis Ryu
 




[jira] [Updated] (HIVE-7000) Several issues with javadoc generation

2014-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7000:
---

Status: Patch Available  (was: Open)

 Several issues with javadoc generation
 --

 Key: HIVE-7000
 URL: https://issues.apache.org/jira/browse/HIVE-7000
 Project: Hive
  Issue Type: Improvement
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-7000.1.patch


 1.
 Ran 'mvn  javadoc:javadoc -Phadoop-2'.  Encountered several issues
 - Generated classes are included in the javadoc
 - generation fails in the top-level hcatalog folder because its src folder 
 contains no java files.
 Patch attached to fix these issues.
 2.
 Tried mvn javadoc:aggregate -Phadoop-2 
 - cannot get an aggregated javadoc for all of hive
 - tried setting 'aggregate' parameter to true. Didn't work
 There are several questions in StackOverflow about multiple project javadoc. 
 Seems like this is broken. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5342) Remove pre hadoop-0.20.0 related codes

2014-05-15 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-5342:
-

Status: Patch Available  (was: Open)

 Remove pre hadoop-0.20.0 related codes
 --

 Key: HIVE-5342
 URL: https://issues.apache.org/jira/browse/HIVE-5342
 Project: Hive
  Issue Type: Task
Reporter: Navis
Assignee: Jason Dere
Priority: Trivial
 Attachments: D13047.1.patch, HIVE-5342.1.patch, HIVE-5342.2.patch


 Recently, we discussed dropping support for hadoop-0.20.0. Whether or not 
 that happens, the 0.17-related code can be removed beforehand.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7049) Unable to deserialize AVRO data when file schema and record schema are different and nullable

2014-05-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997714#comment-13997714
 ] 

Xuefu Zhang commented on HIVE-7049:
---

[~kamrul] If Hive can support the AVRO schema resolutions you mentioned, I 
don't see any obstacles. However, the fix in your patch seems to have a 
problem with decimal, which may need more deliberation.

 Unable to deserialize AVRO data when file schema and record schema are 
 different and nullable
 -

 Key: HIVE-7049
 URL: https://issues.apache.org/jira/browse/HIVE-7049
 Project: Hive
  Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-7049.1.patch


 It mainly happens when 
 1) the file schema and record schema are not the same, and
 2) the record schema is nullable but the file schema is not.
 The potential code location is at class AvroDeserialize
  
 {noformat}
  if(AvroSerdeUtils.isNullableType(recordSchema)) {
   return deserializeNullableUnion(datum, fileSchema, recordSchema, 
 columnType);
 }
 {noformat}
 In the above code snippet, recordSchema is checked for nullability, but the 
 file schema is not checked.
 I tested with these values:
 {noformat}
 recordSchema= [null,string]
 fileSchema= string
 {noformat}
 And I got the following exception (line numbers might not be the same due to 
 my debugged code version):
 {noformat}
 org.apache.avro.AvroRuntimeException: Not a union: string 
 at org.apache.avro.Schema.getTypes(Schema.java:272)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserializeNullableUnion(AvroDeserializer.java:275)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.worker(AvroDeserializer.java:205)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.workerBase(AvroDeserializer.java:188)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:174)
 at 
 org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.verifyNullableType(TestAvroDeserializer.java:487)
 at 
 org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeNullableTypes(TestAvroDeserializer.java:407)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6374) Hive job submitted with non-default name node (fs.default.name) doesn't process locations properly

2014-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6374:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Benjamin!

 Hive job submitted with non-default name node (fs.default.name) doesn't 
 process locations properly 
 ---

 Key: HIVE-6374
 URL: https://issues.apache.org/jira/browse/HIVE-6374
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: Any
Reporter: Benjamin Zhitomirsky
Assignee: Benjamin Zhitomirsky
 Fix For: 0.14.0

 Attachments: Design of the fix HIVE-6374.docx, hive-6374.1.patch, 
 hive-6374.3.patch, hive-6374.patch

   Original Estimate: 168h
  Remaining Estimate: 168h

 Create table/index/database and add partition DDL doesn't work properly if 
 all following conditions are true:
 - Metastore service is used
 - fs.default.name is specified and it differs from the default one
 - Location is not specified or specified as a not fully qualified URI
 The root cause of this behavior is that the Hive client doesn't pass its 
 configuration context to the metastore service, which tries to resolve the 
 paths. The fix is to resolve the path in the Hive client when 
 fs.default.name is specified and differs from the default one (this is much 
 easier than starting to pass the context, which would be a major change).
 The CR will be submitted shortly after the tests are done.
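 As an illustration (table name and namenode URI are hypothetical), a DDL like 
 the following:
 {code:sql}
 -- with fs.default.name=hdfs://nn2:8020 differing from the metastore's default
 CREATE TABLE t (c INT) LOCATION '/user/hive/warehouse/t';
 {code}
 used to be resolved by the metastore against its own default filesystem; with 
 the fix, the client qualifies the location to 
 hdfs://nn2:8020/user/hive/warehouse/t before sending it to the metastore.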



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename

2014-05-15 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6938:
---

Attachment: HIVE-6938.2.patch

Reuploading the exact same patch to trigger precommits.

 Add Support for Parquet Column Rename
 -

 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks
 Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch, HIVE-6938.2.patch


 Parquet was originally introduced without 'replace columns' support in ql. 
 In addition, the SerDe's default behavior for Parquet is to access columns by 
 name rather than by index. 
 Parquet should allow either index-based or name-based column access, because 
 it can support both.
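 A sketch of how the two access modes can be toggled (the exact table property 
 name is an assumption based on the patch):
 {code:sql}
 -- name-based access (default): a renamed column no longer matches the file
 ALTER TABLE parquet_tbl CHANGE old_col new_col INT;
 -- index-based access: column i in the table schema maps to column i in the
 -- file, so renames are safe
 ALTER TABLE parquet_tbl SET TBLPROPERTIES ('parquet.column.index.access'='true');
 {code}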



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7012) Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer

2014-05-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7012:
---

Assignee: Navis
  Status: Open  (was: Patch Available)

reduce_deduplicate_extended.q, ppd.q, fetch_aggregation.q failures might be 
relevant. [~navis] can you take a look?

 Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
 

 Key: HIVE-7012
 URL: https://issues.apache.org/jira/browse/HIVE-7012
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Sun Rui
Assignee: Navis
 Attachments: HIVE-7012.1.patch.txt, HIVE-7012.2.patch.txt


 With HIVE 0.13.0, run the following test case:
 {code:sql}
 create table src(key bigint, value string);
 select  
count(distinct key) as col0
 from src
 order by col0;
 {code}
 The following exception will be thrown:
 {noformat}
 java.lang.RuntimeException: Error in configuring object
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at 
 org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 9 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:173)
   ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field _col0 from 
 [0:reducesinkkey0]
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:150)
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:79)
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:288)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:166)
   ... 14 more
 {noformat}
 This issue is related to HIVE-6455. When hive.optimize.reducededuplication is 
 set to false, the issue goes away.
 Logical plan when hive.optimize.reducededuplication=false:
 {noformat}
 src 
   TableScan (TS_0)
 alias: src
 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
 Select Operator (SEL_1)
   expressions: key (type: bigint)
   outputColumnNames: key
   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: 
 NONE
   Group By Operator (GBY_2)
 aggregations: count(DISTINCT key)
 keys: key (type: bigint)
 mode: hash
 outputColumnNames: _col0, _col1
 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: 
 NONE
 Reduce Output Operator (RS_3)
   distinctColumnIndices:
   key expressions: _col0 (type: bigint)
   DistributionKeys: 0
   sort order: +
   OutputKeyColumnNames: _col0
   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
 stats: NONE
   Group By Operator (GBY_4)
 aggregations: count(DISTINCT KEY._col0:0._col0)
 mode: mergepartial
 outputColumnNames: _col0
 Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE 
 Column stats: NONE
 Select Operator (SEL_5)
   expressions: _col0 (type: bigint)
   outputColumnNames: _col0
   Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE 
 Column stats: NONE
   Reduce Output Operator (RS_6)
 key expressions: _col0 

[jira] [Created] (HIVE-7063) Optimize for the Top N within a Group use case

2014-05-15 Thread Harish Butani (JIRA)
Harish Butani created HIVE-7063:
---

 Summary: Optimize for the Top N within a Group use case
 Key: HIVE-7063
 URL: https://issues.apache.org/jira/browse/HIVE-7063
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani


It is common to rank within a Group/Partition and then only return the Top N 
entries within each Group.
With Streaming mode for Windowing, we should push the post filter on the rank 
into the Windowing processing as a Limit expression.
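A minimal sketch of the target pattern, assuming a table emp(dept, name, salary):

{code:sql}
-- Top 3 earners per department; today the filter on r is applied after all
-- rows are windowed, which this issue would push into Windowing as a Limit.
SELECT dept, name, salary
FROM (
  SELECT dept, name, salary,
         rank() OVER (PARTITION BY dept ORDER BY salary DESC) AS r
  FROM emp
) ranked
WHERE r <= 3;
{code}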



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6976) Show query id only when there's jobs on the cluster

2014-05-15 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6976:
-

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for the review Sergey!

 Show query id only when there's jobs on the cluster
 ---

 Key: HIVE-6976
 URL: https://issues.apache.org/jira/browse/HIVE-6976
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-6976.1.patch


 No need to print the query id for local-only execution.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6187) Add test to verify that DESCRIBE TABLE works with quoted table names

2014-05-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995623#comment-13995623
 ] 

Hive QA commented on HIVE-6187:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12644392/HIVE-6187.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5504 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/178/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/178/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12644392

 Add test to verify that DESCRIBE TABLE works with quoted table names
 

 Key: HIVE-6187
 URL: https://issues.apache.org/jira/browse/HIVE-6187
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Andy Mok
 Attachments: HIVE-6187.1.patch


 Backticks around tables named after special keywords, such as items, allow us 
 to create, drop, and alter the table. For example
 {code:sql}
 CREATE TABLE foo.`items` (bar INT);
 DROP TABLE foo.`items`;
 ALTER TABLE `items` RENAME TO `items_`;
 {code}
 However, we cannot call
 {code:sql}
 DESCRIBE foo.`items`;
 DESCRIBE `items`;
 {code}
 The DESCRIBE query does not permit backticks to surround table names. The 
 error returned is
 {code:sql}
 FAILED: SemanticException [Error 10001]: Table not found `items`
 {code} 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5810) create a function add_date as exists in mysql

2014-05-15 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-5810:
-

Status: Open  (was: Patch Available)

 create a function add_date   as exists in mysql 
 

 Key: HIVE-5810
 URL: https://issues.apache.org/jira/browse/HIVE-5810
 Project: Hive
  Issue Type: Improvement
Reporter: Anandha L Ranganathan
Assignee: Anandha L Ranganathan
 Attachments: HIVE-5810.2.patch, HIVE-5810.patch

   Original Estimate: 40h
  Remaining Estimate: 40h

 MySQL has ADDDATE(date, INTERVAL expr unit).
 Similarly, in Hive we can have add_date(date, unit, expr), 
 where unit is DAY/Month/Year.
 For example,
 add_date('2013-11-09','DAY',2) will return 2013-11-11.
 add_date('2013-11-09','Month',2) will return 2014-01-09.
 add_date('2013-11-09','Year',2) will return 2015-11-09.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7061) sql std auth - insert queries without overwrite should not require delete privileges

2014-05-15 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998147#comment-13998147
 ] 

Thejas M Nair commented on HIVE-7061:
-

WriteEntity types are already in hive, as part of  HIVE-5843.

 sql std auth - insert queries without overwrite should not require delete 
 privileges
 

 Key: HIVE-7061
 URL: https://issues.apache.org/jira/browse/HIVE-7061
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair

 Insert queries can do the equivalent of a delete and insert of all rows of a 
 table or partition if the overwrite keyword is used. As a result, the DELETE 
 privilege is applicable to such queries.
 However, SQL standard auth requires the DELETE privilege even for queries 
 that don't use the overwrite keyword.
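 For illustration (table names are hypothetical), only the second statement 
 replaces existing rows and therefore warrants the DELETE privilege:
 {code:sql}
 -- append-only; should require only INSERT
 INSERT INTO TABLE sales SELECT * FROM staging;
 -- replaces all existing rows; requiring DELETE as well is appropriate
 INSERT OVERWRITE TABLE sales SELECT * FROM staging;
 {code}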



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Hive and MR2

2014-05-15 Thread Azuryy Yu
Hi,
I am using hive-0.13.0 and hadoop-2.4.0.

Why must I set 'mapreduce.jobtracker.address' in yarn-site.xml? Otherwise,
there are exceptions and the job fails.

Also, 'mapreduce.jobtracker.address' can be set to any value.

The following messages are generated without setting 'mapreduce.jobtracker.address'.

Job output on the console:
Execution log at:
/tmp/test/test_20140507180505_bcd4d89f-017c-4cf4-81a3-5fa619de0ad0.log
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 1; number of reducers: 1
2014-05-07 18:06:25,782 null map = 0%,  reduce = 0%
2014-05-07 18:06:33,699 null map = 100%,  reduce = 0%
2014-05-07 18:06:34,774 null map = 0%,  reduce = 0%
2014-05-07 18:06:49,222 null map = 100%,  reduce = 100%
Ended Job = job_1399453944131_0006 with errors
Error during job, obtaining debugging information...

Container error:
2014-05-07 18:06:33,634 INFO [main]
org.apache.hadoop.hive.ql.exec.Utilities: No plan file found:
file:/tmp/test/hive_2014-05-07_18-06-08_349_1526907284076641211-1/-mr-10001/0a1c9ebe-cdb0-4adc-9e93-8f176019f19a/map.xml
2014-05-07 18:06:33,635 WARN [main] org.apache.hadoop.mapred.YarnChild:
Exception running child : java.lang.NullPointerException
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:255)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:437)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:430)
at
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
at
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:168)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)


[jira] [Created] (HIVE-7047) Support schema keyword in alter database statements

2014-05-15 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-7047:
---

 Summary: Support schema keyword in alter database statements
 Key: HIVE-7047
 URL: https://issues.apache.org/jira/browse/HIVE-7047
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema, SQL
Affects Versions: 0.13.0
Reporter: Thejas M Nair


To be consistent with the rest of the syntax, the alter database statements 
should also support the SCHEMA keyword along with the DATABASE keyword.
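For example, both of these should be accepted and behave identically (database 
name and property are illustrative):

{code:sql}
ALTER DATABASE demo SET DBPROPERTIES ('owner'='hive');
ALTER SCHEMA demo SET DBPROPERTIES ('owner'='hive');
{code}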




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7054) Support ELT UDF in vectorized mode

2014-05-15 Thread Deepesh Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepesh Khandelwal updated HIVE-7054:
-

Status: Patch Available  (was: Open)

Thanks [~rusanu] for the quick review on the review board! The attached patch 
attempts to incorporate the feedback, along with a minor update to the qtest 
results output file.

 Support ELT UDF in vectorized mode
 --

 Key: HIVE-7054
 URL: https://issues.apache.org/jira/browse/HIVE-7054
 Project: Hive
  Issue Type: New Feature
  Components: Vectorization
Affects Versions: 0.14.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.14.0

 Attachments: HIVE-7054.2.patch, HIVE-7054.patch


 Implement support for the ELT UDF in vectorized execution mode.
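 For reference, ELT(n, str1, str2, ...) returns the n-th string argument (or 
 NULL when n is out of range), e.g.:
 {code:sql}
 SELECT ELT(2, 'hive', 'tez', 'mr');   -- returns 'tez'
 {code}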



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work stopped] (HIVE-7066) hive-exec jar is missing avro-mapred

2014-05-15 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-7066 stopped by David Chen.

 hive-exec jar is missing avro-mapred
 

 Key: HIVE-7066
 URL: https://issues.apache.org/jira/browse/HIVE-7066
 Project: Hive
  Issue Type: Bug
Reporter: David Chen
Assignee: David Chen

 Running a simple query that reads an Avro table caused the following 
 exception to be thrown on the cluster side:
 {code}
 java.lang.RuntimeException: 
 org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:394)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: 
 java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
 Serialization trace:
 outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
 aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
   at 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334)
   ... 13 more
 Caused by: java.lang.IllegalArgumentException: Unable to create serializer 
 org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
 class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45)
   at 
 org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56)
   at 
 org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476)
   at 
 org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:148)
   at 
 

[jira] [Created] (HIVE-7068) Integrate AccumuloStorageHandler

2014-05-15 Thread Josh Elser (JIRA)
Josh Elser created HIVE-7068:


 Summary: Integrate AccumuloStorageHandler
 Key: HIVE-7068
 URL: https://issues.apache.org/jira/browse/HIVE-7068
 Project: Hive
  Issue Type: New Feature
Reporter: Josh Elser


[Accumulo|http://accumulo.apache.org] is a BigTable-clone which is similar to 
HBase. Some [initial 
work|https://github.com/bfemiano/accumulo-hive-storage-manager] has been done 
to support querying an Accumulo table from Hive already. It is not a complete 
solution; most notably, the current implementation lacks support for INSERTs.

I would like to polish up the AccumuloStorageHandler (presently based on 0.10), 
implement the missing basic functionality and compare it to the HBaseStorageHandler 
(to ensure that we follow the same general usage patterns).

I've also been in communication with [~bfem] (the initial author) who expressed 
interest in working on this again. I hope to coordinate efforts with him.
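A rough sketch of the intended usage, mirroring the HBaseStorageHandler (the 
handler class name and mapping property below are hypothetical until the 
integration lands):

{code:sql}
CREATE TABLE accumulo_demo (rowid STRING, name STRING)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES ('accumulo.columns.mapping' = ':rowid,cf:name');
{code}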



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7050) Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE

2014-05-15 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7050:
-

Attachment: HIVE-7050.3.patch

Addressed [~xuefuz]'s review comments and left a reply in RB. RB is flaky 
right now; I will update the patch in RB later.

 Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE
 -

 Key: HIVE-7050
 URL: https://issues.apache.org/jira/browse/HIVE-7050
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-7050.1.patch, HIVE-7050.2.patch, HIVE-7050.3.patch


 There is currently no way to display the column-level stats from the Hive CLI. 
 It would be good to show them in DESCRIBE EXTENDED/FORMATTED TABLE.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7049) Unable to deserialize AVRO data when file schema and record schema are different and nullable

2014-05-15 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998371#comment-13998371
 ] 

Mohammad Kamrul Islam commented on HIVE-7049:
-

Thanks @xzhang.
{quote}
However, the fix in your patch seems to have a problem with decimal, which may 
need more deliberation.
{quote}
What is the (potential) problem with decimal?
Any proposal on how to address the decimal problem?



 Unable to deserialize AVRO data when file schema and record schema are 
 different and nullable
 -

 Key: HIVE-7049
 URL: https://issues.apache.org/jira/browse/HIVE-7049
 Project: Hive
  Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-7049.1.patch


 It mainly happens when:
 1) file schema and record schema are not the same
 2) record schema is nullable but file schema is not.
 The potential code location is in class AvroDeserializer:
  
 {noformat}
  if(AvroSerdeUtils.isNullableType(recordSchema)) {
   return deserializeNullableUnion(datum, fileSchema, recordSchema, 
 columnType);
 }
 {noformat}
 In the above code snippet, recordSchema is checked for nullability, but 
 the file schema is not checked.
 I tested with these values:
 {noformat}
 recordSchema= [null,string]
 fileSchema= string
 {noformat}
 And I got the following exception (line numbers might not be the same due to 
 my debugged code version):
 {noformat}
 org.apache.avro.AvroRuntimeException: Not a union: string 
 at org.apache.avro.Schema.getTypes(Schema.java:272)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserializeNullableUnion(AvroDeserializer.java:275)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.worker(AvroDeserializer.java:205)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.workerBase(AvroDeserializer.java:188)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:174)
 at 
 org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.verifyNullableType(TestAvroDeserializer.java:487)
 at 
 org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeNullableTypes(TestAvroDeserializer.java:407)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7063) Optimize for the Top N within a Group use case

2014-05-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998389#comment-13998389
 ] 

Gopal V commented on HIVE-7063:
---

This would be exceptionally useful - I have seen at least two implementations 
of TOPN UDAFs for this.

 Optimize for the Top N within a Group use case
 --

 Key: HIVE-7063
 URL: https://issues.apache.org/jira/browse/HIVE-7063
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani

 It is common to rank within a Group/Partition and then only return the Top N 
 entries within each Group.
 With Streaming mode for Windowing, we should push the post filter on the rank 
 into the Windowing processing as a Limit expression.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6994) parquet-hive createArray strips null elements

2014-05-15 Thread Justin Coffey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Coffey updated HIVE-6994:


Attachment: HIVE-6994.3.patch

The failed tests are unrelated to the patch. Submitting a patch rebased 
against trunk and retested.

[~szehon], new rb link here: https://reviews.apache.org/r/21430/

hope we're good :) 

 parquet-hive createArray strips null elements
 -

 Key: HIVE-6994
 URL: https://issues.apache.org/jira/browse/HIVE-6994
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Justin Coffey
Assignee: Justin Coffey
 Fix For: 0.14.0

 Attachments: HIVE-6994-1.patch, HIVE-6994.2.patch, HIVE-6994.3.patch, 
 HIVE-6994.patch


 The createArray method in ParquetHiveSerDe strips null values from the 
 resultant ArrayWritables.
 Tracked here as well: https://github.com/Parquet/parquet-mr/issues/377
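 An illustrative repro under this bug (table name hypothetical):
 {code:sql}
 CREATE TABLE parquet_arr (vals ARRAY<INT>) STORED AS PARQUET;
 -- an array written as [1, NULL, 3] comes back as [1,3] because createArray
 -- drops the null element
 SELECT vals FROM parquet_arr;
 {code}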



--
This message was sent by Atlassian JIRA
(v6.2#6252)

