[jira] [Updated] (HIVE-2390) Expand support for union types

2014-09-10 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-2390:
-
Labels: TODOC14 uniontype  (was: uniontype)

 Expand support for union types
 --

 Key: HIVE-2390
 URL: https://issues.apache.org/jira/browse/HIVE-2390
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Jakob Homan
Assignee: Suma Shivaprasad
  Labels: TODOC14, uniontype
 Fix For: 0.14.0

 Attachments: HIVE-2390.1.patch, HIVE-2390.patch


 When the union type was introduced, full support for it wasn't provided.  For 
 instance, when working with a union that gets passed to LazyBinarySerde: 
 {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184)
 {noformat}
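
 An illustrative repro (a minimal sketch with a hypothetical table name; any 
 query plan with a shuffle serializes intermediate rows through 
 LazyBinarySerDe):
 {code:sql}
 CREATE TABLE union_test (id int, u UNIONTYPE<int, string>);
 -- the join forces a reduce phase, so rows carrying the union column are
 -- serialized by LazyBinarySerDe and hit the exception quoted above:
 SELECT t1.u FROM union_test t1 JOIN union_test t2 ON (t1.id = t2.id);
 {code}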



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8019) Missing commit from trunk : `export/import statement update`

2014-09-10 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-8019:

Attachment: HIVE-8019.2.patch

HIVE-8019.2.patch - fixes test failures


 Missing commit from trunk : `export/import statement update`
 

 Key: HIVE-8019
 URL: https://issues.apache.org/jira/browse/HIVE-8019
 Project: Hive
  Issue Type: Bug
  Components: Import/Export
Affects Versions: 0.14.0
Reporter: Mohit Sabharwal
Assignee: Thejas M Nair
Priority: Blocker
 Attachments: HIVE-8019.1.patch, HIVE-8019.2.patch


 Noticed that commit 1882de7810fc55a2466dd4cbe74ed67bb41cb667 exists in the 0.13 
 branch, but not in trunk. 
 https://github.com/apache/hive/commit/1882de7810fc55a2466dd4cbe74ed67bb41cb667
 {code}
 (trunk) $ git branch -a --contains 1882de7810fc55a2466dd4cbe74ed67bb41cb667
 remotes/origin/branch-0.13
 {code}
 I looked through some of the changes in this commit and don't see those in 
 trunk.  Nor do I see a commit that reverts these changes in trunk.
 [~thejas], should we port this over to trunk? 
 Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7694) SMB join on tables differing by number of sorted by columns with same join prefix fails

2014-09-10 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-7694:
--
Release Note: SMB join on tables differing by number of sorted by columns 
with same join prefix  (was: I just committed this. Thanks Suma!)

 SMB join on tables differing by number of sorted by columns with same join 
 prefix fails
 ---

 Key: HIVE-7694
 URL: https://issues.apache.org/jira/browse/HIVE-7694
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1
Reporter: Suma Shivaprasad
Assignee: Suma Shivaprasad
 Fix For: 0.14.0

 Attachments: HIVE-7694.1.patch, HIVE-7694.2.patch, HIVE-7694.patch


 For example: if two tables T1, sorted by (a, b, c) and clustered by (a), and 
 T2, sorted by (a) and clustered by (a), are joined, the following exception is 
 seen:
 {noformat}
 14/08/11 09:09:38 ERROR ql.Driver: FAILED: IndexOutOfBoundsException Index: 
 1, Size: 1
 java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
 at java.util.ArrayList.RangeCheck(ArrayList.java:547)
 at java.util.ArrayList.get(ArrayList.java:322)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.checkSortColsAndJoinCols(AbstractSMBJoinProc.java:378)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.isEligibleForBucketSortMergeJoin(AbstractSMBJoinProc.java:352)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.canConvertBucketMapJoinToSMBJoin(AbstractSMBJoinProc.java:119)
 at 
 org.apache.hadoop.hive.ql.optimizer.SortedMergeBucketMapjoinProc.process(SortedMergeBucketMapjoinProc.java:51)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at 
 org.apache.hadoop.hive.ql.optimizer.SortedMergeBucketMapJoinOptimizer.transform(SortedMergeBucketMapJoinOptimizer.java:109)
 at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:146)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9305)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
 at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:64)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393)
 {noformat}
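
 A DDL sketch of that layout (hypothetical names and bucket counts; the SMB 
 conversion setting named in the comment is an assumption):
 {code:sql}
 CREATE TABLE t1 (a int, b int, c int)
   CLUSTERED BY (a) SORTED BY (a, b, c) INTO 4 BUCKETS;
 CREATE TABLE t2 (a int)
   CLUSTERED BY (a) SORTED BY (a) INTO 4 BUCKETS;
 -- with SMB conversion enabled (e.g. hive.auto.convert.sortmerge.join=true),
 -- joining on the shared prefix column trips the IndexOutOfBoundsException:
 SELECT * FROM t1 JOIN t2 ON (t1.a = t2.a);
 {code}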



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7694) SMB join on tables differing by number of sorted by columns with same join prefix fails

2014-09-10 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-7694:
--
  Resolution: Fixed
Release Note: I just committed this. Thanks Suma!
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 SMB join on tables differing by number of sorted by columns with same join 
 prefix fails
 ---

 Key: HIVE-7694
 URL: https://issues.apache.org/jira/browse/HIVE-7694
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1
Reporter: Suma Shivaprasad
Assignee: Suma Shivaprasad
 Fix For: 0.14.0

 Attachments: HIVE-7694.1.patch, HIVE-7694.2.patch, HIVE-7694.patch


 For example: if two tables T1, sorted by (a, b, c) and clustered by (a), and 
 T2, sorted by (a) and clustered by (a), are joined, the following exception is 
 seen:
 {noformat}
 14/08/11 09:09:38 ERROR ql.Driver: FAILED: IndexOutOfBoundsException Index: 
 1, Size: 1
 java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
 at java.util.ArrayList.RangeCheck(ArrayList.java:547)
 at java.util.ArrayList.get(ArrayList.java:322)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.checkSortColsAndJoinCols(AbstractSMBJoinProc.java:378)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.isEligibleForBucketSortMergeJoin(AbstractSMBJoinProc.java:352)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.canConvertBucketMapJoinToSMBJoin(AbstractSMBJoinProc.java:119)
 at 
 org.apache.hadoop.hive.ql.optimizer.SortedMergeBucketMapjoinProc.process(SortedMergeBucketMapjoinProc.java:51)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at 
 org.apache.hadoop.hive.ql.optimizer.SortedMergeBucketMapJoinOptimizer.transform(SortedMergeBucketMapJoinOptimizer.java:109)
 at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:146)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9305)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
 at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:64)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7694) SMB join on tables differing by number of sorted by columns with same join prefix fails

2014-09-10 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128116#comment-14128116
 ] 

Amareshwari Sriramadasu commented on HIVE-7694:
---

I just committed this. Thanks Suma!

 SMB join on tables differing by number of sorted by columns with same join 
 prefix fails
 ---

 Key: HIVE-7694
 URL: https://issues.apache.org/jira/browse/HIVE-7694
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1
Reporter: Suma Shivaprasad
Assignee: Suma Shivaprasad
 Fix For: 0.14.0

 Attachments: HIVE-7694.1.patch, HIVE-7694.2.patch, HIVE-7694.patch


 For example: if two tables T1, sorted by (a, b, c) and clustered by (a), and 
 T2, sorted by (a) and clustered by (a), are joined, the following exception is 
 seen:
 {noformat}
 14/08/11 09:09:38 ERROR ql.Driver: FAILED: IndexOutOfBoundsException Index: 
 1, Size: 1
 java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
 at java.util.ArrayList.RangeCheck(ArrayList.java:547)
 at java.util.ArrayList.get(ArrayList.java:322)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.checkSortColsAndJoinCols(AbstractSMBJoinProc.java:378)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.isEligibleForBucketSortMergeJoin(AbstractSMBJoinProc.java:352)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.canConvertBucketMapJoinToSMBJoin(AbstractSMBJoinProc.java:119)
 at 
 org.apache.hadoop.hive.ql.optimizer.SortedMergeBucketMapjoinProc.process(SortedMergeBucketMapjoinProc.java:51)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at 
 org.apache.hadoop.hive.ql.optimizer.SortedMergeBucketMapJoinOptimizer.transform(SortedMergeBucketMapJoinOptimizer.java:109)
 at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:146)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9305)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
 at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:64)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25468: HIVE-7777: add CSVSerde support

2014-09-10 Thread cheng xu


 On Sept. 9, 2014, 3:07 p.m., Brock Noland wrote:
  serde/pom.xml, line 73
  https://reviews.apache.org/r/25468/diff/1/?file=683466#file683466line73
 
  These should only be indented by two spaces, not four. Have you tried 
  submitting an MR job on a cluster with this patch? The reason I ask is that 
  I think the serde must be in here:
  
  https://github.com/apache/hive/blob/trunk/ql/pom.xml#L563
  
  for it to be available to MR jobs.

I don't think the class needs to be added separately, because 
org.apache.hive:hive-serde is already included. BTW, I ran a test with the 
following steps: 

(1) create a table with the csv format:
create table csv_table(a string, b string)
  row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES(
  "separatorChar" = ",",
  "quoteChar" = "'",
  "escapeChar" = "\\"
) stored as textfile;

(2) load data by:
load data local inpath '/root/workspace/data' overwrite into table csv_table;

(3) cat /root/workspace/data:
aa,bb
dd,cc

(4) select a from csv_table:
+------+
|  a   |
+------+
| aa   |
| dd   |
+------+

If I am missing anything, please help me figure it out. Thanks!


- cheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25468/#review52723
---


On Sept. 9, 2014, 2:16 a.m., cheng xu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25468/
 ---
 
 (Updated Sept. 9, 2014, 2:16 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-7777
 https://issues.apache.org/jira/browse/HIVE-7777
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-7777: add CSVSerde support
 
 
 Diffs
 -
 
   serde/pom.xml f8bcc830cfb298d739819db8fbaa2f98f221ccf3 
   serde/src/java/org/apache/hadoop/hive/serde2/CSVSerde.java PRE-CREATION 
   serde/src/test/org/apache/hadoop/hive/serde2/TestCSVSerde.java PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25468/diff/
 
 
 Testing
 ---
 
 Unit test
 
 
 Thanks,
 
 cheng xu
 




Re: Remove hive.metastore.metadb.dir from HiveConf.java?

2014-09-10 Thread Lefty Leverenz
Nevermind, it's already been done by HIVE-1879
https://issues.apache.org/jira/browse/HIVE-1879.  Sorry about the spam.

Thanks Lars.

-- Lefty

On Wed, Sep 10, 2014 at 1:35 AM, Lefty Leverenz leftylever...@gmail.com
wrote:

 Lars Francke updated the Metastore Admin doc
 https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-AdditionalConfigurationParameters
 as follows:

 hive.metastore.metadb.dir   The location of filestore metadata base
 directory. (Functionality removed in 0.4.0 with HIVE-143
 https://issues.apache.org/jira/browse/HIVE-143)


 But hive.metastore.metadb.dir still exists in HiveConf.java.  As I'm
 making various other fixes to HiveConf.java in HIVE-6586
 https://issues.apache.org/jira/browse/HIVE-6586, should I remove this
 obsolete parameter?

 -- Lefty



[jira] [Updated] (HIVE-2390) Add UNIONTYPE serialization support to LazyBinarySerDe

2014-09-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2390:
-
Summary: Add UNIONTYPE serialization support to LazyBinarySerDe  (was: 
Expand support for union types)

 Add UNIONTYPE serialization support to LazyBinarySerDe
 --

 Key: HIVE-2390
 URL: https://issues.apache.org/jira/browse/HIVE-2390
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Jakob Homan
Assignee: Suma Shivaprasad
  Labels: TODOC14, uniontype
 Fix For: 0.14.0

 Attachments: HIVE-2390.1.patch, HIVE-2390.patch


 When the union type was introduced, full support for it wasn't provided.  For 
 instance, when working with a union that gets passed to LazyBinarySerde: 
 {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2390) Add UNIONTYPE serialization support to LazyBinarySerDe

2014-09-10 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128124#comment-14128124
 ] 

Carl Steinbach commented on HIVE-2390:
--

I updated the description of this ticket to accurately reflect the change that 
was made in this patch.

My impression is that this patch doesn't really change the situation in Hive 
with respect to UNIONTYPEs -- this feature is still unusable. If I'm wrong 
about this I would appreciate someone setting me straight.


 Add UNIONTYPE serialization support to LazyBinarySerDe
 --

 Key: HIVE-2390
 URL: https://issues.apache.org/jira/browse/HIVE-2390
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Jakob Homan
Assignee: Suma Shivaprasad
  Labels: TODOC14, uniontype
 Fix For: 0.14.0

 Attachments: HIVE-2390.1.patch, HIVE-2390.patch


 When the union type was introduced, full support for it wasn't provided.  For 
 instance, when working with a union that gets passed to LazyBinarySerde: 
 {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7086) TestHiveServer2.testConnection is failing on trunk

2014-09-10 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128134#comment-14128134
 ] 

Vaibhav Gumashta commented on HIVE-7086:


[~ashutoshc] The failed test looks flaky. Does this look good now?

 TestHiveServer2.testConnection is failing on trunk
 --

 Key: HIVE-7086
 URL: https://issues.apache.org/jira/browse/HIVE-7086
 Project: Hive
  Issue Type: Test
  Components: HiveServer2, JDBC
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Vaibhav Gumashta
 Fix For: 0.14.0

 Attachments: HIVE-7086.1.patch, HIVE-7086.2.patch, HIVE-7086.3.patch


 Able to repro locally on a fresh checkout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7892) Thrift Set type not working with Hive

2014-09-10 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128138#comment-14128138
 ] 

Amareshwari Sriramadasu commented on HIVE-7892:
---

Code changes look fine. Can you update the test output for 
convert_enum_to_string.q and upload the patch?

 Thrift Set type not working with Hive
 -

 Key: HIVE-7892
 URL: https://issues.apache.org/jira/browse/HIVE-7892
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Satish Mittal
Assignee: Satish Mittal
 Attachments: HIVE-7892.patch.txt


 Thrift supports List, Map and Struct complex types, which get mapped to the 
 Array, Map and Struct complex types in Hive respectively. However, the Thrift 
 Set type doesn't seem to be working. 
 Here is an example thrift struct:
 {noformat}
 namespace java sample.thrift
 struct setrow {
 1: required set<i32> ids,
 2: required string name,
 }
 {noformat}
 A Hive table is created with ROW FORMAT SERDE 
 'org.apache.hadoop.hive.serde2.thrift.ThriftDeserializer' WITH 
 SERDEPROPERTIES ('serialization.class'='sample.thrift.setrow', 
 'serialization.format'='org.apache.thrift.protocol.TBinaryProtocol').
 Describing the table shows:
 {noformat}
 hive> describe settable; 
 OK
 ids                     struct<>                from deserializer   
 name                    string                  from deserializer
 {noformat}
 Issuing a select query on set column throws SemanticException:
 {noformat}
 hive> select ids from settable;
 FAILED: SemanticException java.lang.IllegalArgumentException: Error: name 
 expected at the position 7 of 'struct<>' but '>' is found.
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25468: HIVE-7777: add CSVSerde support

2014-09-10 Thread cheng xu


 On Sept. 9, 2014, 8:49 a.m., Lars Francke wrote:
  Looks good apart from minor comments.
  
  Maybe add a test for the Serialization part?
  https://issues.apache.org/jira/browse/HIVE-5976 integration might be nice: 
  STORED AS CSV. Unfortunately there's no documentation yet so I'm not sure 
  if it's feasible.

Good point! Why not file a new JIRA ticket for this as future work?


- cheng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25468/#review52688
---


On Sept. 9, 2014, 2:16 a.m., cheng xu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25468/
 ---
 
 (Updated Sept. 9, 2014, 2:16 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-7777
 https://issues.apache.org/jira/browse/HIVE-7777
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-7777: add CSVSerde support
 
 
 Diffs
 -
 
   serde/pom.xml f8bcc830cfb298d739819db8fbaa2f98f221ccf3 
   serde/src/java/org/apache/hadoop/hive/serde2/CSVSerde.java PRE-CREATION 
   serde/src/test/org/apache/hadoop/hive/serde2/TestCSVSerde.java PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25468/diff/
 
 
 Testing
 ---
 
 Unit test
 
 
 Thanks,
 
 cheng xu
 




[jira] [Updated] (HIVE-7777) add CSV support for Serde

2014-09-10 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-7777:
---
Attachment: HIVE-7777.1.patch

 add CSV support for Serde
 -

 Key: HIVE-7777
 URL: https://issues.apache.org/jira/browse/HIVE-7777
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-7777.1.patch, HIVE-7777.patch, csv-serde-master.zip


 There is no official CSV SerDe support in Hive, while there is an open source 
 project on GitHub (https://github.com/ogrodnek/csv-serde). CSV is a very 
 frequently used data format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7935) Support dynamic service discovery for HiveServer2

2014-09-10 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128148#comment-14128148
 ] 

Lefty Leverenz commented on HIVE-7935:
--

+1 for parameter descriptions in HiveConf.java (although I'm surprised to see 
parameter values represented in the form ${hive.param.xyz}).

 Support dynamic service discovery for HiveServer2
 -

 Key: HIVE-7935
 URL: https://issues.apache.org/jira/browse/HIVE-7935
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2, JDBC
Affects Versions: 0.14.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.14.0

 Attachments: HIVE-7935.1.patch, HIVE-7935.2.patch, HIVE-7935.3.patch


 To support Rolling Upgrade / HA, we need a mechanism by which a JDBC client 
 can dynamically resolve a HiveServer2 instance to connect to.
 *High Level Design:* 
 Whether dynamic service discovery is supported can be configured by setting 
 HIVE_SERVER2_SUPPORT_DYNAMIC_SERVICE_DISCOVERY. ZooKeeper is used to 
 support this.
 * When an instance of HiveServer2 comes up, it adds itself as a znode to 
 ZooKeeper under a configurable namespace (HIVE_SERVER2_ZOOKEEPER_NAMESPACE).
 * A JDBC/ODBC client now specifies the ZooKeeper ensemble in its connection 
 string (see the sketch below), instead of pointing to a specific HiveServer2 
 instance. The JDBC driver uses the ZooKeeper ensemble to pick an instance of 
 HiveServer2 to connect to for the entire session.
 * When an instance is removed from ZooKeeper, existing client sessions 
 continue till completion. When the last client session completes, the 
 instance shuts down.
 * All new client connections pick one of the available HiveServer2 URIs from 
 ZooKeeper.
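
 A connection string of roughly this shape is what the design implies (a sketch 
 only; the exact JDBC parameter names are assumptions, not taken from the 
 patch):
 {noformat}
 jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
 {noformat}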



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8022) Recursive root scratch directory creation is not using hdfs umask properly

2014-09-10 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128152#comment-14128152
 ] 

Vaibhav Gumashta commented on HIVE-8022:


Just ran the failed test - it looks like a flaky test which has been failing on 
other precommits. I'll commit this tomorrow.

Thanks for the review [~thejas]!

 Recursive root scratch directory creation is not using hdfs umask properly 
 ---

 Key: HIVE-8022
 URL: https://issues.apache.org/jira/browse/HIVE-8022
 Project: Hive
  Issue Type: Bug
  Components: CLI, HiveServer2
Affects Versions: 0.14.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.14.0

 Attachments: HIVE-8022.1.patch, HIVE-8022.2.patch, HIVE-8022.3.patch


 Changes made in HIVE-6847 removed the helper methods that were added in 
 HIVE-7001 to get around this problem. Since the root scratch dir must be 
 writable by all, its creation should use those methods.
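
 For context, a minimal sketch of the requirement (the property names are the 
 standard Hive/Hadoop ones, not taken from the patch):
 {code:sql}
 -- the root scratch dir must end up world-writable even under a restrictive
 -- fs.permissions.umask-mode (e.g. 027 turns a plain mkdir into 750), so its
 -- creation has to set permissions explicitly rather than rely on the umask:
 SET hive.exec.scratchdir=/tmp/hive;
 {code}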



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8030) NullPointerException on getSchemas

2014-09-10 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128153#comment-14128153
 ] 

Lars Francke commented on HIVE-8030:


I had a typo in my comment from yesterday. I meant that it looks very similar 
to HIVE-2069.

Here's the code as of version 0.13.1: 
https://github.com/apache/hive/blob/release-0.13.1/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveMetaDataResultSet.java#L32

As you can see, there's no ArrayList at line 32, and even if there were, all of 
them are guarded by null checks.

Are you 100% sure you are using Hive 0.13.1?

 NullPointerException on getSchemas
 --

 Key: HIVE-8030
 URL: https://issues.apache.org/jira/browse/HIVE-8030
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema, JDBC
Affects Versions: 0.13.1
 Environment: Linux (Ubuntu 12.04)
Reporter: Shiv Prakash
  Labels: hadoop
 Fix For: 0.13.1


 java.lang.NullPointerException
   at java.util.ArrayList.<init>(ArrayList.java:164)
   at 
 org.apache.hadoop.hive.jdbc.HiveMetaDataResultSet.<init>(HiveMetaDataResultSet.java:32)
   at 
 org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData$3.<init>(HiveDatabaseMetaData.java:482)
   at 
 org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData.getSchemas(HiveDatabaseMetaData.java:481)
   at 
 org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData.getSchemas(HiveDatabaseMetaData.java:476)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.pentaho.hadoop.shim.common.DriverProxyInvocationChain$DatabaseMetaDataInvocationHandler.invoke(DriverProxyInvocationChain.java:368)
   at com.sun.proxy.$Proxy20.getSchemas(Unknown Source)
   at org.pentaho.di.core.database.Database.getSchemas(Database.java:3857)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog.getSchemaNames(TableOutputDialog.java:1036)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog.access$2400(TableOutputDialog.java:94)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog$24.widgetSelected(TableOutputDialog.java:863)
   at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
   at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
   at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog.open(TableOutputDialog.java:884)
   at 
 org.pentaho.di.ui.spoon.delegates.SpoonStepsDelegate.editStep(SpoonStepsDelegate.java:124)
   at org.pentaho.di.ui.spoon.Spoon.editStep(Spoon.java:8648)
   at 
 org.pentaho.di.ui.spoon.trans.TransGraph.editStep(TransGraph.java:3020)
   at 
 org.pentaho.di.ui.spoon.trans.TransGraph.mouseDoubleClick(TransGraph.java:737)
   at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
   at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
   at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
   at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1297)
   at org.pentaho.di.ui.spoon.Spoon.waitForDispose(Spoon.java:7801)
   at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:9130)
   at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:638)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.pentaho.commons.launcher.Launcher.main(Launcher.java:151)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-6747) TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync is failing

2014-09-10 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta resolved HIVE-6747.

Resolution: Duplicate

Duplicate.

 TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync is failing
 ---

 Key: HIVE-6747
 URL: https://issues.apache.org/jira/browse/HIVE-6747
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-952) Support analytic NTILE function

2014-09-10 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-952:

Fix Version/s: 0.11.0

 Support analytic NTILE function
 ---

 Key: HIVE-952
 URL: https://issues.apache.org/jira/browse/HIVE-952
 Project: Hive
  Issue Type: New Feature
  Components: OLAP, Query Processor, UDF
Reporter: Carl Steinbach
 Fix For: 0.11.0


 The NTILE function divides a set of ordered rows into equally sized buckets 
 and assigns a bucket number to each row.
 Useful for calculating tertiles, quartiles, quintiles, etc.
 Example:
 {code:sql}
 SELECT last_name, salary,
 NTILE(4) OVER (ORDER BY salary DESC) AS quartile
 FROM employees
 WHERE department_id = 100;
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7892) Thrift Set type not working with Hive

2014-09-10 Thread Satish Mittal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Mittal updated HIVE-7892:

Attachment: HIVE-7892.patch.1.txt

Attaching the updated patch. The test convert_enum_to_string.q works with the 
existing MegaStruct thrift table, which contains set columns with the older 
description. Fixed the description.

 Thrift Set type not working with Hive
 -

 Key: HIVE-7892
 URL: https://issues.apache.org/jira/browse/HIVE-7892
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Satish Mittal
Assignee: Satish Mittal
 Attachments: HIVE-7892.patch.1.txt, HIVE-7892.patch.txt


 Thrift supports List, Map and Struct complex types, which get mapped to the 
 Array, Map and Struct complex types in Hive respectively. However, the Thrift 
 Set type doesn't seem to be working. 
 Here is an example thrift struct:
 {noformat}
 namespace java sample.thrift
 struct setrow {
 1: required set<i32> ids,
 2: required string name,
 }
 {noformat}
 A Hive table is created with ROW FORMAT SERDE 
 'org.apache.hadoop.hive.serde2.thrift.ThriftDeserializer' WITH 
 SERDEPROPERTIES ('serialization.class'='sample.thrift.setrow', 
 'serialization.format'='org.apache.thrift.protocol.TBinaryProtocol').
 Describing the table shows:
 {noformat}
 hive> describe settable; 
 OK
 ids                     struct<>                from deserializer   
 name                    string                  from deserializer
 {noformat}
 Issuing a select query on set column throws SemanticException:
 {noformat}
 hive> select ids from settable;
 FAILED: SemanticException java.lang.IllegalArgumentException: Error: name 
 expected at the position 7 of 'struct<>' but '>' is found.
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25473: Thrift Set type not working with Hive

2014-09-10 Thread Satish Mittal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25473/
---

(Updated Sept. 10, 2014, 7:03 a.m.)


Review request for hive, Amareshwari Sriramadasu, Ashutosh Chauhan, and Navis 
Ryu.


Changes
---

The test convert_enum_to_string.q works with the existing MegaStruct thrift 
table, which contains set columns with the older description. Fixed the column 
description in the updated patch.


Bugs: HIVE-7892
https://issues.apache.org/jira/browse/HIVE-7892


Repository: hive-git


Description
---

Thrift supports List, Map and Struct complex types, which get mapped to the 
Array, Map and Struct complex types in Hive respectively. However, the Thrift 
Set type doesn't get mapped to any Hive type, and hence doesn't work with the 
ThriftDeserializer serde.


Diffs (updated)
-

  ql/src/test/results/beelinepositive/convert_enum_to_string.q.out 24acdcd 
  ql/src/test/results/clientpositive/convert_enum_to_string.q.out a1ef04f 
  serde/if/test/complex.thrift 308b64c 
  
serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/SetIntString.java
 PRE-CREATION 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java
 9a226b3 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardListObjectInspector.java
 6eb8803 
  
serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestThriftObjectInspectors.java
 5f692fb 

Diff: https://reviews.apache.org/r/25473/diff/


Testing
---

1) Added Unit test along with the fix.
2) Manually tested by creating a table with ThriftDeserializer serde and having 
thrift set columns:
   a) described the table
   b) issued query to select the set column


Thanks,

Satish Mittal



Re: Timeline for release of Hive 0.14

2014-09-10 Thread Satish Mittal
Hi,
Can you please include HIVE-7892 (Thrift Set type not working with Hive) as
well? It is under code review.

Regards,
Satish


On Tue, Sep 9, 2014 at 2:10 PM, Suma Shivaprasad 
sumasai.shivapra...@gmail.com wrote:

 Please include https://issues.apache.org/jira/browse/HIVE-7694  as well.
 It
 is currently under review by Amareshwari and should be done in the next
 couple of days.

 Thanks
 Suma


 On Mon, Sep 8, 2014 at 5:44 PM, Alan Gates ga...@hortonworks.com wrote:

  I'll review that.  I just need the time to test it against mysql, oracle,
  and hopefully sqlserver.  But I think we can do this post branch if we
 need
  to, as it's a bug fix rather than a feature.
 
  Alan.
 
Damien Carol dca...@blitzbs.com
   September 8, 2014 at 3:19
   Same request for https://issues.apache.org/jira/browse/HIVE-7689
 
  I already provided a patch, re-based it many times and I'm waiting for a
  review.
 
  Regards,
 
  Le 08/09/2014 12:08, amareshwarisr . a écrit :
 
amareshwarisr . amareshw...@gmail.com
   September 8, 2014 at 3:08
  Would like to include https://issues.apache.org/jira/browse/HIVE-2390
 and
  https://issues.apache.org/jira/browse/HIVE-7936 .
 
  I can review and merge them.
 
  Thanks
  Amareshwari
 
 
 
Vikram Dixit vik...@hortonworks.com
   September 5, 2014 at 17:53
  Hi Folks,
 
  I am going to start consolidating the items mentioned in this list and
  create a wiki page to track it. I will wait till the end of next week to
  create the branch taking into account Ashutosh's request.
 
  Thanks
  Vikram.
 
 
  On Fri, Sep 5, 2014 at 5:39 PM, Ashutosh Chauhan hashut...@apache.org
  hashut...@apache.org
 
Ashutosh Chauhan hashut...@apache.org
   September 5, 2014 at 17:39
  Vikram,
 
  Some of us are working on stabilizing cbo branch and trying to get it
  merged into trunk. We feel we are close. May I request to defer cutting
 the
  branch for few more days? Folks interested in this can track our progress
  here : https://issues.apache.org/jira/browse/HIVE-7946
 
  Thanks,
  Ashutosh
 
 
  On Fri, Aug 22, 2014 at 4:09 PM, Lars Francke lars.fran...@gmail.com
  lars.fran...@gmail.com
 
Lars Francke lars.fran...@gmail.com
   August 22, 2014 at 16:09
  Thank you for volunteering to do the release. I think a 0.14 release is a
  good idea.
 
  I have a couple of issues I'd like to get in too:
 
  * Either HIVE-7107[0] (Fix an issue in the HiveServer1 JDBC driver) or
  HIVE-6977[1] (Delete HiveServer1). The former needs a review, the latter a
  patch
  * HIVE-6123[2] Checkstyle in Maven needs a review
 
  HIVE-7622[3]  HIVE-7543[4] are waiting for any reviews or comments on my
  previous thread[5]. I'd still appreciate any helpers for reviews or even
  just comments. I'd feel very sad if I had done all that work for nothing.
  Hoping this thread gives me a wider audience. Both patches fix up issues
  that should have been caught in earlier reviews as they are almost all
  Checkstyle or other style violations but they make for huge patches. I
  could also create hundreds of small issues or stop doing these things
  entirely
 
 
 
  [0] https://issues.apache.org/jira/browse/HIVE-7107 
  https://issues.apache.org/jira/browse/HIVE-7107 
  [1] https://issues.apache.org/jira/browse/HIVE-6977 
  https://issues.apache.org/jira/browse/HIVE-6977 
  [2] https://issues.apache.org/jira/browse/HIVE-6123 
  https://issues.apache.org/jira/browse/HIVE-6123 
  [3] https://issues.apache.org/jira/browse/HIVE-7622 
  https://issues.apache.org/jira/browse/HIVE-7622 
  [4] https://issues.apache.org/jira/browse/HIVE-7543 
  https://issues.apache.org/jira/browse/HIVE-7543 
 
  On Fri, Aug 22, 2014 at 11:01 PM, John Pullokkaran 
 
 
 



[jira] [Commented] (HIVE-7950) StorageHandler resources aren't added to Tez Session if Session is already Open

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128178#comment-14128178
 ] 

Hive QA commented on HIVE-7950:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667581/HIVE-7950.2.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6194 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/719/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/719/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-719/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667581

 StorageHandler resources aren't added to Tez Session if Session is already 
 Open
 ---

 Key: HIVE-7950
 URL: https://issues.apache.org/jira/browse/HIVE-7950
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler, Tez
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 0.14.0

 Attachments: HIVE-7950-1.diff, HIVE-7950.2.patch, 
 hive-7950-tez-WIP.diff


 Was trying to run some queries using the AccumuloStorageHandler when using 
 the Tez execution engine. It turned out that classes which were added to 
 tmpjars weren't making it into the container. When a Tez Session is already 
 open, as is the normal case when simply using the `hive` command, the 
 resources aren't added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 23352: Support non-constant expressions for MAP type indices.

2014-09-10 Thread Lars Francke

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23352/#review52831
---


I'm really looking forward to this. Thanks for working on it!


ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
https://reviews.apache.org/r/23352/#comment91998

I suggest stating here that only Integers are supported.

Currently only integers are supported for array indexes

or something like that


- Lars Francke


On July 9, 2014, 6:57 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/23352/
 ---
 
 (Updated July 9, 2014, 6:57 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-7325
 https://issues.apache.org/jira/browse/HIVE-7325
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Here is my sample:
 {code}
 CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) 
 STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
 WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") 
 TBLPROPERTIES ("hbase.table.name" = "RECORD"); 
 
 
 CREATE TABLE KEY_RECORD(KeyValue String, RecordId map<string,string>) 
 STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
 WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") 
 TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD"); 
 {code}
 The following join statement doesn't work. 
 {code}
 SELECT a.*, b.* from KEY_RECORD a join RECORD b 
 WHERE a.RecordId[b.RecordID] is not null;
 {code}
 FAILED: SemanticException 2:16 Non-constant expression for map indexes not 
 supported. Error encountered near token 'RecordID' 
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 9889cfe 
   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 
 e44f5ae 
   ql/src/test/queries/clientpositive/array_map_access_nonconstant.q 
 PRE-CREATION 
   ql/src/test/queries/negative/invalid_list_index.q c40f079 
   ql/src/test/queries/negative/invalid_list_index2.q 99d0b3d 
   ql/src/test/queries/negative/invalid_map_index2.q 5828f07 
   ql/src/test/results/clientpositive/array_map_access_nonconstant.q.out 
 PRE-CREATION 
   ql/src/test/results/compiler/errors/invalid_list_index.q.out a4179cd 
   ql/src/test/results/compiler/errors/invalid_list_index2.q.out aaa9455 
   ql/src/test/results/compiler/errors/invalid_map_index2.q.out edc9bda 
   
 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java
  5ccacf1 
 
 Diff: https://reviews.apache.org/r/23352/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Navis Ryu
 




[jira] [Updated] (HIVE-7704) Create tez task for fast file merging

2014-09-10 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7704:
-
Attachment: HIVE-7704.9.patch

Hopefully this will fix the test case.

 Create tez task for fast file merging
 -

 Key: HIVE-7704
 URL: https://issues.apache.org/jira/browse/HIVE-7704
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-7704.1.patch, HIVE-7704.2.patch, HIVE-7704.3.patch, 
 HIVE-7704.4.patch, HIVE-7704.4.patch, HIVE-7704.5.patch, HIVE-7704.6.patch, 
 HIVE-7704.7.patch, HIVE-7704.8.patch, HIVE-7704.9.patch


 Currently Tez falls back to an MR task for the merge file task. It will be 
 beneficial to convert the merge file tasks to Tez tasks to make use of the 
 performance gains from Tez. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-10 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7405:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Matt!

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
 Fix For: 0.14.0

 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, 
 HIVE-7405.995.patch, HIVE-7405.996.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6550) SemanticAnalyzer.reset() doesn't clear all the state

2014-09-10 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6550:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Sergey!

 SemanticAnalyzer.reset() doesn't clear all the state
 

 Key: HIVE-6550
 URL: https://issues.apache.org/jira/browse/HIVE-6550
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0, 0.13.1
Reporter: Laljo John Pullokkaran
Assignee: Sergey Shelukhin
 Fix For: 0.14.0

 Attachments: HIVE-6550.01.patch, HIVE-6550.02.patch, 
 HIVE-6550.03.patch, HIVE-6550.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6550) SemanticAnalyzer.reset() doesn't clear all the state

2014-09-10 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6550:
---
Component/s: Query Processor

 SemanticAnalyzer.reset() doesn't clear all the state
 

 Key: HIVE-6550
 URL: https://issues.apache.org/jira/browse/HIVE-6550
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0, 0.13.0, 0.13.1
Reporter: Laljo John Pullokkaran
Assignee: Sergey Shelukhin
 Fix For: 0.14.0

 Attachments: HIVE-6550.01.patch, HIVE-6550.02.patch, 
 HIVE-6550.03.patch, HIVE-6550.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7689) Enable Postgres as METASTORE back-end

2014-09-10 Thread Damien Carol (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128192#comment-14128192
 ] 

Damien Carol commented on HIVE-7689:


Tests errors are not related to this patch.

 Enable Postgres as METASTORE back-end
 -

 Key: HIVE-7689
 URL: https://issues.apache.org/jira/browse/HIVE-7689
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor
  Labels: metastore, postgres
 Fix For: 0.14.0

 Attachments: HIVE-7689.5.patch, HIVE-7689.6.patch, HIVE-7689.7.patch, 
 HIVE-7889.1.patch, HIVE-7889.2.patch, HIVE-7889.3.patch, HIVE-7889.4.patch


 I maintain a few patches to make the Metastore work with a Postgres back end 
 in our production environment.
 The main goal of this JIRA is to push these patches upstream.
 This patch enables LOCKS and COMPACTION and fixes an error in STATS on a 
 Postgres metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-10 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-7405:
-
Labels: TODOC14  (was: )

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, 
 HIVE-7405.995.patch, HIVE-7405.996.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7818) Support boolean PPD for ORC

2014-09-10 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128194#comment-14128194
 ] 

Prasanth J commented on HIVE-7818:
--

Tested this patch with a small dataset. It works fine. +1

 Support boolean PPD for ORC
 ---

 Key: HIVE-7818
 URL: https://issues.apache.org/jira/browse/HIVE-7818
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.14.0

 Attachments: HIVE-7818.1.patch


 Currently ORC does collect stats for boolean fields. However, the boolean 
 stats are not range based; instead, ORC collects the count of true records. 
 RecordReaderImpl.evaluatePredicate currently only deals with range-based 
 stats; we need to improve it to deal with the boolean stats.
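
 For illustration, the kind of predicate this lets ORC answer from the boolean 
 stats (hypothetical table and column; PPD assumed enabled via the standard 
 flag):
 {code:sql}
 SET hive.optimize.index.filter=true;  -- enable predicate pushdown for ORC
 -- a row group whose boolean stats report zero true values can be skipped
 -- entirely for this predicate:
 SELECT count(*) FROM events WHERE is_valid = true;
 {code}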



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25468: HIVE-7777: add CSVSerde support

2014-09-10 Thread Lars Francke

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25468/#review52832
---



serde/src/java/org/apache/hadoop/hive/serde2/OpenCSVSerde.java
https://reviews.apache.org/r/25468/#comment91999

Thanks for moving these out. Could you make them static?



serde/src/java/org/apache/hadoop/hive/serde2/OpenCSVSerde.java
https://reviews.apache.org/r/25468/#comment92000

no need to wrap this



serde/src/java/org/apache/hadoop/hive/serde2/OpenCSVSerde.java
https://reviews.apache.org/r/25468/#comment92001

missing spaces



serde/src/test/org/apache/hadoop/hive/serde2/TestOpenCSVSerde.java
https://reviews.apache.org/r/25468/#comment92003

There are a couple more puts in this file that can be replaced with 
setProperty


- Lars Francke


On Sept. 9, 2014, 2:16 a.m., cheng xu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25468/
 ---
 
 (Updated Sept. 9, 2014, 2:16 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-7777
 https://issues.apache.org/jira/browse/HIVE-7777
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-7777: add CSVSerde support
 
 
 Diffs
 -
 
   pom.xml 8973c2b52d0797d1f34859951de7349f7e5b996f 
   serde/pom.xml f8bcc830cfb298d739819db8fbaa2f98f221ccf3 
   serde/src/java/org/apache/hadoop/hive/serde2/OpenCSVSerde.java PRE-CREATION 
   serde/src/test/org/apache/hadoop/hive/serde2/TestOpenCSVSerde.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25468/diff/
 
 
 Testing
 ---
 
 Unit test
 
 
 Thanks,
 
 cheng xu
 




Re: Review Request 25468: HIVE-7777: add CSVSerde support

2014-09-10 Thread Lars Francke


 On Sept. 9, 2014, 8:49 a.m., Lars Francke wrote:
  serde/src/java/org/apache/hadoop/hive/serde2/CSVSerde.java, line 151
  https://reviews.apache.org/r/25468/diff/1/?file=683467#file683467line151
 
  I don't quite get this comment. Looking at the two CSVReader 
  constructors they seem to do the same in this case. From how I understand 
  it this if-statement is not needed. Same for the newWriter method.
  
  Maybe I'm missing something?
 
 cheng xu wrote:
 The CSVParser will check whether the separator, quotechar or escape 
 is the same. If so, it will throw an exception. For this reason, we have to 
 replace with CSVParser.DEFAULT_ESCAPE_CHARACTER ('\') if the escape is 
 DEFAULT_ESCAPE_CHARACTER ('"').

Ahh! I see now. That's a bit weird indeed. Thanks for the explanation.


- Lars


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25468/#review52688
---


On Sept. 9, 2014, 2:16 a.m., cheng xu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25468/
 ---
 
 (Updated Sept. 9, 2014, 2:16 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-7777
 https://issues.apache.org/jira/browse/HIVE-7777
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-7777: add CSVSerde support
 
 
 Diffs
 -
 
   pom.xml 8973c2b52d0797d1f34859951de7349f7e5b996f 
   serde/pom.xml f8bcc830cfb298d739819db8fbaa2f98f221ccf3 
   serde/src/java/org/apache/hadoop/hive/serde2/OpenCSVSerde.java PRE-CREATION 
   serde/src/test/org/apache/hadoop/hive/serde2/TestOpenCSVSerde.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25468/diff/
 
 
 Testing
 ---
 
 Unit test
 
 
 Thanks,
 
 cheng xu
 




[jira] [Commented] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-09-10 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128198#comment-14128198
 ] 

Lefty Leverenz commented on HIVE-7405:
--

Doc note:  This adds configuration parameter 
*hive.vectorized.execution.reduce.enabled* to HiveConf.java, so it needs to be 
documented in the wiki (a usage sketch follows the link):

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]
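
A one-line usage sketch (assuming the parameter is a plain boolean toggle):

{code:sql}
set hive.vectorized.execution.reduce.enabled=true;
{code}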

 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
 --

 Key: HIVE-7405
 URL: https://issues.apache.org/jira/browse/HIVE-7405
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
 HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
 HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
 HIVE-7405.93.patch, HIVE-7405.94.patch, HIVE-7405.95.patch, 
 HIVE-7405.96.patch, HIVE-7405.97.patch, HIVE-7405.98.patch, 
 HIVE-7405.99.patch, HIVE-7405.991.patch, HIVE-7405.994.patch, 
 HIVE-7405.995.patch, HIVE-7405.996.patch


 Vectorize the basic case that does not have any count distinct aggregation.
 Add a 4th processing mode in VectorGroupByOperator for reduce where each 
 input VectorizedRowBatch has only values for one key at a time.  Thus, the 
 values in the batch can be aggregated quickly.
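 A conceptual sketch of that 4th mode (illustrative only, not the actual 
 VectorGroupByOperator code): because a batch carries exactly one key, the 
 whole batch folds into a single set of aggregation buffers.
 {code}
 // Assumed names for illustration; see VectorGroupByOperator for the
 // real implementation.
 for (int i = 0; i < aggregators.length; i++) {
   aggregators[i].aggregateInput(aggregationBuffers[i], batch);
 }
 {code}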



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7946) CBO: Merge CBO changes to Trunk

2014-09-10 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128202#comment-14128202
 ] 

Lars Francke commented on HIVE-7946:


I'll try to look at the code issues in the next few days.

 CBO: Merge CBO changes to Trunk
 ---

 Key: HIVE-7946
 URL: https://issues.apache.org/jira/browse/HIVE-7946
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-7946.1.patch, HIVE-7946.2.patch, HIVE-7946.3.patch, 
 HIVE-7946.4.patch, HIVE-7946.5.patch, HIVE-7946.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25178: Add DROP TABLE PURGE

2014-09-10 Thread Lefty Leverenz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25178/#review52834
---



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
https://reviews.apache.org/r/25178/#comment92005

typo:  falser should be false


- Lefty Leverenz


On Sept. 9, 2014, 6:51 p.m., david seraf wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25178/
 ---
 
 (Updated Sept. 9, 2014, 6:51 p.m.)
 
 
 Review request for hive and Xuefu Zhang.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add PURGE option to DROP TABLE command to skip saving table data to the trash
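 
 A minimal usage example of the new syntax (the table name is illustrative):
 
     DROP TABLE page_views PURGE;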
 
 
 Diffs
 -
 
   
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitionPublish.java
  be7134f 
   
 hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/tool/TestTempletonUtils.java
  af952f2 
   
 itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2.java
  da51a55 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 9489949 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
 a94a7a3 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreFsImpl.java 
 cff0718 
   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
 cbdba30 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreFS.java 
 a141793 
   metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 613b709 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java e387b8f 
   
 ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
  4cf98d8 
   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
 f31a409 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 32db0c7 
   ql/src/java/org/apache/hadoop/hive/ql/plan/DropTableDesc.java ba30e1f 
   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 406aae9 
   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveRemote.java 1a5ba87 
   ql/src/test/queries/clientpositive/drop_table_purge.q PRE-CREATION 
   ql/src/test/results/clientpositive/drop_table_purge.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25178/diff/
 
 
 Testing
 ---
 
 Added a code test and a QL test. Tests passed in CI, but other, unrelated 
 tests failed.
 
 
 Thanks,
 
 david seraf
 




[jira] [Updated] (HIVE-6147) Support avro data stored in HBase columns

2014-09-10 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-6147:
-
Labels: TODOC14  (was: )

 Support avro data stored in HBase columns
 -

 Key: HIVE-6147
 URL: https://issues.apache.org/jira/browse/HIVE-6147
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 0.12.0, 0.13.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt, 
 HIVE-6147.3.patch.txt, HIVE-6147.3.patch.txt, HIVE-6147.4.patch.txt, 
 HIVE-6147.5.patch.txt, HIVE-6147.6.patch.txt


 Presently, the HBase Hive integration supports querying only primitive data 
 types in columns. It would be nice to be able to store and query Avro objects 
 in HBase columns by making them visible as structs to Hive. This will allow 
 Hive to perform ad hoc analysis of HBase data which can be deeply structured.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns

2014-09-10 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128218#comment-14128218
 ] 

Lefty Leverenz commented on HIVE-6147:
--

Doc question:  Will this be documented in the HBase Integration design doc or 
the Avro SerDe doc, or a new doc?  (The HBase doc has a list of open issues, 
but this one isn't on the list.)

* [HBase Integration -- Open Issues (JIRA) | 
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-OpenIssues(JIRA)]
* [Avro SerDe | https://cwiki.apache.org/confluence/display/Hive/AvroSerDe]

 Support avro data stored in HBase columns
 -

 Key: HIVE-6147
 URL: https://issues.apache.org/jira/browse/HIVE-6147
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 0.12.0, 0.13.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt, 
 HIVE-6147.3.patch.txt, HIVE-6147.3.patch.txt, HIVE-6147.4.patch.txt, 
 HIVE-6147.5.patch.txt, HIVE-6147.6.patch.txt


 Presently, the HBase Hive integration supports querying only primitive data 
 types in columns. It would be nice to be able to store and query Avro objects 
 in HBase columns by making them visible as structs to Hive. This will allow 
 Hive to perform ad hoc analysis of HBase data which can be deeply structured.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7776) enable sample10.q.[Spark Branch]

2014-09-10 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7776:

Attachment: HIVE-7776.1-spark.patch

Hive gets the task ID in two ways in Utilities::getTaskId:
# get the value of the mapred.task.id parameter from the configuration.
# generate a random value when #1 returns null.
Currently, Hive on Spark can't get the value of mapred.task.id from the 
configuration.

FileSinkOperator uses the task ID to distinguish bucket file names. It should 
hold the task ID as a field variable and initialize it only once, since one 
FileSinkOperator instance is only referenced by one task. Instead, 
FileSinkOperator calls Utilities::getTaskId to get a new task ID each time; 
this creates more bucket files than the bucket count, which leads to 
unexpected results for tablesample queries.
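
As a rough sketch of the fix described above (a minimal illustration under 
assumed names, not the actual patch; imports omitted for brevity):
{code}
// Cache the task ID once per operator instance instead of regenerating
// it on every call to Utilities.getTaskId().
public class FileSinkOperator extends TerminalOperator<FileSinkDesc> {
  private transient String taskId;

  @Override
  protected void initializeOp(Configuration hconf) throws HiveException {
    taskId = Utilities.getTaskId(hconf); // resolved once, reused for all buckets
  }
}
{code}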

 enable sample10.q.[Spark Branch]
 

 Key: HIVE-7776
 URL: https://issues.apache.org/jira/browse/HIVE-7776
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
 Attachments: HIVE-7776.1-spark.patch


 sample10.q contains a dynamic partition operation; this qtest should be 
 enabled after Hive on Spark supports dynamic partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7776) enable sample10.q.[Spark Branch]

2014-09-10 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7776:

Status: Patch Available  (was: Open)

 enable sample10.q.[Spark Branch]
 

 Key: HIVE-7776
 URL: https://issues.apache.org/jira/browse/HIVE-7776
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
 Attachments: HIVE-7776.1-spark.patch


 sample10.q contains a dynamic partition operation; this qtest should be 
 enabled after Hive on Spark supports dynamic partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 25495: HIVE-7776, enable sample10.q

2014-09-10 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25495/
---

Review request for hive, Brock Noland and Xuefu Zhang.


Bugs: HIVE-7776
https://issues.apache.org/jira/browse/HIVE-7776


Repository: hive-git


Description
---

Hive gets the task ID in two ways in Utilities::getTaskId:
1. get the value of the mapred.task.id parameter from the configuration.
2. generate a random value when #1 returns null.
Currently, Hive on Spark can't get the value of mapred.task.id from the 
configuration.
FileSinkOperator uses the task ID to distinguish bucket file names. It should 
hold the task ID as a field variable and initialize it only once, since one 
FileSinkOperator instance is only referenced by one task. Instead, 
FileSinkOperator calls Utilities::getTaskId to get a new task ID each time; 
this creates more bucket files than the bucket count, which leads to 
unexpected results for tablesample queries.


Diffs
-

  itests/src/test/resources/testconfiguration.properties 155abad 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 3ff0782 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 02f9d99 
  ql/src/test/results/clientpositive/spark/sample10.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/25495/diff/


Testing
---


Thanks,

chengxiang li



[jira] [Commented] (HIVE-8035) Add SORT_QUERY_RESULTS for test that doesn't guarantee order

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128233#comment-14128233
 ] 

Hive QA commented on HIVE-8035:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667610/HIVE-8035.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6193 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/721/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/721/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-721/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667610

 Add SORT_QUERY_RESULTS for test that doesn't guarantee order
 

 Key: HIVE-8035
 URL: https://issues.apache.org/jira/browse/HIVE-8035
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Rui Li
Assignee: Rui Li
Priority: Minor
 Attachments: HIVE-8035.patch


 Some test queries don't guarantee output order, e.g. group by, union all. 
 Therefore we should add {{-- SORT_QUERY_RESULTS}} to those qfiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8035) Add SORT_QUERY_RESULTS for test that doesn't guarantee order

2014-09-10 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128236#comment-14128236
 ] 

Rui Li commented on HIVE-8035:
--

I noticed there's an age-1 failure:
{code}
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
{code}
But I'm not sure if it's related to the patch.
cc [~xuefuz], [~brocknoland]

 Add SORT_QUERY_RESULTS for test that doesn't guarantee order
 

 Key: HIVE-8035
 URL: https://issues.apache.org/jira/browse/HIVE-8035
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Rui Li
Assignee: Rui Li
Priority: Minor
 Attachments: HIVE-8035.patch


 Some test queries don't guarantee output order, e.g. group by, union all. 
 Therefore we should add {{-- SORT_QUERY_RESULTS}} to those qfiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7627) FSStatsPublisher does not fit into Spark multi-thread task mode [Spark Branch]

2014-09-10 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7627:

Status: Patch Available  (was: Open)

 FSStatsPublisher does not fit into Spark multi-thread task mode [Spark Branch]
 -

 Key: HIVE-7627
 URL: https://issues.apache.org/jira/browse/HIVE-7627
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: spark-m1
 Attachments: HIVE-7627.1-spark.patch, HIVE-7627.2-spark.patch


 Hive table statistics collection failed in FSStatsPublisher mode, with the 
 following exception on the Spark executor side:
 {noformat}
 14/08/05 16:46:24 WARN hdfs.DFSClient: DataStreamer Exception
 java.io.FileNotFoundException: ID mismatch. Request id and saved id: 20277 , 
 20278 for file 
 /tmp/hive-root/8833d172-1edd-4508-86db-fdd7a1b0af17/hive_2014-08-05_16-46-03_013_6279446857294757772-1/-ext-1/tmpstats-0
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:53)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2754)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2662)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
 at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1442)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): ID 
 mismatch. Request id and saved id: 20277 , 20278 for file 
 /tmp/hive-root/8833d172-1edd-4508-86db-fdd7a1b0af17/hive_2014-08-05_16-46-03_013_6279446857294757772-1/-ext-1/tmpstats-0
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:53)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2754)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2662)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
 at java.security.AccessController.doPrivileged(Native 

[jira] [Updated] (HIVE-7627) FSStatsPublisher does not fit into Spark multi-thread task mode [Spark Branch]

2014-09-10 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7627:

Attachment: HIVE-7627.2-spark.patch

Making taskId a field variable of FSStatsPublisher would also resolve this 
issue. Since SPARK-2895 is still under review, we could enable randomly 
generated task IDs first.

 FSStatsPublisher does not fit into Spark multi-thread task mode [Spark Branch]
 -

 Key: HIVE-7627
 URL: https://issues.apache.org/jira/browse/HIVE-7627
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: spark-m1
 Attachments: HIVE-7627.1-spark.patch, HIVE-7627.2-spark.patch


 Hive table statistics collection failed in FSStatsPublisher mode, with the 
 following exception on the Spark executor side:
 {noformat}
 14/08/05 16:46:24 WARN hdfs.DFSClient: DataStreamer Exception
 java.io.FileNotFoundException: ID mismatch. Request id and saved id: 20277 , 
 20278 for file 
 /tmp/hive-root/8833d172-1edd-4508-86db-fdd7a1b0af17/hive_2014-08-05_16-46-03_013_6279446857294757772-1/-ext-1/tmpstats-0
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:53)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2754)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2662)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
 at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1442)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): ID 
 mismatch. Request id and saved id: 20277 , 20278 for file 
 /tmp/hive-root/8833d172-1edd-4508-86db-fdd7a1b0af17/hive_2014-08-05_16-46-03_013_6279446857294757772-1/-ext-1/tmpstats-0
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:53)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2754)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2662)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at 

[jira] [Commented] (HIVE-8035) Add SORT_QUERY_RESULTS for test that doesn't guarantee order

2014-09-10 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128249#comment-14128249
 ] 

Rui Li commented on HIVE-8035:
--

I also have one concern: some qfiles contain both queries with and without a 
guaranteed order. For example, in {{limit_pushdown.q}} we have both:
{{select key,value from src order by key desc limit 20;}}
and
{{select value, sum(key + 1) as sum from src group by value limit 20;}}

If we add {{-- SORT_QUERY_RESULTS}}, the generated results can differ from 
the expected ones, e.g. for an {{order by desc}} query.
Do you think this is OK?
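
For reference, the directive is a header comment in the qfile and applies to 
the output of every query in that file, which is exactly why mixing sorted and 
unsorted queries is a concern, e.g.:
{code}
-- SORT_QUERY_RESULTS

select value, sum(key + 1) as sum from src group by value limit 20;
{code}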

 Add SORT_QUERY_RESULTS for test that doesn't guarantee order
 

 Key: HIVE-8035
 URL: https://issues.apache.org/jira/browse/HIVE-8035
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Rui Li
Assignee: Rui Li
Priority: Minor
 Attachments: HIVE-8035.patch


 Some test queries don't guarantee output order, e.g. group by, union all. 
 Therefore we should add {{-- SORT_QUERY_RESULTS}} to those qfiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 25497: HIVE-7627, FSStatsPublisher does fit into Spark multi-thread task mode

2014-09-10 Thread chengxiang li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25497/
---

Review request for hive, Brock Noland and Xuefu Zhang.


Bugs: HIVE-7627
https://issues.apache.org/jira/browse/HIVE-7627


Repository: hive-git


Description
---

Making taskId a field variable of FSStatsPublisher would also resolve this 
issue. Since SPARK-2895 is still under review, we could enable randomly 
generated task IDs first.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java cb010fb 

Diff: https://reviews.apache.org/r/25497/diff/


Testing
---


Thanks,

chengxiang li



[jira] [Commented] (HIVE-7776) enable sample10.q.[Spark Branch]

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128293#comment-14128293
 ] 

Hive QA commented on HIVE-7776:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667632/HIVE-7776.1-spark.patch

{color:red}ERROR:{color} -1 due to 161 failed/errored test(s), 6344 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explode_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby1_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby1_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby1_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby1_noskew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_noskew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_noskew_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby4_noskew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby5_noskew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby6_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby6_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby6_noskew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_map_multi_single_reducer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_noskew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_noskew_multi_single_reducer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_noskew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_rc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input41
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join35
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_leftsemijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4

[jira] [Commented] (HIVE-8019) Missing commit from trunk : `export/import statement update`

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128309#comment-14128309
 ] 

Hive QA commented on HIVE-8019:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667613/HIVE-8019.2.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6195 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/722/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/722/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-722/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667613

 Missing commit from trunk : `export/import statement update`
 

 Key: HIVE-8019
 URL: https://issues.apache.org/jira/browse/HIVE-8019
 Project: Hive
  Issue Type: Bug
  Components: Import/Export
Affects Versions: 0.14.0
Reporter: Mohit Sabharwal
Assignee: Thejas M Nair
Priority: Blocker
 Attachments: HIVE-8019.1.patch, HIVE-8019.2.patch


 Noticed that commit 1882de7810fc55a2466dd4cbe74ed67bb41cb667 exists in the 0.13 
 branch, but not in trunk. 
 https://github.com/apache/hive/commit/1882de7810fc55a2466dd4cbe74ed67bb41cb667
 {code}
 (trunk) $ git branch -a --contains 1882de7810fc55a2466dd4cbe74ed67bb41cb667
 remotes/origin/branch-0.13
 {code}
 I looked through some of the changes in this commit and don't see those in 
 trunk.  Nor do I see a commit that reverts these changes in trunk.
 [~thejas], should we port this over to trunk ? 
 Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25492: HIVE-7936 - Thrift Union support

2014-09-10 Thread Amareshwari Sriramadasu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25492/#review52844
---



serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ThriftObjectInspectorUtils.java
https://reviews.apache.org/r/25492/#comment92014

Remove this method if not used


- Amareshwari Sriramadasu


On Sept. 10, 2014, 5:27 a.m., Suma Shivaprasad wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25492/
 ---
 
 (Updated Sept. 10, 2014, 5:27 a.m.)
 
 
 Review request for hive, Amareshwari Sriramadasu and Ashutosh Chauhan.
 
 
 Bugs: HIVE-7936
 https://issues.apache.org/jira/browse/HIVE-7936
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 ThriftDeserializer currently does not support UNION types
 
 
 Diffs
 -
 
   contrib/src/test/results/clientpositive/udf_example_arraymapstruct.q.out 
 e876cdd 
   data/files/complex.seq c27d5c09b1da881d8fd6fb2aaa1f5d169d1de3ae 
   ql/src/test/queries/clientpositive/input_lazyserde.q 69c0d04 
   ql/src/test/results/clientnegative/describe_xpath1.q.out d81c96e 
   ql/src/test/results/clientnegative/describe_xpath2.q.out 2bd0f06 
   ql/src/test/results/clientpositive/case_sensitivity.q.out 8684557 
   ql/src/test/results/clientpositive/columnarserde_create_shortcut.q.out 
 4805836 
   ql/src/test/results/clientpositive/input17.q.out 8fff21b 
   ql/src/test/results/clientpositive/input5.q.out 7524ca7 
   ql/src/test/results/clientpositive/input_columnarserde.q.out 13cfb7f 
   ql/src/test/results/clientpositive/input_dynamicserde.q.out ebcf1d8 
   ql/src/test/results/clientpositive/input_lazyserde.q.out 0f685f2 
   ql/src/test/results/clientpositive/input_testxpath.q.out 3f4b96e 
   ql/src/test/results/clientpositive/input_testxpath2.q.out af1e999 
   ql/src/test/results/clientpositive/input_testxpath3.q.out b31b2f3 
   ql/src/test/results/clientpositive/input_testxpath4.q.out 3dca8bf 
   ql/src/test/results/clientpositive/inputddl8.q.out fc13356 
   ql/src/test/results/clientpositive/join_thrift.q.out e1588c5 
   ql/src/test/results/clientpositive/udf_case_thrift.q.out 0fc8e84 
   ql/src/test/results/clientpositive/udf_coalesce.q.out 0d32476 
   ql/src/test/results/clientpositive/udf_isnull_isnotnull.q.out 1f600b4 
   ql/src/test/results/clientpositive/udf_size.q.out d7a4fa2 
   ql/src/test/results/clientpositive/union21.q.out 0e47ff4 
   serde/if/test/complex.thrift 308b64c 
   serde/src/gen/thrift/gen-cpp/complex_types.h 17991d4 
   serde/src/gen/thrift/gen-cpp/complex_types.cpp 9526d3d 
   
 serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/Complex.java
  e36a792 
   
 serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/PropValueUnion.java
  PRE-CREATION 
   serde/src/gen/thrift/gen-py/complex/ttypes.py 7283e4c 
   serde/src/gen/thrift/gen-rb/complex_types.rb 5527096 
   
 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java
  9a226b3 
   
 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ReflectionStructObjectInspector.java
  ee5b0d0 
   
 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ThriftObjectInspectorUtils.java
  PRE-CREATION 
   
 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ThriftUnionObjectInspector.java
  PRE-CREATION 
   
 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorUtils.java
  a18f4a7 
   
 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestThriftObjectInspectors.java
  5f692fb 
   
 serde/src/test/org/apache/hadoop/hive/serde2/thrift_test/CreateSequenceFile.java
  7269cd0 
 
 Diff: https://reviews.apache.org/r/25492/diff/
 
 
 Testing
 ---
 
 input_lazyserde.q
 
 
 Thanks,
 
 Suma Shivaprasad
 




[jira] [Commented] (HIVE-7936) Support for handling Thrift Union types

2014-09-10 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128313#comment-14128313
 ] 

Amareshwari Sriramadasu commented on HIVE-7936:
---

The code changes look fine. I put a few comments on the review board. 
Since the patch involves a binary file change, I think Jenkins won't be able to 
apply the patch. Can you run the tests on a local machine and update the results 
here?
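
For example, something like this (module and test name are assumptions based 
on the diff, not prescribed):
{code}
mvn test -Phadoop-2 -Dtest=TestThriftObjectInspectors -pl serde
{code}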

 Support for handling Thrift Union types 
 

 Key: HIVE-7936
 URL: https://issues.apache.org/jira/browse/HIVE-7936
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.1
Reporter: Suma Shivaprasad
Assignee: Suma Shivaprasad
 Fix For: 0.14.0

 Attachments: HIVE-7936.1.patch, HIVE-7936.patch, complex.seq


 Currently Hive does not support Thrift unions through ThriftDeserializer. 
 Support for them needs to be added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7627) FSStatsPublisher does not fit into Spark multi-thread task mode [Spark Branch]

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128338#comment-14128338
 ] 

Hive QA commented on HIVE-7627:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667637/HIVE-7627.2-spark.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6343 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/123/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/123/console
Test logs: 
http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-123/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667637

 FSStatsPublisher does not fit into Spark multi-thread task mode [Spark Branch]
 -

 Key: HIVE-7627
 URL: https://issues.apache.org/jira/browse/HIVE-7627
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: spark-m1
 Attachments: HIVE-7627.1-spark.patch, HIVE-7627.2-spark.patch


 Hive table statistics collection failed in FSStatsPublisher mode, with the 
 following exception on the Spark executor side:
 {noformat}
 14/08/05 16:46:24 WARN hdfs.DFSClient: DataStreamer Exception
 java.io.FileNotFoundException: ID mismatch. Request id and saved id: 20277 , 
 20278 for file 
 /tmp/hive-root/8833d172-1edd-4508-86db-fdd7a1b0af17/hive_2014-08-05_16-46-03_013_6279446857294757772-1/-ext-1/tmpstats-0
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:53)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2754)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2662)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
 at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1442)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): ID 
 mismatch. Request id and saved id: 20277 , 20278 for file 
 

[jira] [Commented] (HIVE-7777) add CSV support for Serde

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128381#comment-14128381
 ] 

Hive QA commented on HIVE-7777:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667616/HIVE-7777.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6196 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/723/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/723/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-723/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667616

 add CSV support for Serde
 -

 Key: HIVE-7777
 URL: https://issues.apache.org/jira/browse/HIVE-7777
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-7777.1.patch, HIVE-7777.patch, csv-serde-master.zip


 There is no official CSV SerDe support in Hive, although there is an open 
 source project on GitHub (https://github.com/ogrodnek/csv-serde). CSV is a 
 very commonly used data format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8030) NullPointerException on getSchemas

2014-09-10 Thread Shiv Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128403#comment-14128403
 ] 

Shiv Prakash commented on HIVE-8030:


Yes, I'm using hive-0.13.1.

 NullPointerException on getSchemas
 --

 Key: HIVE-8030
 URL: https://issues.apache.org/jira/browse/HIVE-8030
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema, JDBC
Affects Versions: 0.13.1
 Environment: Linux (Ubuntu 12.04)
Reporter: Shiv Prakash
  Labels: hadoop
 Fix For: 0.13.1


 java.lang.NullPointerException
  at java.util.ArrayList.<init>(ArrayList.java:164)
  at 
 org.apache.hadoop.hive.jdbc.HiveMetaDataResultSet.<init>(HiveMetaDataResultSet.java:32)
  at 
 org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData$3.<init>(HiveDatabaseMetaData.java:482)
   at 
 org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData.getSchemas(HiveDatabaseMetaData.java:481)
   at 
 org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData.getSchemas(HiveDatabaseMetaData.java:476)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.pentaho.hadoop.shim.common.DriverProxyInvocationChain$DatabaseMetaDataInvocationHandler.invoke(DriverProxyInvocationChain.java:368)
   at com.sun.proxy.$Proxy20.getSchemas(Unknown Source)
   at org.pentaho.di.core.database.Database.getSchemas(Database.java:3857)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog.getSchemaNames(TableOutputDialog.java:1036)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog.access$2400(TableOutputDialog.java:94)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog$24.widgetSelected(TableOutputDialog.java:863)
   at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
   at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
   at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog.open(TableOutputDialog.java:884)
   at 
 org.pentaho.di.ui.spoon.delegates.SpoonStepsDelegate.editStep(SpoonStepsDelegate.java:124)
   at org.pentaho.di.ui.spoon.Spoon.editStep(Spoon.java:8648)
   at 
 org.pentaho.di.ui.spoon.trans.TransGraph.editStep(TransGraph.java:3020)
   at 
 org.pentaho.di.ui.spoon.trans.TransGraph.mouseDoubleClick(TransGraph.java:737)
   at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
   at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
   at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
   at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1297)
   at org.pentaho.di.ui.spoon.Spoon.waitForDispose(Spoon.java:7801)
   at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:9130)
   at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:638)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.pentaho.commons.launcher.Launcher.main(Launcher.java:151)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7689) Enable Postgres as METASTORE back-end

2014-09-10 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128404#comment-14128404
 ] 

Alan Gates commented on HIVE-7689:
--

Are you 100% certain?  The TestCompactor uses the transaction tables in the 
metastore.

 Enable Postgres as METASTORE back-end
 -

 Key: HIVE-7689
 URL: https://issues.apache.org/jira/browse/HIVE-7689
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor
  Labels: metastore, postgres
 Fix For: 0.14.0

 Attachments: HIVE-7689.5.patch, HIVE-7689.6.patch, HIVE-7689.7.patch, 
 HIVE-7889.1.patch, HIVE-7889.2.patch, HIVE-7889.3.patch, HIVE-7889.4.patch


 I maintain a few patches to make the Metastore work with a Postgres back end 
 in our production environment.
 The main goal of this JIRA is to push these patches upstream.
 This patch enables LOCKS and COMPACTION and fixes an error in STATS on a 
 Postgres metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8030) NullPointerException on getSchemas

2014-09-10 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128413#comment-14128413
 ] 

Lars Francke commented on HIVE-8030:


I'm sorry but those numbers don't match up. See for yourself in the link above 
what's going on in line 32 in release 0.13.1.

It instead matches up perfectly with what was available in version 0.7.1 and 
before: 
https://github.com/apache/hive/blob/release-0.7.1/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveMetaDataResultSet.java

The other line numbers don't match up either. Could you paste more information, 
for example about your classpath? I'm relatively certain that somehow you are 
not using a vanilla 0.13.1 release. Maybe Pentaho messes something up.

 NullPointerException on getSchemas
 --

 Key: HIVE-8030
 URL: https://issues.apache.org/jira/browse/HIVE-8030
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema, JDBC
Affects Versions: 0.13.1
 Environment: Linux (Ubuntu 12.04)
Reporter: Shiv Prakash
  Labels: hadoop
 Fix For: 0.13.1


 java.lang.NullPointerException
  at java.util.ArrayList.<init>(ArrayList.java:164)
  at 
 org.apache.hadoop.hive.jdbc.HiveMetaDataResultSet.<init>(HiveMetaDataResultSet.java:32)
  at 
 org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData$3.<init>(HiveDatabaseMetaData.java:482)
   at 
 org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData.getSchemas(HiveDatabaseMetaData.java:481)
   at 
 org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData.getSchemas(HiveDatabaseMetaData.java:476)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.pentaho.hadoop.shim.common.DriverProxyInvocationChain$DatabaseMetaDataInvocationHandler.invoke(DriverProxyInvocationChain.java:368)
   at com.sun.proxy.$Proxy20.getSchemas(Unknown Source)
   at org.pentaho.di.core.database.Database.getSchemas(Database.java:3857)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog.getSchemaNames(TableOutputDialog.java:1036)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog.access$2400(TableOutputDialog.java:94)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog$24.widgetSelected(TableOutputDialog.java:863)
   at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
   at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
   at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
   at 
 org.pentaho.di.ui.trans.steps.tableoutput.TableOutputDialog.open(TableOutputDialog.java:884)
   at 
 org.pentaho.di.ui.spoon.delegates.SpoonStepsDelegate.editStep(SpoonStepsDelegate.java:124)
   at org.pentaho.di.ui.spoon.Spoon.editStep(Spoon.java:8648)
   at 
 org.pentaho.di.ui.spoon.trans.TransGraph.editStep(TransGraph.java:3020)
   at 
 org.pentaho.di.ui.spoon.trans.TransGraph.mouseDoubleClick(TransGraph.java:737)
   at org.eclipse.swt.widgets.TypedListener.handleEvent(Unknown Source)
   at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
   at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
   at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
   at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1297)
   at org.pentaho.di.ui.spoon.Spoon.waitForDispose(Spoon.java:7801)
   at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:9130)
   at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:638)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.pentaho.commons.launcher.Launcher.main(Launcher.java:151)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7689) Enable Postgres as METASTORE back-end

2014-09-10 Thread Damien Carol (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128427#comment-14128427
 ] 

Damien Carol commented on HIVE-7689:


Using this command with the latest trunk and this patch applied:
{code}
mvn -B -o test -Phadoop-2 -Dtest=TestCompactor
{code}


 Enable Postgres as METASTORE back-end
 -

 Key: HIVE-7689
 URL: https://issues.apache.org/jira/browse/HIVE-7689
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor
  Labels: metastore, postgres
 Fix For: 0.14.0

 Attachments: HIVE-7689.5.patch, HIVE-7689.6.patch, HIVE-7689.7.patch, 
 HIVE-7889.1.patch, HIVE-7889.2.patch, HIVE-7889.3.patch, HIVE-7889.4.patch


 I maintain a few patches to make the Metastore work with a Postgres back end 
 in our production environment.
 The main goal of this JIRA is to push these patches upstream.
 This patch enables LOCKS and COMPACTION and fixes an error in STATS on a 
 Postgres metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7627) FSStatsPublisher does not fit into Spark multi-thread task mode [Spark Branch]

2014-09-10 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7627:

Status: Open  (was: Patch Available)

 FSStatsPublisher does not fit into Spark multi-thread task mode [Spark Branch]
 -

 Key: HIVE-7627
 URL: https://issues.apache.org/jira/browse/HIVE-7627
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: spark-m1
 Attachments: HIVE-7627.1-spark.patch, HIVE-7627.2-spark.patch


 Hive table statistics collection failed in FSStatsPublisher mode, with the 
 following exception on the Spark executor side:
 {noformat}
 14/08/05 16:46:24 WARN hdfs.DFSClient: DataStreamer Exception
 java.io.FileNotFoundException: ID mismatch. Request id and saved id: 20277 , 
 20278 for file 
 /tmp/hive-root/8833d172-1edd-4508-86db-fdd7a1b0af17/hive_2014-08-05_16-46-03_013_6279446857294757772-1/-ext-1/tmpstats-0
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:53)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2754)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2662)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
 at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1442)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): ID 
 mismatch. Request id and saved id: 20277 , 20278 for file 
 /tmp/hive-root/8833d172-1edd-4508-86db-fdd7a1b0af17/hive_2014-08-05_16-46-03_013_6279446857294757772-1/-ext-1/tmpstats-0
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeId.checkId(INodeId.java:53)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2952)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2754)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2662)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
 at java.security.AccessController.doPrivileged(Native 

[jira] [Commented] (HIVE-2149) Fix ant target generate-schema

2014-09-10 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128448#comment-14128448
 ] 

Lars Francke commented on HIVE-2149:


Okay, I'll reopen the issue but won't work on it.

The problem is that the DataNucleus Maven Plugin requires a database connection 
even to create the Schema. That means we need to provide a profile or some 
other way to get a driver on the classpath.

 Fix ant target generate-schema 
 ---

 Key: HIVE-2149
 URL: https://issues.apache.org/jira/browse/HIVE-2149
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Priority: Minor

 Running generate-schema target in metastore dir results in
 generate-schema:
  [java] Exception in thread main java.lang.NoClassDefFoundError: 
 org/jpox/SchemaTool



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-2149) Provide a way to generate an SQL file with the Metastore schema

2014-09-10 Thread Lars Francke (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Francke updated HIVE-2149:
---
Summary: Provide a way to generate an SQL file with the Metastore schema  
(was: Fix ant target generate-schema )

 Provide a way to generate an SQL file with the Metastore schema
 ---

 Key: HIVE-2149
 URL: https://issues.apache.org/jira/browse/HIVE-2149
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Priority: Minor

 Running generate-schema target in metastore dir results in
 generate-schema:
  [java] Exception in thread main java.lang.NoClassDefFoundError: 
 org/jpox/SchemaTool



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2149) Provide a way to generate an SQL file with the Metastore schema

2014-09-10 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128449#comment-14128449
 ] 

Lars Francke commented on HIVE-2149:


Turns out I can't reopen issues. Can you?

 Provide a way to generate an SQL file with the Metastore schema
 ---

 Key: HIVE-2149
 URL: https://issues.apache.org/jira/browse/HIVE-2149
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Priority: Minor

 Running generate-schema target in metastore dir results in
 generate-schema:
  [java] Exception in thread main java.lang.NoClassDefFoundError: 
 org/jpox/SchemaTool



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7776) enable sample10.q.[Spark Branch]

2014-09-10 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7776:

Status: Open  (was: Patch Available)

 enable sample10.q.[Spark Branch]
 

 Key: HIVE-7776
 URL: https://issues.apache.org/jira/browse/HIVE-7776
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
 Attachments: HIVE-7776.1-spark.patch


 sample10.q contains a dynamic partition operation; this qtest should be
 enabled after Hive on Spark supports dynamic partitioning.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7704) Create tez task for fast file merging

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128454#comment-14128454
 ] 

Hive QA commented on HIVE-7704:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667624/HIVE-7704.9.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6201 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_split_elimination
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/724/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/724/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-724/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667624

 Create tez task for fast file merging
 -

 Key: HIVE-7704
 URL: https://issues.apache.org/jira/browse/HIVE-7704
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-7704.1.patch, HIVE-7704.2.patch, HIVE-7704.3.patch, 
 HIVE-7704.4.patch, HIVE-7704.4.patch, HIVE-7704.5.patch, HIVE-7704.6.patch, 
 HIVE-7704.7.patch, HIVE-7704.8.patch, HIVE-7704.9.patch


 Currently Tez falls back to an MR task for the merge-file task. It would be
 beneficial to convert merge-file tasks to Tez tasks to make use of the
 performance gains from Tez.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7689) Enable Postgres as METASTORE back-end

2014-09-10 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-7689:
---
Attachment: HIVE-7689.8.patch

Added modifications to the {{prepDB}} and {{cleanDb}} methods of {{TxnDbUtil}}.

 Enable Postgres as METASTORE back-end
 -

 Key: HIVE-7689
 URL: https://issues.apache.org/jira/browse/HIVE-7689
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor
  Labels: metastore, postgres
 Fix For: 0.14.0

 Attachments: HIVE-7689.5.patch, HIVE-7689.6.patch, HIVE-7689.7.patch, 
 HIVE-7689.8.patch, HIVE-7889.1.patch, HIVE-7889.2.patch, HIVE-7889.3.patch, 
 HIVE-7889.4.patch


 I maintain a few patches to make the Metastore work with a Postgres back end
 in our production environment.
 The main goal of this JIRA is to push these patches upstream.
 This patch enables LOCKS and COMPACTION and fixes an error in STATS on the
 Postgres metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2390) Add UNIONTYPE serialization support to LazyBinarySerDe

2014-09-10 Thread Suma Shivaprasad (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128486#comment-14128486
 ] 

Suma Shivaprasad commented on HIVE-2390:


Carl,

I am working on a related feature to support UNIONTYPE in the ThriftDeserializer
as well.
Since I am a fairly new contributor to Hive and am not aware of the existing
issues in the UNIONTYPE feature, if someone could identify the missing pieces and
raise JIRAs, I can take a stab at it.


 Add UNIONTYPE serialization support to LazyBinarySerDe
 --

 Key: HIVE-2390
 URL: https://issues.apache.org/jira/browse/HIVE-2390
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Jakob Homan
Assignee: Suma Shivaprasad
  Labels: TODOC14, uniontype
 Fix For: 0.14.0

 Attachments: HIVE-2390.1.patch, HIVE-2390.patch


 When the union type was introduced, full support for it wasn't provided.  For 
 instance, when working with a union that gets passed to LazyBinarySerde: 
 {noformat}Caused by: java.lang.RuntimeException: Unrecognized type: UNION
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:468)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serializeStruct(LazyBinarySerDe.java:230)
   at 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe.serialize(LazyBinarySerDe.java:184)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8017) Use HiveKey instead of BytesWritable as key type of the pair RDD [Spark Branch]

2014-09-10 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128501#comment-14128501
 ] 

Brock Noland commented on HIVE-8017:


No that's an infra issue. The test framework uses ec2 which can at times be 
flaky. 

 Use HiveKey instead of BytesWritable as key type of the pair RDD [Spark 
 Branch]
 ---

 Key: HIVE-8017
 URL: https://issues.apache.org/jira/browse/HIVE-8017
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
 Attachments: HIVE-8017-spark.patch, HIVE-8017.2-spark.patch, 
 HIVE-8017.3-spark.patch


 HiveKey should be used as the key type because it holds the hash code for 
 partitioning. While BytesWritable serves partitioning well for simple cases, 
 we have to use {{HiveKey.hashCode}} for more complicated ones, e.g. join, 
 bucketed table, etc.
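
 A minimal sketch of the idea, assuming Spark's Java-facing Partitioner API
 (the class below is invented for illustration and is not the committed patch):
 {code:java}
import org.apache.hadoop.hive.ql.io.HiveKey;
import org.apache.spark.Partitioner;

// Hypothetical sketch, not the committed patch: partitions by the hash code
// Hive computed when it built the key (e.g. over join/bucketing columns),
// which BytesWritable's byte-content hashCode() would not reproduce.
public class HiveKeyPartitioner extends Partitioner {
  private final int numPartitions;

  public HiveKeyPartitioner(int numPartitions) {
    this.numPartitions = numPartitions;
  }

  @Override
  public int numPartitions() {
    return numPartitions;
  }

  @Override
  public int getPartition(Object key) {
    // HiveKey.hashCode() returns the precomputed hash, so rows land in the
    // partitions Hive's plan expects (the mask keeps the index non-negative).
    return (((HiveKey) key).hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}
 {code}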



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8035) Add SORT_QUERY_RESULTS for test that doesn't guarantee order

2014-09-10 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128507#comment-14128507
 ] 

Brock Noland commented on HIVE-8035:


+1

The test {{TestOrcHCatLoader.testReadDataPrimitiveTypes}} seems to be quite 
flaky.

With regard to {{limit_pushdown.q}}, I think what matters in that test is that
the limit is pushed down, as shown in the explain plan. It's not a correctness
test for order by.

 Add SORT_QUERY_RESULTS for test that doesn't guarantee order
 

 Key: HIVE-8035
 URL: https://issues.apache.org/jira/browse/HIVE-8035
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Rui Li
Assignee: Rui Li
Priority: Minor
 Attachments: HIVE-8035.patch


 Some test queries don't guarantee output order, e.g. group by and union all.
 Therefore we should add {{-- SORT_QUERY_RESULTS}} to those qfiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7689) Enable Postgres as METASTORE back-end

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128565#comment-14128565
 ] 

Hive QA commented on HIVE-7689:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667660/HIVE-7689.8.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6193 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/725/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/725/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-725/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667660

 Enable Postgres as METASTORE back-end
 -

 Key: HIVE-7689
 URL: https://issues.apache.org/jira/browse/HIVE-7689
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor
  Labels: metastore, postgres
 Fix For: 0.14.0

 Attachments: HIVE-7689.5.patch, HIVE-7689.6.patch, HIVE-7689.7.patch, 
 HIVE-7689.8.patch, HIVE-7889.1.patch, HIVE-7889.2.patch, HIVE-7889.3.patch, 
 HIVE-7889.4.patch


 I maintain a few patches to make the Metastore work with a Postgres back end
 in our production environment.
 The main goal of this JIRA is to push these patches upstream.
 This patch enables LOCKS and COMPACTION and fixes an error in STATS on the
 Postgres metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7812) Disable CombineHiveInputFormat when ACID format is used

2014-09-10 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-7812:

Attachment: HIVE-7812.patch

Updated per Ashutosh's comments.

 Disable CombineHiveInputFormat when ACID format is used
 ---

 Key: HIVE-7812
 URL: https://issues.apache.org/jira/browse/HIVE-7812
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-7812.patch, HIVE-7812.patch, HIVE-7812.patch


 Currently CombineHiveInputFormat complains when called on an ACID
 directory. Modify CombineHiveInputFormat so that HiveInputFormat is used
 instead if the directory is in ACID format.
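
 As an illustration only, a hypothetical helper that detects the ACID layout;
 this is not Hive's API, just the shape of the check:
 {code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper, not the committed patch: ACID tables lay out data in
// base_N / delta_x_y subdirectories whose files must not be combined, so a
// directory that matches this layout should be read via HiveInputFormat.
public final class AcidLayoutCheck {
  public static boolean looksLikeAcidDir(FileSystem fs, Path dir) throws IOException {
    for (FileStatus stat : fs.listStatus(dir)) {
      String name = stat.getPath().getName();
      if (stat.isDirectory() && (name.startsWith("base_") || name.startsWith("delta_"))) {
        return true;  // found ACID base/delta layout: do not combine splits
      }
    }
    return false;     // plain directory: combining is safe
  }
}
 {code}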



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Debugging Hive frontend in eclipse

2014-09-10 Thread Saumitra Shahapure
Hello,

I am new to Hive dev community,

I am trying to debug Hive frontend (till semantic analysis) from eclipse. I
want to start from Main in CliDriver. I don't want to go debugging till
execution and don't care if it fails.

As described in
https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide#DeveloperGuide-DebuggingHiveCode
, I am able to debug TestCliDriver from eclipse and several tests pass
showing that metastore is working fine.

The problem is that, when I start debugging from CliDriver, the metastore is
not initialized properly, so semantic analysis fails at the getMetadata call.

Is any additional setup required to get metadata work properly from eclipse
debugging?

-- Saumitra S. Shahapure


[jira] [Updated] (HIVE-8034) Don't add colon when no port is specified

2014-09-10 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8034:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

 Don't add colon when no port is specified
 -

 Key: HIVE-8034
 URL: https://issues.apache.org/jira/browse/HIVE-8034
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 0.14.0

 Attachments: HIVE-8034.1.patch


 In HIVE-4910 we added a {{:}} even when there was no port, due to HADOOP-9776.
 Now that this is fixed, I think we should fix ours as well.
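
 A minimal sketch of the intended behavior (illustrative, not the actual patch):
 {code:java}
// Illustrative sketch of the fix, not the committed code: append ":port"
// only when a port is actually present, so "host" no longer becomes "host:".
public final class AuthorityUtil {
  public static String authority(String host, int port) {
    return port > 0 ? host + ":" + port : host;
  }
}
 {code}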



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-2149) Provide a way to generate an SQL file with the Metastore schema

2014-09-10 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-2149:
---
Component/s: Metastore

 Provide a way to generate an SQL file with the Metastore schema
 ---

 Key: HIVE-2149
 URL: https://issues.apache.org/jira/browse/HIVE-2149
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Ashutosh Chauhan
Priority: Minor

 Running generate-schema target in metastore dir results in
 generate-schema:
  [java] Exception in thread main java.lang.NoClassDefFoundError: 
 org/jpox/SchemaTool



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-2149) Provide a way to generate an SQL file with the Metastore schema

2014-09-10 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reopened HIVE-2149:


 Provide a way to generate an SQL file with the Metastore schema
 ---

 Key: HIVE-2149
 URL: https://issues.apache.org/jira/browse/HIVE-2149
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Ashutosh Chauhan
Priority: Minor

 Running generate-schema target in metastore dir results in
 generate-schema:
  [java] Exception in thread main java.lang.NoClassDefFoundError: 
 org/jpox/SchemaTool



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7936) Support for handling Thrift Union types

2014-09-10 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated HIVE-7936:
---
Attachment: HIVE-7936.2.patch

Fixed output mismatches in the parsing test cases

 Support for handling Thrift Union types 
 

 Key: HIVE-7936
 URL: https://issues.apache.org/jira/browse/HIVE-7936
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.1
Reporter: Suma Shivaprasad
Assignee: Suma Shivaprasad
 Fix For: 0.14.0

 Attachments: HIVE-7936.1.patch, HIVE-7936.2.patch, HIVE-7936.patch, 
 complex.seq


 Currently Hive does not support Thrift unions through the ThriftDeserializer.
 We need to add support for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25468: HIVE-7777: add CSVSerde support

2014-09-10 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25468/#review52883
---


This looks great! I think we are almost ready to commit. Can you add a new test
(e.g. ql/src/test/queries/clientpositive/serde_csv.q) which runs a couple of
queries? See e.g. ql/src/test/queries/clientpositive/serde_regex.q.

- Brock Noland


On Sept. 9, 2014, 2:16 a.m., cheng xu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25468/
 ---
 
 (Updated Sept. 9, 2014, 2:16 a.m.)
 
 
 Review request for hive.
 
 
  Bugs: HIVE-7777
  https://issues.apache.org/jira/browse/HIVE-7777
 
 
 Repository: hive-git
 
 
 Description
 ---
 
  HIVE-7777: add CSVSerde support
 
 
 Diffs
 -
 
   pom.xml 8973c2b52d0797d1f34859951de7349f7e5b996f 
   serde/pom.xml f8bcc830cfb298d739819db8fbaa2f98f221ccf3 
   serde/src/java/org/apache/hadoop/hive/serde2/OpenCSVSerde.java PRE-CREATION 
   serde/src/test/org/apache/hadoop/hive/serde2/TestOpenCSVSerde.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25468/diff/
 
 
 Testing
 ---
 
 Unit test
 
 
 Thanks,
 
 cheng xu
 




[jira] [Created] (HIVE-8038) Decouple ORC files split calculation logic from Filesystem's get file location implementation

2014-09-10 Thread Pankit Thapar (JIRA)
Pankit Thapar created HIVE-8038:
---

 Summary: Decouple ORC files split calculation logic from 
Filesystem's get file location implementation
 Key: HIVE-8038
 URL: https://issues.apache.org/jira/browse/HIVE-8038
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Pankit Thapar
 Fix For: 0.14.0


What is the Current Logic
==
1.get the file blocks from FileSystem.getFileBlockLocations() which returns an 
array of BlockLocation
2.In SplitGenerator.createSplit(), check if split only spans one block or 
multiple blocks.
3.If split spans just one block, then using the array index (index = 
offset/blockSize), get the corresponding host having the blockLocation
4.If the split spans multiple blocks, then get all hosts that have at least 80% 
of the max of total data in split hosted by any host.
5.add the split to a list of splits

Issue with Current Logic
=
Dependency on FileSystem API’s logic for block location calculations. It 
returns an array and we need to rely on FileSystem to  
make all blocks of same size if we want to directly access a block from the 
array.
 
What is the Fix
=
1a.get the file blocks from FileSystem.getFileBlockLocations() which returns an 
array of BlockLocation
1b.convert the array into a tree map <offset, BlockLocation> and return it
through getLocationsWithOffSet()
2.In SplitGenerator.createSplit(), check if split only spans one block or 
multiple blocks.
3.If split spans just one block, then using Tree.floorEntry(key), get the 
highest entry smaller than offset for the split and get the corresponding host.
4a.If the split spans multiple blocks, get a submap, which contains all entries 
containing blockLocations from the offset to offset + length
4b.get all hosts that have at least 80% of the max of total data in split 
hosted by any host.
5.add the split to a list of splits

What are the major changes in logic
==
1. store BlockLocations in a Map instead of an array
2. Call SHIMS.getLocationsWithOffSet() instead of getLocations()
3. the one-block case is checked by if(offset + length <= start.getOffset() +
start.getLength()) instead of if((offset % blockSize) + length <= blockSize)

What is the effect on Complexity (Big O)
=

1. We add an O(n) loop to build a TreeMap from an array, but it's a one-time cost
and is not paid for each split
2. In the one-block case, we can get the block in O(log n) worst case, which
was O(1) before
3. Getting the submap is O(log n)
4. In the multiple-block case, building the list of hosts is O(m), where m < n,
which was O(n), as previously we iterated over all the block locations but now
we iterate over only the blocks that belong to the range of offsets we need.

What are the benefits of the change
==
1. With this fix, we do not depend on the blockLocations returned by FileSystem 
to figure out the block corresponding to the offset and blockSize
2. Also, block lengths are not necessarily the same for all blocks on all
FileSystems
3. Previously we were using blockSize for the one-block case and block.length
for the multiple-block case; now we determine the block from its actual
length and offset
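
A minimal sketch of the TreeMap-based lookup described above; all class and
method names below are invented for illustration and are not the actual patch:
{code:java}
import java.util.Map;
import java.util.TreeMap;
import org.apache.hadoop.fs.BlockLocation;

// Minimal sketch only, under the assumptions stated above.
public final class BlockLookupSketch {

  /** Index block locations by their starting offset (one-time cost per file). */
  static TreeMap<Long, BlockLocation> indexByOffset(BlockLocation[] blocks) {
    TreeMap<Long, BlockLocation> byOffset = new TreeMap<>();
    for (BlockLocation block : blocks) {
      byOffset.put(block.getOffset(), block);
    }
    return byOffset;
  }

  /** One-block case: greatest entry whose offset is <= the split offset, O(log n). */
  static BlockLocation blockFor(TreeMap<Long, BlockLocation> byOffset, long offset) {
    Map.Entry<Long, BlockLocation> entry = byOffset.floorEntry(offset);
    return entry == null ? null : entry.getValue();
  }

  /** Multi-block case: the submap of blocks overlapping [offset, offset + length). */
  static Iterable<BlockLocation> blocksFor(TreeMap<Long, BlockLocation> byOffset,
                                           long offset, long length) {
    Long first = byOffset.floorKey(offset);
    return byOffset
        .subMap(first == null ? offset : first, true, offset + length, false)
        .values();
  }
}
{code}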



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8038) Decouple ORC files split calculation logic from Filesystem's get file location implementation

2014-09-10 Thread Pankit Thapar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankit Thapar updated HIVE-8038:

Attachment: HIVE-8038.patch

 Decouple ORC files split calculation logic from Filesystem's get file 
 location implementation
 -

 Key: HIVE-8038
 URL: https://issues.apache.org/jira/browse/HIVE-8038
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Pankit Thapar
 Fix For: 0.14.0

 Attachments: HIVE-8038.patch


 What is the Current Logic
 ==
 1.get the file blocks from FileSystem.getFileBlockLocations() which returns 
 an array of BlockLocation
 2.In SplitGenerator.createSplit(), check if split only spans one block or 
 multiple blocks.
 3.If split spans just one block, then using the array index (index = 
 offset/blockSize), get the corresponding host having the blockLocation
 4.If the split spans multiple blocks, then get all hosts that have at least 
 80% of the max of total data in split hosted by any host.
 5.add the split to a list of splits
 Issue with Current Logic
 =
 Dependency on FileSystem API’s logic for block location calculations. It 
 returns an array and we need to rely on FileSystem to  
 make all blocks of same size if we want to directly access a block from the 
 array.
  
 What is the Fix
 =
 1a.get the file blocks from FileSystem.getFileBlockLocations() which returns 
 an array of BlockLocation
 1b.convert the array into a tree map <offset, BlockLocation> and return it
 through getLocationsWithOffSet()
 2.In SplitGenerator.createSplit(), check if split only spans one block or 
 multiple blocks.
 3.If split spans just one block, then using Tree.floorEntry(key), get the 
 highest entry smaller than offset for the split and get the corresponding 
 host.
 4a.If the split spans multiple blocks, get a submap, which contains all 
 entries containing blockLocations from the offset to offset + length
 4b.get all hosts that have at least 80% of the max of total data in split 
 hosted by any host.
 5.add the split to a list of splits
 What are the major changes in logic
 ==
 1. store BlockLocations in a Map instead of an array
 2. Call SHIMS.getLocationsWithOffSet() instead of getLocations()
 3. the one-block case is checked by if(offset + length <= start.getOffset() +
 start.getLength()) instead of if((offset % blockSize) + length <=
 blockSize)
 What is the effect on Complexity (Big O)
 =
 1. We add an O(n) loop to build a TreeMap from an array, but it's a one-time
 cost and is not paid for each split
 2. In the one-block case, we can get the block in O(log n) worst case,
 which was O(1) before
 3. Getting the submap is O(log n)
 4. In the multiple-block case, building the list of hosts is O(m), where m < n,
 which was O(n), as previously we iterated
 over all the block locations but now we iterate over only the blocks
 that belong to the range of offsets we need.
 What are the benefits of the change
 ==
 1. With this fix, we do not depend on the blockLocations returned by 
 FileSystem to figure out the block corresponding to the offset and blockSize
 2. Also, block lengths are not necessarily the same for all blocks on all
 FileSystems
 3. Previously we were using blockSize for the one-block case and block.length
 for the multiple-block case; now we determine the block from its actual
 length and offset



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8038) Decouple ORC files split calculation logic from Filesystem's get file location implementation

2014-09-10 Thread Pankit Thapar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankit Thapar updated HIVE-8038:

Status: Patch Available  (was: Open)

 Decouple ORC files split calculation logic from Filesystem's get file 
 location implementation
 -

 Key: HIVE-8038
 URL: https://issues.apache.org/jira/browse/HIVE-8038
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Pankit Thapar
 Fix For: 0.14.0

 Attachments: HIVE-8038.patch


 What is the Current Logic
 ==
 1.get the file blocks from FileSystem.getFileBlockLocations() which returns 
 an array of BlockLocation
 2.In SplitGenerator.createSplit(), check if split only spans one block or 
 multiple blocks.
 3.If split spans just one block, then using the array index (index = 
 offset/blockSize), get the corresponding host having the blockLocation
 4.If the split spans multiple blocks, then get all hosts that have at least 
 80% of the max of total data in split hosted by any host.
 5.add the split to a list of splits
 Issue with Current Logic
 =
 Dependency on FileSystem API’s logic for block location calculations. It 
 returns an array and we need to rely on FileSystem to  
 make all blocks of same size if we want to directly access a block from the 
 array.
  
 What is the Fix
 =
 1a.get the file blocks from FileSystem.getFileBlockLocations() which returns 
 an array of BlockLocation
 1b.convert the array into a tree map <offset, BlockLocation> and return it
 through getLocationsWithOffSet()
 2.In SplitGenerator.createSplit(), check if split only spans one block or 
 multiple blocks.
 3.If split spans just one block, then using Tree.floorEntry(key), get the 
 highest entry smaller than offset for the split and get the corresponding 
 host.
 4a.If the split spans multiple blocks, get a submap, which contains all 
 entries containing blockLocations from the offset to offset + length
 4b.get all hosts that have at least 80% of the max of total data in split 
 hosted by any host.
 5.add the split to a list of splits
 What are the major changes in logic
 ==
 1. store BlockLocations in a Map instead of an array
 2. Call SHIMS.getLocationsWithOffSet() instead of getLocations()
 3. the one-block case is checked by if(offset + length <= start.getOffset() +
 start.getLength()) instead of if((offset % blockSize) + length <=
 blockSize)
 What is the effect on Complexity (Big O)
 =
 1. We add an O(n) loop to build a TreeMap from an array, but it's a one-time
 cost and is not paid for each split
 2. In the one-block case, we can get the block in O(log n) worst case,
 which was O(1) before
 3. Getting the submap is O(log n)
 4. In the multiple-block case, building the list of hosts is O(m), where m < n,
 which was O(n), as previously we iterated
 over all the block locations but now we iterate over only the blocks
 that belong to the range of offsets we need.
 What are the benefits of the change
 ==
 1. With this fix, we do not depend on the blockLocations returned by 
 FileSystem to figure out the block corresponding to the offset and blockSize
 2. Also, block lengths are not necessarily the same for all blocks on all
 FileSystems
 3. Previously we were using blockSize for the one-block case and block.length
 for the multiple-block case; now we determine the block from its actual
 length and offset



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7086) TestHiveServer2.testConnection is failing on trunk

2014-09-10 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128724#comment-14128724
 ] 

Ashutosh Chauhan commented on HIVE-7086:


+1

 TestHiveServer2.testConnection is failing on trunk
 --

 Key: HIVE-7086
 URL: https://issues.apache.org/jira/browse/HIVE-7086
 Project: Hive
  Issue Type: Test
  Components: HiveServer2, JDBC
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Vaibhav Gumashta
 Fix For: 0.14.0

 Attachments: HIVE-7086.1.patch, HIVE-7086.2.patch, HIVE-7086.3.patch


 Able to repro locally on fresh checkout



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7812) Disable CombineHiveInputFormat when ACID format is used

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128749#comment-14128749
 ] 

Hive QA commented on HIVE-7812:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667764/HIVE-7812.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6193 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_select_dummy_source
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/726/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/726/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-726/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667764

 Disable CombineHiveInputFormat when ACID format is used
 ---

 Key: HIVE-7812
 URL: https://issues.apache.org/jira/browse/HIVE-7812
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-7812.patch, HIVE-7812.patch, HIVE-7812.patch


 Currently CombineHiveInputFormat complains when called on an ACID
 directory. Modify CombineHiveInputFormat so that HiveInputFormat is used
 instead if the directory is in ACID format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Debugging Hive frontend in eclipse

2014-09-10 Thread Thejas Nair
There have been some threads on how to get around the metastore
initialization issue.
But another easy way to work around this issue is to build hive, and
then run hive --debug . Hive will wait for the debugger to connect on
port 8000. You can configure eclipse debugging to connect to that
port.


On Wed, Sep 10, 2014 at 7:09 AM, Saumitra Shahapure
saumitra.offic...@gmail.com wrote:
 Hello,

 I am new to Hive dev community,

 I am trying to debug Hive frontend (till semantic analysis) from eclipse. I
 want to start from Main in CliDriver. I don't want to go debugging till
 execution and don't care if it fails.

 As described in
 https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide#DeveloperGuide-DebuggingHiveCode
 , I am able to debug TestCliDriver from eclipse and several tests pass
 showing that metastore is working fine.

 The problem is that, when I start debugging from CliDriver, the metastore is
 not initialized properly, so semantic analysis fails at the getMetadata call.

 Is any additional setup required to get metadata work properly from eclipse
 debugging?

 -- Saumitra S. Shahapure



[jira] [Commented] (HIVE-8038) Decouple ORC files split calculation logic from Filesystem's get file location implementation

2014-09-10 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128768#comment-14128768
 ] 

Gopal V commented on HIVE-8038:
---

This is an interesting change-set. 

bq. 4a.If the split spans multiple blocks, get a submap, which contains all 
entries containing blockLocations from the offset to offset + length

For ORC to be really fast, we enforce that a stripe (the smallest split you can 
get) always fits within a block - this is true for HDFS at least, because it 
can specify a preferred block size when creating files.
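
For reference, a minimal sketch of that HDFS facility (the path and sizes are
made up): a writer can request a per-file block size at create() time, which is
how a stripe can be made to fit inside a single block.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only; path and block size are illustrative.
public final class PreferredBlockSizeDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    long blockSize = 256L * 1024 * 1024; // at least one stripe, for example
    FSDataOutputStream out = fs.create(
        new Path("/tmp/example.orc"),
        true,                                      // overwrite
        conf.getInt("io.file.buffer.size", 4096),  // buffer size
        fs.getDefaultReplication(),                // replication factor
        blockSize);                                // preferred block size
    out.close();
  }
}
{code}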

From an elegance point of view, I like the TreeMap.floorEntry() over a for 
loop - but I have never seen the 4A/4B scenarios when using Hive-13.

bq. 2. Also, block lengths are not necessarily the same for all blocks on all
FileSystems

This is something to be fixed anyway - as HDFS-3689 will allow variable length 
blocks in HDFS as well.

 Decouple ORC files split calculation logic from Filesystem's get file 
 location implementation
 -

 Key: HIVE-8038
 URL: https://issues.apache.org/jira/browse/HIVE-8038
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Pankit Thapar
 Fix For: 0.14.0

 Attachments: HIVE-8038.patch


 What is the Current Logic
 ==
 1.get the file blocks from FileSystem.getFileBlockLocations() which returns 
 an array of BlockLocation
 2.In SplitGenerator.createSplit(), check if split only spans one block or 
 multiple blocks.
 3.If split spans just one block, then using the array index (index = 
 offset/blockSize), get the corresponding host having the blockLocation
 4.If the split spans multiple blocks, then get all hosts that have at least 
 80% of the max of total data in split hosted by any host.
 5.add the split to a list of splits
 Issue with Current Logic
 =
 Dependency on FileSystem API’s logic for block location calculations. It 
 returns an array and we need to rely on FileSystem to  
 make all blocks of same size if we want to directly access a block from the 
 array.
  
 What is the Fix
 =
 1a.get the file blocks from FileSystem.getFileBlockLocations() which returns 
 an array of BlockLocation
 1b.convert the array into a tree map <offset, BlockLocation> and return it
 through getLocationsWithOffSet()
 2.In SplitGenerator.createSplit(), check if split only spans one block or 
 multiple blocks.
 3.If split spans just one block, then using Tree.floorEntry(key), get the 
 highest entry smaller than offset for the split and get the corresponding 
 host.
 4a.If the split spans multiple blocks, get a submap, which contains all 
 entries containing blockLocations from the offset to offset + length
 4b.get all hosts that have at least 80% of the max of total data in split 
 hosted by any host.
 5.add the split to a list of splits
 What are the major changes in logic
 ==
 1. store BlockLocations in a Map instead of an array
 2. Call SHIMS.getLocationsWithOffSet() instead of getLocations()
 3. the one-block case is checked by if(offset + length <= start.getOffset() +
 start.getLength()) instead of if((offset % blockSize) + length <=
 blockSize)
 What is the effect on Complexity (Big O)
 =
 1. We add an O(n) loop to build a TreeMap from an array, but it's a one-time
 cost and is not paid for each split
 2. In the one-block case, we can get the block in O(log n) worst case,
 which was O(1) before
 3. Getting the submap is O(log n)
 4. In the multiple-block case, building the list of hosts is O(m), where m < n,
 which was O(n), as previously we iterated
 over all the block locations but now we iterate over only the blocks
 that belong to the range of offsets we need.
 What are the benefits of the change
 ==
 1. With this fix, we do not depend on the blockLocations returned by 
 FileSystem to figure out the block corresponding to the offset and blockSize
 2. Also, block lengths are not necessarily the same for all blocks on all
 FileSystems
 3. Previously we were using blockSize for the one-block case and block.length
 for the multiple-block case; now we determine the block from its actual
 length and offset



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7984) AccumuloOutputFormat Configuration items from StorageHandler not re-set in Configuration in Tez

2014-09-10 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HIVE-7984:
-
Attachment: HIVE-7984-1.patch

Same changes, but I named the original attachment wrong. Fixing the suffix to
trigger Hive QA.

 AccumuloOutputFormat Configuration items from StorageHandler not re-set in 
 Configuration in Tez
 ---

 Key: HIVE-7984
 URL: https://issues.apache.org/jira/browse/HIVE-7984
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler, Tez
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 0.14.0

 Attachments: HIVE-7984-1.diff, HIVE-7984-1.patch


 Ran AccumuloStorageHandler queries with Tez and found that configuration
 elements that are pulled from the {{-hiveconf}} and passed to the
 inputJobProperties or outputJobProperties by the AccumuloStorageHandler
 aren't available inside the Tez container.
 I'm guessing that there is a disconnect between the configuration that the
 StorageHandler creates and what the Tez container sees.
 The HBaseStorageHandler likely doesn't run into this because it expects to
 have hbase-site.xml available via tmpjars (and can extrapolate connection
 information from that file). Accumulo's site configuration file is not meant
 to be shared with consumers, which means that this exact approach is not
 sufficient.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Hive User Group Meeting

2014-09-10 Thread Xuefu Zhang
Hi all,

I'm very excited as we are just about one month away from the meetup. Here
is a list of talks that will be delivered in the coming Hive user group
meeting.

1. Julian Hyde, cost-based optimization, Optiq, and materialized views
2. Xuefu Zhang, Hive on Spark
3. George Chow, Updates on Hive Thrift Protocol
4. Prasad Mujumdar, What's new in Apache Sentry

We still have a couple of slots open, so please let me know if you're
interested in giving a talk. In the meantime, please RSVP if you plan to
join the event.

Thanks,
Xuefu



On Tue, Aug 26, 2014 at 6:37 PM, Xuefu Zhang xzh...@cloudera.com wrote:

 Dear Apache Hive users and developers,

 The next Hive user group meeting mentioned previously was officially
 announced here:
 http://www.meetup.com/Hive-User-Group-Meeting/events/202007872/. As it's
 only about one and a half months away, please RSVP if you plan to go so that
 the organizers can plan the meeting accordingly.

 Currently, we still have a few talk slots open. Please let me know if
 you're interested in giving a talk.

 Regards,
 Xuefu


 On Mon, Jul 7, 2014 at 6:01 PM, Xuefu Zhang xzh...@cloudera.com wrote:

 Dear Hive users,

  The Hive community is considering a user group meeting during Hadoop World,
  which will be held in New York October 15-17th. To make this happen, your
  support is essential. First, I'm wondering if any user, especially those in
  the New York area, would be willing to host the meetup. Secondly, I'm
  soliciting talks from users as well as developers, so please propose or share
  your thoughts on the contents of the meetup.

  I will soon set up a meetup event to formally announce this. In the
 meantime, your suggestions, comments, and kind assistance are greatly
 appreciated.

 Sincerely,

 Xuefu





[jira] [Updated] (HIVE-8022) Recursive root scratch directory creation is not using hdfs umask properly

2014-09-10 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-8022:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks for the review [~thejas]!

 Recursive root scratch directory creation is not using hdfs umask properly 
 ---

 Key: HIVE-8022
 URL: https://issues.apache.org/jira/browse/HIVE-8022
 Project: Hive
  Issue Type: Bug
  Components: CLI, HiveServer2
Affects Versions: 0.14.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.14.0

 Attachments: HIVE-8022.1.patch, HIVE-8022.2.patch, HIVE-8022.3.patch


 Changes made in HIVE-6847 removed the helper methods that were added in
 HIVE-7001 to get around this problem. Since the root scratch dir must be
 writable by all, its creation should use those methods.
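
 A minimal sketch of the idea behind those helpers (assumed shape, not the
 actual HIVE-7001 code):
 {code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// Sketch only: mkdirs() filters the requested permission through the fs
// umask, so the desired bits must be re-applied explicitly when the root
// scratch dir needs to be writable by all.
public final class ScratchDirUtil {
  public static void createScratchDir(FileSystem fs, Path dir) throws IOException {
    FsPermission writableByAll = new FsPermission((short) 0733);
    fs.mkdirs(dir, writableByAll);          // result may be masked by the fs umask
    fs.setPermission(dir, writableByAll);   // chmod so the bits survive the umask
  }
}
 {code}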



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7818) Support boolean PPD for ORC

2014-09-10 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7818:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~daijy] for the patch.

 Support boolean PPD for ORC
 ---

 Key: HIVE-7818
 URL: https://issues.apache.org/jira/browse/HIVE-7818
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.14.0

 Attachments: HIVE-7818.1.patch


 Currently ORC does collect stats for boolean fields. However, the boolean
 stats are not range based; instead, they track the count of true records.
 RecordReaderImpl.evaluatePredicate currently only deals with range-based
 stats; we need to improve it to deal with the boolean stats.
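
 To illustrate (names below are invented, not ORC's actual API), count-based
 boolean stats are enough to eliminate stripes:
 {code:java}
// Illustrative only. With just trueCount and valueCount per stripe, a boolean
// predicate can still skip stripes where the sought value provably never occurs.
public final class BooleanPpdSketch {
  enum TruthValue { YES, NO, YES_NO }

  static TruthValue evaluate(boolean sought, long trueCount, long valueCount) {
    long matching = sought ? trueCount : valueCount - trueCount;
    if (matching == 0) {
      return TruthValue.NO;      // no row can match: skip the stripe
    }
    if (matching == valueCount) {
      return TruthValue.YES;     // every row matches
    }
    return TruthValue.YES_NO;    // mixed: the stripe must be read
  }
}
 {code}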



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6109) Support customized location for EXTERNAL tables created by Dynamic Partitioning

2014-09-10 Thread karthik (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128873#comment-14128873
 ] 

karthik commented on HIVE-6109:
---

Satish Mittal,

Your participation in this forum is phenomenal and very helpful for new users
like me.

I need to use dynamic partitioning with a custom pattern, and I am missing
something very obvious.

Do I need to set hcat.dynamic.partitioning.custom.pattern in the Hive CLI,
since HCatalog and Hive are integrated together? My partition path for the
external table is, as usual, like data/year=2013/month=jan/,
but I need data/year/month. Do I need to amend the location of this external
table?

Please accept my apologies if I sound very basic. Thanks in advance.

 Support customized location for EXTERNAL tables created by Dynamic 
 Partitioning
 ---

 Key: HIVE-6109
 URL: https://issues.apache.org/jira/browse/HIVE-6109
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Reporter: Satish Mittal
Assignee: Satish Mittal
 Fix For: 0.13.0

 Attachments: HIVE-6109.1.patch.txt, HIVE-6109.2.patch.txt, 
 HIVE-6109.3.patch.txt, HIVE-6109.pdf


 Currently when dynamic partitions are created by HCatalog, the underlying 
 directories for the partitions are created in a fixed 'Hive-style' format, 
 i.e. root_dir/key1=value1/key2=value2/ and so on. However, in the case of
 external tables, the user should be able to control the format of the
 directories created for dynamic partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8038) Decouple ORC files split calculation logic from Filesystem's get file location implementation

2014-09-10 Thread Pankit Thapar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128871#comment-14128871
 ] 

Pankit Thapar commented on HIVE-8038:
-

Hi,

Thanks for the feedback.
1. The use case where the split may span more than one block is when
Math.min(MAX_BLOCK_SIZE, 2 * stripeSize) returns MAX_BLOCK_SIZE as the block
size for the file.
Example: if the stripe size is 512 MB and the block size is 400 MB, a split
would span more than one block.

2. I see that HDFS wants to support variable-length blocks, but what I meant was
to remove the usage of the blockSize variable altogether, since a uniform block
size does not hold for all FileSystems. We want to generalize the logic for
FileSystems apart from HDFS.

 Decouple ORC files split calculation logic from Filesystem's get file 
 location implementation
 -

 Key: HIVE-8038
 URL: https://issues.apache.org/jira/browse/HIVE-8038
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Pankit Thapar
 Fix For: 0.14.0

 Attachments: HIVE-8038.patch


 What is the Current Logic
 ==
 1.get the file blocks from FileSystem.getFileBlockLocations() which returns 
 an array of BlockLocation
 2.In SplitGenerator.createSplit(), check if split only spans one block or 
 multiple blocks.
 3.If split spans just one block, then using the array index (index = 
 offset/blockSize), get the corresponding host having the blockLocation
 4.If the split spans multiple blocks, then get all hosts that have at least 
 80% of the max of total data in split hosted by any host.
 5.add the split to a list of splits
 Issue with Current Logic
 =
 Dependency on FileSystem API’s logic for block location calculations. It 
 returns an array and we need to rely on FileSystem to  
 make all blocks of same size if we want to directly access a block from the 
 array.
  
 What is the Fix
 =
 1a.get the file blocks from FileSystem.getFileBlockLocations() which returns 
 an array of BlockLocation
 1b.convert the array into a tree map <offset, BlockLocation> and return it
 through getLocationsWithOffSet()
 2.In SplitGenerator.createSplit(), check if split only spans one block or 
 multiple blocks.
 3.If split spans just one block, then using Tree.floorEntry(key), get the 
 highest entry smaller than offset for the split and get the corresponding 
 host.
 4a.If the split spans multiple blocks, get a submap, which contains all 
 entries containing blockLocations from the offset to offset + length
 4b.get all hosts that have at least 80% of the max of total data in split 
 hosted by any host.
 5.add the split to a list of splits
 What are the major changes in logic
 ==
 1. store BlockLocations in a Map instead of an array
 2. Call SHIMS.getLocationsWithOffSet() instead of getLocations()
 3. the one-block case is checked by if(offset + length <= start.getOffset() +
 start.getLength()) instead of if((offset % blockSize) + length <=
 blockSize)
 What is the effect on Complexity (Big O)
 =
 1. We add an O(n) loop to build a TreeMap from an array, but it's a one-time
 cost and is not paid for each split
 2. In the one-block case, we can get the block in O(log n) worst case,
 which was O(1) before
 3. Getting the submap is O(log n)
 4. In the multiple-block case, building the list of hosts is O(m), where m < n,
 which was O(n), as previously we iterated
 over all the block locations but now we iterate over only the blocks
 that belong to the range of offsets we need.
 What are the benefits of the change
 ==
 1. With this fix, we do not depend on the blockLocations returned by 
 FileSystem to figure out the block corresponding to the offset and blockSize
 2. Also, block lengths are not necessarily the same for all blocks on all
 FileSystems
 3. Previously we were using blockSize for the one-block case and block.length
 for the multiple-block case; now we determine the block from its actual
 length and offset



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Debugging Hive frontend in eclipse

2014-09-10 Thread Saumitra Shahapure
Hey Thejas,

It seems that hive --debug is also not smooth.
I ran the script build/dist/bin/hive --debug after clean build

It gives error

ERROR: Cannot load this JVM TI agent twice, check your java command line
for duplicate jdwp options.
Error occurred during initialization of VM
agent library failed to init: jdwp

Am I missing something here?

-- Saumitra S. Shahapure

On Wed, Sep 10, 2014 at 10:49 PM, Thejas Nair the...@hortonworks.com
wrote:

 There have been some threads on how to get around the metastore
 initialization issue.
 But another easy way to work around this issue is to build hive, and
 then run hive --debug . Hive will wait for the debugger to connect on
 port 8000. You can configure eclipse debugging to connect to that
 port.


 On Wed, Sep 10, 2014 at 7:09 AM, Saumitra Shahapure
 saumitra.offic...@gmail.com wrote:
  Hello,
 
  I am new to Hive dev community,
 
  I am trying to debug Hive frontend (till semantic analysis) from
 eclipse. I
  want to start from Main in CliDriver. I don't want to go debugging till
  execution and don't care if it fails.
 
  As described in
 
 https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide#DeveloperGuide-DebuggingHiveCode
  , I am able to debug TestCliDriver from eclipse and several tests pass
  showing that metastore is working fine.
 
   The problem is that, when I start debugging from CliDriver, the metastore is
   not initialized properly, so semantic analysis fails at the getMetadata call.
 
  Is any additional setup required to get metadata work properly from
 eclipse
  debugging?
 
  -- Saumitra S. Shahapure




Re: Review Request 25178: Add DROP TABLE PURGE

2014-09-10 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25178/#review52911
---



metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/25178/#comment92121

Nit: should we just pass ifPurge as a boolean to the method, unless envContext
is also used for something else? That would seemingly make the called method
cleaner.
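
A sketch of the suggestion (signatures invented for illustration, not the
actual patch):
{code:java}
// Sketch of the nit above; names and signatures are illustrative only.
interface TableDropper {
  // before: the single flag travels hidden inside a generic env context map
  void dropTable(String dbName, String tableName,
                 java.util.Map<String, String> envContext);

  // after: the one flag the callee actually needs is passed explicitly
  void dropTable(String dbName, String tableName, boolean ifPurge);
}
{code}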


- Xuefu Zhang


On Sept. 9, 2014, 6:51 p.m., david seraf wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25178/
 ---
 
 (Updated Sept. 9, 2014, 6:51 p.m.)
 
 
 Review request for hive and Xuefu Zhang.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add PURGE option to DROP TABLE command to skip saving table data to the trash
 
 
 Diffs
 -
 
   
 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitionPublish.java
  be7134f 
   
 hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/tool/TestTempletonUtils.java
  af952f2 
   
 itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2.java
  da51a55 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 9489949 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
 a94a7a3 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreFsImpl.java 
 cff0718 
   metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
 cbdba30 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreFS.java 
 a141793 
   metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 613b709 
   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java e387b8f 
   
 ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
  4cf98d8 
   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
 f31a409 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 32db0c7 
   ql/src/java/org/apache/hadoop/hive/ql/plan/DropTableDesc.java ba30e1f 
   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 406aae9 
   ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveRemote.java 1a5ba87 
   ql/src/test/queries/clientpositive/drop_table_purge.q PRE-CREATION 
   ql/src/test/results/clientpositive/drop_table_purge.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25178/diff/
 
 
 Testing
 ---
 
  Added a code test and a QL test. Tests passed in CI, but other, unrelated
  tests failed.
 
 
 Thanks,
 
 david seraf
 




[jira] [Updated] (HIVE-8037) CBO: Refactor Join condn gen code, loosen restrictions on Join Conditions

2014-09-10 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-8037:
-
Attachment: HIVE-8037.1.patch

 CBO: Refactor Join condn gen code, loosen restrictions on Join Conditions
 -

 Key: HIVE-8037
 URL: https://issues.apache.org/jira/browse/HIVE-8037
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-8037.1.patch, HIVE-8037.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Debugging Hive frontend in eclipse

2014-09-10 Thread Thejas Nair
I have never seen that before. Maybe you have some env setting (hadoop
or hive) that is messing with it?
Edit the shell script to print the 'java' command it is running and
see if you can figure out what is wrong.
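
A minimal way to do that, assuming the usual bin/hive launcher script (exact 
variable names vary by version), is to run it under shell tracing and look for 
duplicate jdwp agents:

{code}
# Trace every command the launcher runs, including the final java/hadoop
# invocation, and check for a duplicate -agentlib:jdwp option:
bash -x build/dist/bin/hive --debug 2>&1 | grep -i jdwp

# Also check whether the environment already carries a debug agent:
echo "$HADOOP_OPTS"
echo "$HIVE_OPTS"
{code}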


On Wed, Sep 10, 2014 at 11:26 AM, Saumitra Shahapure
saumitra.offic...@gmail.com wrote:
 Hey Thejas,

 It seems that hive --debug is also not working smoothly.
 I ran the script build/dist/bin/hive --debug after a clean build.

 It gives this error:

 ERROR: Cannot load this JVM TI agent twice, check your java command line
 for duplicate jdwp options.
 Error occurred during initialization of VM
 agent library failed to init: jdwp

 Am I missing something here?

 -- Saumitra S. Shahapure

 On Wed, Sep 10, 2014 at 10:49 PM, Thejas Nair the...@hortonworks.com
 wrote:

 There have been some threads on how to get around the metastore
 initialization issue.
 But another easy way to work around this issue is to build hive, and
 then run hive --debug. Hive will wait for the debugger to connect on
 port 8000. You can configure eclipse debugging to connect to that
 port.


 On Wed, Sep 10, 2014 at 7:09 AM, Saumitra Shahapure
 saumitra.offic...@gmail.com wrote:
  Hello,
 
  I am new to Hive dev community,
 
  I am trying to debug the Hive frontend (up to semantic analysis) from
  eclipse. I want to start from Main in CliDriver. I don't need to debug all
  the way to execution and don't care if it fails there.
 
  As described in
 
 https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide#DeveloperGuide-DebuggingHiveCode
  , I am able to debug TestCliDriver from eclipse and several tests pass
  showing that metastore is working fine.
 
  The problem is that when I start debugging from CliDriver, the metastore is
  not initialized properly, so semantic analysis fails at the getMetadata
  call.
 
  Is any additional setup required to get metadata access working properly
  when debugging from eclipse?
 
  -- Saumitra S. Shahapure

 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


[jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.

2014-09-10 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128907#comment-14128907
 ] 

Xuefu Zhang commented on HIVE-7100:
---

Patch looks good to me. I have a minor comment on RB.

BTW, could you please give the review request a title reflecting this JIRA, 
and fill in the JIRA number in the Bugs field, for easy navigation?

 Users of hive should be able to specify skipTrash when dropping tables.
 ---

 Key: HIVE-7100
 URL: https://issues.apache.org/jira/browse/HIVE-7100
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Ravi Prakash
Assignee: Jayesh
 Attachments: HIVE-7100.1.patch, HIVE-7100.2.patch, HIVE-7100.3.patch, 
 HIVE-7100.4.patch, HIVE-7100.5.patch, HIVE-7100.8.patch, HIVE-7100.patch


 Users of our clusters are often running up against their quota limits because 
 of Hive tables. When they drop tables, they then have to manually delete the 
 files from HDFS using skipTrash. This is cumbersome and unnecessary. We 
 should enable users to skipTrash directly when dropping tables.
 We should also be able to provide this functionality without polluting the SQL 
 syntax.
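
 For context, the manual purge users resort to today (paths are illustrative; 
 HDFS trash keeps the original path under .Trash/Current):

 {code}
 # After DROP TABLE, the data sits in the HDFS trash and still counts
 # against quota, so users delete it by hand:
 hadoop fs -rm -r -skipTrash /user/alice/.Trash/Current/user/hive/warehouse/big_tmp
 {code}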



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7936) Support for handling Thrift Union types

2014-09-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128909#comment-14128909
 ] 

Hive QA commented on HIVE-7936:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12667771/HIVE-7936.2.patch

{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 6193 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_case_sensitivity
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnarserde_create_shortcut
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_columnarserde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_dynamicserde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_lazyserde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_testxpath
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_testxpath2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_testxpath3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_testxpath4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_thrift
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_case_thrift
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_isnull_isnotnull
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_size
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union21
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver_udf_example_arraymapstruct
org.apache.hadoop.hive.ql.parse.TestParse.testParse_case_sensitivity
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/727/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/727/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-727/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 22 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12667771

 Support for handling Thrift Union types 
 

 Key: HIVE-7936
 URL: https://issues.apache.org/jira/browse/HIVE-7936
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.1
Reporter: Suma Shivaprasad
Assignee: Suma Shivaprasad
 Fix For: 0.14.0

 Attachments: HIVE-7936.1.patch, HIVE-7936.2.patch, HIVE-7936.patch, 
 complex.seq


 Currently Hive does not support Thrift unions through ThriftDeserializer. 
 Support for these needs to be added.
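
 For reference, a minimal Thrift IDL union of the kind the SerDe needs to 
 handle (field and type names are made up for illustration):

 {code}
 // Thrift encodes exactly one field of a union as set at any time;
 // the deserializer must surface this as a Hive uniontype.
 union ExampleUnion {
   1: string stringVal;
   2: i32 intVal;
 }
 {code}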



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.

2014-09-10 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-7100:
-

Assignee: david serafini  (was: Jayesh)

 Users of hive should be able to specify skipTrash when dropping tables.
 ---

 Key: HIVE-7100
 URL: https://issues.apache.org/jira/browse/HIVE-7100
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Ravi Prakash
Assignee: david serafini
 Attachments: HIVE-7100.1.patch, HIVE-7100.2.patch, HIVE-7100.3.patch, 
 HIVE-7100.4.patch, HIVE-7100.5.patch, HIVE-7100.8.patch, HIVE-7100.patch


 Users of our clusters are often running up against their quota limits because 
 of Hive tables. When they drop tables, they then have to manually delete the 
 files from HDFS using skipTrash. This is cumbersome and unnecessary. We 
 should enable users to skipTrash directly when dropping tables.
 We should also be able to provide this functionality without polluting the SQL 
 syntax.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8037) CBO: Refactor Join condn gen code, loosen restrictions on Join Conditions

2014-09-10 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-8037:
-
Attachment: HIVE-8037.2.patch

 CBO: Refactor Join condn gen code, loosen restrictions on Join Conditions
 -

 Key: HIVE-8037
 URL: https://issues.apache.org/jira/browse/HIVE-8037
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-8037.1.patch, HIVE-8037.2.patch, HIVE-8037.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8039) [CBO] Handle repeated alias

2014-09-10 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-8039:
--

 Summary: [CBO] Handle repeated alias
 Key: HIVE-8039
 URL: https://issues.apache.org/jira/browse/HIVE-8039
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


Relax the CBO restriction that disallows repeated aliases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 25513: Handle repeated alias.

2014-09-10 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25513/
---

Review request for hive and John Pullokkaran.


Bugs: HIVE-8039
https://issues.apache.org/jira/browse/HIVE-8039


Repository: hive-git


Description
---

Handle repeated alias.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/RowResolver.java 22295a1 

Diff: https://reviews.apache.org/r/25513/diff/


Testing
---

limit_pushdown.q


Thanks,

Ashutosh Chauhan



[jira] [Updated] (HIVE-8037) CBO: Refactor Join condn gen code, loosen restrictions on Join Conditions

2014-09-10 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-8037:
-
Attachment: (was: HIVE-8037.3.patch)

 CBO: Refactor Join condn gen code, loosen restrictions on Join Conditions
 -

 Key: HIVE-8037
 URL: https://issues.apache.org/jira/browse/HIVE-8037
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-8037.1.patch, HIVE-8037.2.patch, HIVE-8037.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8037) CBO: Refactor Join condn gen code, loosen restrictions on Join Conditions

2014-09-10 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-8037:
-
Attachment: HIVE-8037.3.patch

 CBO: Refactor Join condn gen code, loosen restrictions on Join Conditions
 -

 Key: HIVE-8037
 URL: https://issues.apache.org/jira/browse/HIVE-8037
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-8037.1.patch, HIVE-8037.2.patch, HIVE-8037.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7859) Tune zlib compression in ORC to account for the encoding strategy

2014-09-10 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128942#comment-14128942
 ] 

Prasanth J commented on HIVE-7859:
--

I like the patch. LGTM +1. Pending unit test runs.

Under the COMPRESSION strategy, have you tried using zlib.BEST_COMPRESSION 
instead of zlib.DEFAULT_COMPRESSION to see the trade-off between space and time?


 Tune zlib compression in ORC to account for the encoding strategy
 -

 Key: HIVE-7859
 URL: https://issues.apache.org/jira/browse/HIVE-7859
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-7859.1.patch, HIVE-7859.2.patch


 Currently ORC zlib is slow because several of the compression strategies zlib 
 applies are already done by ORC itself (dictionary, RLE, bit-packing).
 We need to pick between Z_FILTERED, Z_HUFFMAN_ONLY, Z_RLE, Z_FIXED and 
 Z_DEFAULT_STRATEGY according to the column stream type.
 For instance, an RLE_V2 stream could use Z_FILTERED compression without 
 invoking the rest of the strategies.
 The string streams can use the Z_FIXED compression strategy, and so on.
 The core constraint is to retain compatibility with the default 
 decompressor, so that these changes are automatically backward compatible.
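
 As a rough illustration with java.util.zip (note: the stock JDK Deflater only 
 exposes DEFAULT_STRATEGY, FILTERED and HUFFMAN_ONLY; Z_RLE and Z_FIXED are 
 zlib C-level strategies, so reaching them would need a custom codec). The 
 stream-kind names below are illustrative, not ORC's actual writer hooks:

 {code}
 import java.util.zip.Deflater;

 public class ZlibStrategySketch {
   // Pick a Deflater strategy per (illustrative) stream kind.
   static Deflater deflaterFor(String streamKind) {
     Deflater d = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
     switch (streamKind) {
       case "RLE_V2":      // already run-length encoded by ORC
       case "DICTIONARY":  // small integers after dictionary encoding
         d.setStrategy(Deflater.FILTERED);
         break;
       case "STRING_DATA": // raw bytes: rely on Huffman coding alone
         d.setStrategy(Deflater.HUFFMAN_ONLY);
         break;
       default:
         d.setStrategy(Deflater.DEFAULT_STRATEGY);
     }
     return d;
   }

   public static void main(String[] args) {
     Deflater d = deflaterFor("RLE_V2");
     byte[] input = "aaaaaaaaaabbbbbbbbbb".getBytes();
     d.setInput(input);
     d.finish();
     byte[] out = new byte[64];
     System.out.println("compressed bytes: " + d.deflate(out));
     d.end();
   }
 }
 {code}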



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

