[jira] [Updated] (HIVE-3475) INLINE UDTF doesn't convert types properly

2012-12-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3475:


 Assignee: Navis
Affects Version/s: (was: 0.10.0)
   Status: Patch Available  (was: Open)

 INLINE UDTF doesn't convert types properly
 --

 Key: HIVE-3475
 URL: https://issues.apache.org/jira/browse/HIVE-3475
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Igor Kabiljo
Assignee: Navis
Priority: Minor
 Attachments: HIVE-3475.D7461.1.patch


 I suppose the issue is in the line:
 this.forwardObj[i] = res.convertIfNecessary(rowList.get(i), f.getFieldObjectInspector());
 there is never a reason for conversion; it should just be:
 this.forwardObj[i] = rowList.get(i)
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable 
 cannot be cast to java.lang.Long
   at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:39)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:203)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427)
   at 
 org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.serialize(ColumnarSerDe.java:169)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:569)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
   at 
 org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator.processOp(LateralViewJoinOperator.java:133)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
   at 
 org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:112)
   at 
 org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:44)
   at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:81)
   at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:63)
   at 
 org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:98)
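
 For illustration only, a minimal sketch of the suggested change (hypothetical helper and variable names; the actual fix is in the attached HIVE-3475.D7461.1.patch):

    import java.util.List;

    // Sketch of the suggestion above: forward the row values untouched instead of
    // running them through convertIfNecessary, which (per the report) yields
    // objects that no longer match the declared field ObjectInspectors.
    class InlineForwardSketch {
        static Object[] fillForwardObj(List<Object> rowList, Object[] forwardObj) {
            for (int i = 0; i < forwardObj.length; i++) {
                // was: forwardObj[i] = convertIfNecessary(rowList.get(i), fieldOI);
                forwardObj[i] = rowList.get(i);
            }
            return forwardObj;
        }
    }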

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3475) INLINE UDTF doesn't convert types properly

2012-12-18 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3475:
--

Attachment: HIVE-3475.D7461.1.patch

navis requested code review of HIVE-3475 [jira] INLINE UDTF doesn't convert 
types properly.
Reviewers: JIRA

  DPAL-1904 INLINE UDTF does not convert type properly

  I suppose the issue is in the line:
  this.forwardObj[i] = res.convertIfNecessary(rowList.get(i), f.getFieldObjectInspector());

  there is never a reason for conversion; it should just be:
  this.forwardObj[i] = rowList.get(i)

  Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable 
cannot be cast to java.lang.Long
at 
org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:39)
at 
org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:203)
at 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427)
at 
org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.serialize(ColumnarSerDe.java:169)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:569)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
at 
org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator.processOp(LateralViewJoinOperator.java:133)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
at 
org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:112)
at 
org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:44)
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:81)
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:63)
at 
org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:98)

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D7461

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFInline.java
  ql/src/test/queries/clientpositive/udf_inline.q
  ql/src/test/results/clientpositive/udf_inline.q.out

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/17901/

To: JIRA, navis


 INLINE UDTF doesn't convert types properly
 --

 Key: HIVE-3475
 URL: https://issues.apache.org/jira/browse/HIVE-3475
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Igor Kabiljo
Assignee: Navis
Priority: Minor
 Attachments: HIVE-3475.D7461.1.patch


 I suppose the issue is in the line:
 this.forwardObj[i] = res.convertIfNecessary(rowList.get(i), f.getFieldObjectInspector());
 there is never a reason for conversion; it should just be:
 this.forwardObj[i] = rowList.get(i)
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable 
 cannot be cast to java.lang.Long
   at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:39)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:203)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427)
   at 
 org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.serialize(ColumnarSerDe.java:169)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:569)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
   at 
 org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator.processOp(LateralViewJoinOperator.java:133)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
   at 
 org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:112)
   at 
 

[jira] [Updated] (HIVE-3796) Multi-insert involving bucketed/sorted table turns off merging on all outputs

2012-12-18 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3796:


Attachment: HIVE-3796.4.patch.txt

 Multi-insert involving bucketed/sorted table turns off merging on all outputs
 -

 Key: HIVE-3796
 URL: https://issues.apache.org/jira/browse/HIVE-3796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3796.1.patch.txt, HIVE-3796.2.patch.txt, 
 HIVE-3796.3.patch.txt, HIVE-3796.4.patch.txt


 When a multi-insert query has at least one output that is bucketed, merging 
 is turned off for all outputs, rather than just the bucketed ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3796) Multi-insert involving bucketed/sorted table turns off merging on all outputs

2012-12-18 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534753#comment-13534753
 ] 

Kevin Wilfong commented on HIVE-3796:
-

Fixed per comments.

 Multi-insert involving bucketed/sorted table turns off merging on all outputs
 -

 Key: HIVE-3796
 URL: https://issues.apache.org/jira/browse/HIVE-3796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3796.1.patch.txt, HIVE-3796.2.patch.txt, 
 HIVE-3796.3.patch.txt, HIVE-3796.4.patch.txt


 When a multi-insert query has at least one output that is bucketed, merging 
 is turned off for all outputs, rather than just the bucketed ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3796) Multi-insert involving bucketed/sorted table turns off merging on all outputs

2012-12-18 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534754#comment-13534754
 ] 

Kevin Wilfong commented on HIVE-3796:
-

Running a full test run again.

 Multi-insert involving bucketed/sorted table turns off merging on all outputs
 -

 Key: HIVE-3796
 URL: https://issues.apache.org/jira/browse/HIVE-3796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3796.1.patch.txt, HIVE-3796.2.patch.txt, 
 HIVE-3796.3.patch.txt, HIVE-3796.4.patch.txt


 When a multi-insert query has at least one output that is bucketed, merging 
 is turned off for all outputs, rather than just the bucketed ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3796) Multi-insert involving bucketed/sorted table turns off merging on all outputs

2012-12-18 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3796:


Status: Patch Available  (was: Open)

 Multi-insert involving bucketed/sorted table turns off merging on all outputs
 -

 Key: HIVE-3796
 URL: https://issues.apache.org/jira/browse/HIVE-3796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3796.1.patch.txt, HIVE-3796.2.patch.txt, 
 HIVE-3796.3.patch.txt, HIVE-3796.4.patch.txt


 When a multi-insert query has at least one output that is bucketed, merging 
 is turned off for all outputs, rather than just the bucketed ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-2693:
-

Attachment: HIVE-2693-12-SortableSerDe.patch

HIVE-2693-12-SortableSerDe.patch adds the DECIMAL type to the LazySortableSerDe. 
It also adds tests for order by asc/desc, group by, distinct and join.

The serialization is similar to what was described before: <sign><factor><digits>

where:
sign = -1, 0, 1 (zero has its own sign now)
factor = the position of the first digit before the decimal point (for numbers >= 1) or of the first non-zero digit after the decimal point (for numbers < 1)
digits = the string of digits of the number, without the decimal point
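
To make the scheme concrete, here is a small self-contained sketch (illustrative only, not the attached serde patch; it assumes factor can be computed as precision minus scale of the trailing-zero-stripped value):

    import java.math.BigDecimal;

    // Illustrative decomposition into (sign, factor, digits) as described above.
    public class SortableDecimalSketch {
        static Object[] decompose(BigDecimal d) {
            BigDecimal n = d.stripTrailingZeros();   // 3.140 -> 3.14
            int sign = n.signum();                   // -1, 0, 1 (zero keeps its own sign)
            if (sign == 0) {
                return new Object[] {0, 0, ""};      // zero carries no factor/digits
            }
            // factor: 3.14 = 0.314 * 10^1 -> 1;  0.0012 = 0.12 * 10^-2 -> -2
            int factor = n.precision() - n.scale();
            String digits = n.unscaledValue().abs().toString();   // "314", "12"
            return new Object[] {sign, factor, digits};
        }

        public static void main(String[] args) {
            // 3.14 and 3.140 decompose identically: [1, 1, 314]
            System.out.println(java.util.Arrays.toString(decompose(new BigDecimal("3.140"))));
            System.out.println(java.util.Arrays.toString(decompose(new BigDecimal("-0.0012"))));
        }
    }

Both 3.14 and 3.140 decompose to (1, 1, 314), which is the property that the later discussion about equality and ordering relies on.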

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-1.patch.txt, HIVE-2693-all.patch, HIVE-2693-fix.patch, 
 HIVE-2693.patch, HIVE-2693-take3.patch, HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: float and double calculation is inaccurate in Hive

2012-12-18 Thread Bharath Mundlapudi
We have solved this issue recently. It is not just a problem in Hive. Contact 
me offline if you need more details. 

-Bharath




 From: Johnny Zhang xiao...@cloudera.com
To: Johnny Zhang xiao...@cloudera.com; Mark Grover 
grover.markgro...@gmail.com; hive dev@hive.apache.org 
Sent: Monday, December 17, 2012 5:13 PM
Subject: Re: Review Request: float and double calculation is inaccurate in Hive
 


 On Dec. 18, 2012, 12:38 a.m., Mark Grover wrote:
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPDivide.java,
   line 50
  https://reviews.apache.org/r/8653/diff/1/?file=240423#file240423line50
 
      10 seems to be a rather arbitrary number for scale. Any particular 
 reason you are using it? Maybe we should invoke the method where no scale 
 needs to be specified.
 
 Johnny Zhang wrote:
     Hi, Mark, thanks for reviewing it. The reason for using 10 is that it is 
the same as the MySQL default precision setting. I just want to make the calculation 
result identical to MySQL's.

I think I did try without specifying the scale, and the result was different from 
MySQL's. I agree that hard-coding the scale is not a good approach. Open to other 
suggestions.


- Johnny


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8653/#review14625
---


On Dec. 18, 2012, 12:37 a.m., Johnny Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/8653/
 ---
 
 (Updated Dec. 18, 2012, 12:37 a.m.)
 
 
 Review request for hive.
 
 
 Description
 ---
 
 I found this while debugging the e2e test failures: Hive miscalculates 
 float and double values. Take a float calculation as an example:
 hive> select f from all100k limit 1;
 48308.98
 hive> select f/10 from all100k limit 1;
 4830.898046875 --added 046875 at the end
 hive> select f*1.01 from all100k limit 1;
 48792.0702734375 --should be 48792.0698
 It might be essentially the same problem as 
 http://effbot.org/pyfaq/why-are-floating-point-calculations-so-inaccurate.htm 
 but since the e2e tests compare the results with MySQL, and MySQL seems to get 
 it right, it is worth fixing in Hive.
 
 
 This addresses bug HIVE-3715.
    https://issues.apache.org/jira/browse/HIVE-3715
 
 
 Diffs
 -
 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPDivide.java
 1423224 
   
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMultiply.java
 1423224 
 
 Diff: https://reviews.apache.org/r/8653/diff/
 
 
 Testing
 ---
 
 I did a test comparing the result with the MySQL default float precision setting; 
 the results are identical.
 
 query:          select f, f*1.01, f/10 from all100k limit 1;
 mysql result:   48309       48792.0702734375    4830.898046875
 hive result:    48308.98    48792.0702734375    4830.898046875
 
 
 I applied this patch and ran the hive e2e tests, and they all pass (without 
 this patch, there are 5 related failures)
 
 
 Thanks,
 
 Johnny Zhang
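
As a self-contained illustration of the behaviour discussed in this thread (not the patch under review): the float closest to 48308.98 carries extra binary digits, while BigDecimal arithmetic with an explicit scale, such as the hard-coded 10 debated above, keeps the decimal result exact.

    import java.math.BigDecimal;
    import java.math.RoundingMode;

    // Minimal sketch of the behaviour under discussion (not Hive's UDF code).
    public class FloatPrecisionDemo {
        public static void main(String[] args) {
            // The float nearest to 48308.98 is 48308.98046875; widening it to a
            // double exposes the extra binary digits behind the drift seen above.
            float f = 48308.98f;
            System.out.println((double) f);   // 48308.98046875

            // The same arithmetic in BigDecimal stays exact in decimal.
            BigDecimal d = new BigDecimal("48308.98");
            System.out.println(d.multiply(new BigDecimal("1.01")));   // 48792.0698
            // Division needs an explicit scale; 10 is the value debated above.
            System.out.println(d.divide(new BigDecimal("10"), 10, RoundingMode.HALF_UP)); // 4830.8980000000
        }
    }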
 


[jira] [Commented] (HIVE-3796) Multi-insert involving bucketed/sorted table turns off merging on all outputs

2012-12-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534768#comment-13534768
 ] 

Namit Jain commented on HIVE-3796:
--

+1

 Multi-insert involving bucketed/sorted table turns off merging on all outputs
 -

 Key: HIVE-3796
 URL: https://issues.apache.org/jira/browse/HIVE-3796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3796.1.patch.txt, HIVE-3796.2.patch.txt, 
 HIVE-3796.3.patch.txt, HIVE-3796.4.patch.txt


 When a multi-insert query has at least one output that is bucketed, merging 
 is turned off for all outputs, rather than just the bucketed ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3785) Core hive changes for HiveServer2 implementation

2012-12-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3785:
-

Status: Open  (was: Patch Available)

initial comments

 Core hive changes for HiveServer2 implementation
 

 Key: HIVE-3785
 URL: https://issues.apache.org/jira/browse/HIVE-3785
 Project: Hive
  Issue Type: Sub-task
  Components: Authentication, Build Infrastructure, Configuration, 
 Thrift API
Affects Versions: 0.10.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HS2-changed-files-only.patch


 The subtask to track changes in the core hive components for HiveServer2 
 implementation

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534942#comment-13534942
 ] 

Mark Grover commented on HIVE-2693:
---

Thanks for contributing, [~hagleitn]! I am just about to take a closer look at 
your latest patch, but as we discussed offline, a packed-byte (radix 256) 
serialization might be a better solution than a digit-based (i.e. radix 10) one. 
Since we would need a sentinel character to terminate this arbitrarily long 
sequence of bytes (regardless of which radix we use), we have 2 options:
A. Use a smaller radix and reserve a particular byte as a terminator.
B. Use a larger radix (like 256) and have an escape character to escape the 
terminator if it appears in the content, similar to what we do for strings.

I would prefer the latter (option B). Any particular concerns from anybody 
about that?
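
As a purely illustrative sketch of option B, assuming a 0x00 terminator and a 0x01 escape byte (hypothetical constants, not the Hive serde code):

    import java.io.ByteArrayOutputStream;

    // Frame a variable-length payload with a terminator, escaping any payload
    // bytes that collide with the terminator or escape markers.
    class TerminatedBytesSketch {
        private static final int TERMINATOR = 0x00;
        private static final int ESCAPE     = 0x01;

        static byte[] writeTerminated(byte[] payload) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (byte b : payload) {
                int v = b & 0xFF;
                if (v == TERMINATOR || v == ESCAPE) {
                    out.write(ESCAPE);     // marker collision: escape it
                    out.write(v + 1);      // 0x00 -> 0x01, 0x01 -> 0x02
                } else {
                    out.write(v);
                }
            }
            out.write(TERMINATOR);         // unambiguous end of the run
            return out.toByteArray();
        }
    }

The shift keeps escaped bytes ordered the same way as the raw bytes, so byte-wise comparison of the encoded form still agrees with comparison of the payloads.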

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-1.patch.txt, HIVE-2693-all.patch, HIVE-2693-fix.patch, 
 HIVE-2693.patch, HIVE-2693-take3.patch, HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3792) hive pom file has missing conf and scope mapping for compile configuration.

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3792:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and 0.10. Thanks, Ashish!

 hive pom file has missing conf and scope mapping for compile configuration. 
 

 Key: HIVE-3792
 URL: https://issues.apache.org/jira/browse/HIVE-3792
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.10.0
Reporter: Ashish Singh
Assignee: Ashish Singh
 Fix For: 0.10.0

 Attachments: HIVE-3792.patch


 hive-0.10.0 pom file has missing conf and scope mapping for compile 
 configuration. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-3794) Oracle upgrade script for Hive is broken

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-3794.


Resolution: Fixed

Committed to trunk and 0.10. Thanks, Deepesh!

 Oracle upgrade script for Hive is broken
 

 Key: HIVE-3794
 URL: https://issues.apache.org/jira/browse/HIVE-3794
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.10.0
 Environment: Oracle 11g r2
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
Priority: Critical
 Fix For: 0.10.0

 Attachments: HIVE-3794.patch


 As part of Hive configuration for Oracle I ran the schema creation script for 
 Oracle. Here is what I observed when ran the script:
 % sqlplus hive/hive@xe
 SQL*Plus: Release 11.2.0.2.0 Production on Mon Dec 10 18:47:11 2012
 Copyright (c) 1982, 2011, Oracle.  All rights reserved.
 Connected to:
 Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production
 SQL> @scripts/metastore/upgrade/oracle/hive-schema-0.10.0.oracle.sql;
 .
 ALTER TABLE SKEWED_STRING_LIST_VALUES ADD CONSTRAINT 
 SKEWED_STRING_LIST_VALUES_FK1 FOREIGN KEY (STRING_LIST_ID) REFERENCES 
 SKEWED_STRING_LIST (STRING_LIST_ID) INITIALLY DEFERRED
   
  *
 ERROR at line 1:
 ORA-00904: STRING_LIST_ID: invalid identifier
 .
 ALTER TABLE SKEWED_STRING_LIST_VALUES ADD CONSTRAINT 
 SKEWED_STRING_LIST_VALUES_FK1 FOREIGN KEY (STRING_LIST_ID) REFERENCES 
 SKEWED_STRING_LIST (STRING_LIST_ID) INITIALLY DEFERRED
   
  *
 ERROR at line 1:
 ORA-00904: STRING_LIST_ID: invalid identifier
 Table created.
 Table altered.
 Table altered.
 CREATE TABLE SKEWED_COL_VALUE_LOCATION_MAPPING
  *
 ERROR at line 1:
 ORA-00972: identifier is too long
 Table created.
 Table created.
 ALTER TABLE SKEWED_COL_VALUE_LOCATION_MAPPING ADD CONSTRAINT 
 SKEWED_COL_VALUE_LOCATION_MAPPING_PK PRIMARY KEY (SD_ID,STRING_LIST_ID_KID)
 *
 ERROR at line 1:
 ORA-00972: identifier is too long
 ALTER TABLE SKEWED_COL_VALUE_LOCATION_MAPPING ADD CONSTRAINT 
 SKEWED_COL_VALUE_LOCATION_MAPPING_FK1 FOREIGN KEY (STRING_LIST_ID_KID) 
 REFERENCES SKEWED_STRING_LIST (STRING_LIST_ID) INITIALLY DEFERRED
 *
 ERROR at line 1:
 ORA-00972: identifier is too long
 ALTER TABLE SKEWED_COL_VALUE_LOCATION_MAPPING ADD CONSTRAINT 
 SKEWED_COL_VALUE_LOCATION_MAPPING_FK2 FOREIGN KEY (SD_ID) REFERENCES SDS 
 (SD_ID) INITIALLY DEFERRED
 *
 ERROR at line 1:
 ORA-00972: identifier is too long
 Table created.
 Table altered.
 ALTER TABLE SKEWED_VALUES ADD CONSTRAINT SKEWED_VALUES_FK1 FOREIGN KEY 
 (STRING_LIST_ID_EID) REFERENCES SKEWED_STRING_LIST (STRING_LIST_ID) INITIALLY 
 DEFERRED
   
  *
 ERROR at line 1:
 ORA-00904: STRING_LIST_ID: invalid identifier
 Basically there are two issues here with the Oracle sql script:
 (1) Table SKEWED_STRING_LIST is created with the column SD_ID. Later the 
 script tries to reference the STRING_LIST_ID column in SKEWED_STRING_LIST, 
 which is obviously not there. Comparing the sql with that for the other flavors, 
 it seems the column should be STRING_LIST_ID.
 (2) The table name SKEWED_COL_VALUE_LOCATION_MAPPING is too long for Oracle, 
 which limits identifier names to 30 characters. Also impacted are the identifiers 
 SKEWED_COL_VALUE_LOCATION_MAPPING_PK and 
 SKEWED_COL_VALUE_LOCATION_MAPPING_FK1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3814) Cannot drop partitions on table when using Oracle metastore

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3814:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and 0.10. Thanks, Deepesh!

 Cannot drop partitions on table when using Oracle metastore
 ---

 Key: HIVE-3814
 URL: https://issues.apache.org/jira/browse/HIVE-3814
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.10.0
 Environment: Oracle 11g r2
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
Priority: Critical
 Fix For: 0.10.0

 Attachments: HIVE-3814.patch


 Create a table with a partition. Try to drop the partition or the table 
 containing the partition. Following error is seen:
 FAILED: Error in metadata: 
 MetaException(message:javax.jdo.JDODataStoreException: Error executing JDOQL 
 query SELECT 
 'org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics' AS 
 NUCLEUS_TYPE,THIS.AVG_COL_LEN,THIS.COLUMN_NAME,THIS.COLUMN_TYPE,THIS.DB_NAME,THIS.DOUBLE_HIGH_VALUE,THIS.DOUBLE_LOW_VALUE,THIS.LAST_ANALYZED,THIS.LONG_HIGH_VALUE,THIS.LONG_LOW_VALUE,THIS.MAX_COL_LEN,THIS.NUM_DISTINCTS,THIS.NUM_FALSES,THIS.NUM_NULLS,THIS.NUM_TRUES,THIS.PARTITION_NAME,THIS.TABLE_NAME,THIS.CS_ID
  FROM PART_COL_STATS THIS LEFT OUTER JOIN PARTITIONS 
 THIS_PARTITION_PARTITION_NAME ON THIS.PART_ID = 
 THIS_PARTITION_PARTITION_NAME.PART_ID WHERE 
 THIS_PARTITION_PARTITION_NAME.PART_NAME = ? AND THIS.DB_NAME = ? AND 
 THIS.TABLE_NAME = ? : ORA-00904: THIS.PARTITION_NAME: invalid 
 identifier
 The problem here is that the column PARTITION_NAME that the query refers to 
 in table PART_COL_STATS does not exist. Looking at the hive schema scripts 
 for mysql & derby, the column should indeed be PARTITION_NAME, so the Oracle 
 script is missing it. Postgres also suffers from the same problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #233

2012-12-18 Thread Apache Jenkins Server
See 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/

--
[...truncated 9916 lines...]

compile-test:
 [echo] Project: serde
[javac] Compiling 26 source files to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/artifact/hive/build/serde/test/classes
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
 [echo] Project: service
 [copy] Warning: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/service/src/test/resources
 does not exist.

init:
 [echo] Project: service

ivy-init-settings:
 [echo] Project: service

ivy-resolve:
 [echo] Project: service
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
[ivy:report] Processing 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-service-default.xml
 to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/artifact/hive/build/ivy/report/org.apache.hive-hive-service-default.html

ivy-retrieve:
 [echo] Project: service

compile:
 [echo] Project: service

ivy-resolve-test:
 [echo] Project: service

ivy-retrieve-test:
 [echo] Project: service

compile-test:
 [echo] Project: service
[javac] Compiling 2 source files to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/artifact/hive/build/service/test/classes

test:
 [echo] Project: hive

test-shims:
 [echo] Project: hive

test-conditions:
 [echo] Project: shims

gen-test:
 [echo] Project: shims

create-dirs:
 [echo] Project: shims
 [copy] Warning: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/test/resources
 does not exist.

init:
 [echo] Project: shims

ivy-init-settings:
 [echo] Project: shims

ivy-resolve:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
[ivy:report] Processing 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml
 to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/artifact/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
 [echo] Project: shims

compile:
 [echo] Project: shims
 [echo] Building shims 0.20

build_shims:
 [echo] Project: shims
 [echo] Compiling 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20/java
 against hadoop 0.20.2 
(https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/artifact/hive/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
 [echo] Project: shims
 [echo] Building shims 0.20S

build_shims:
 [echo] Project: shims
 [echo] Compiling 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/common-secure/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20S/java
 against hadoop 1.0.0 
(https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/artifact/hive/build/hadoopcore/hadoop-1.0.0)

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
 [echo] Project: shims
 [echo] Building shims 0.23

build_shims:
 [echo] Project: shims
 [echo] Compiling 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/common-secure/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.23/java
 against hadoop 0.23.3 
(https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/233/artifact/hive/build/hadoopcore/hadoop-0.23.3)


[jira] [Commented] (HIVE-3646) Add 'IGNORE PROTECTION' predicate for dropping partitions

2012-12-18 Thread Andrew Chalfant (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535017#comment-13535017
 ] 

Andrew Chalfant commented on HIVE-3646:
---

Namit Jain, done

 Add 'IGNORE PROTECTION' predicate for dropping partitions
 -

 Key: HIVE-3646
 URL: https://issues.apache.org/jira/browse/HIVE-3646
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Reporter: Andrew Chalfant
Assignee: Andrew Chalfant
Priority: Minor
 Fix For: 0.11

 Attachments: HIVE-3646.1.patch.txt, HIVE-3646.2.patch.txt, 
 HIVE-3646.3.patch.txt

   Original Estimate: 1m
  Remaining Estimate: 1m

 There are cases where it is desirable to move partitions between clusters. 
 Having to undo protection and then re-protect tables in order to delete 
 partitions from a source is a multi-step process and can leave us in a 
 failed-open state where partition and table metadata are dirty. By implementing 
 'rm -rf'-like functionality, we can perform these operations atomically.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535044#comment-13535044
 ] 

Mark Grover commented on HIVE-2693:
---

I added some test data and more tests to patch 12 and found a few more 
interesting issues:
1. I added new tests related to where clauses. The where clause doesn't seem to 
be working as expected. I will take another look to see if I am doing something 
wrong, but that's my first impression anyway.
2. I added more test data where the decimal column has values like 3.14 and 
3.140. This is an interesting case since we would like to maintain 
compatibility with MySQL where possible. If I remember correctly (from a few 
days ago when I tried it), MySQL considers 3.14 and 3.140 to be equivalent. 
Therefore, they would be considered the same in equi-joins, where clauses, etc. 
This addition of new data led me to see that order by is non-deterministic 
when done over a decimal column. Again, something we should look into more. 
FWIW, 3.14 rows are correctly being joined to 3.140 rows, so that's good!
3. I added some more test data with NULLs for the decimal column to make sure 
those were being read and handled properly when the table was being loaded.
I will submit a new patch with these added tests and data shortly.

[~hagleitn] About patch 12, do we need to have a separate sign (0) for zero? 
Would it not suffice for it to use the same sign as positive numbers? That 
would make it consistent with other datatypes as well. Been having a little 
busy morning but will review the rest of the patch shortly!

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-1.patch.txt, HIVE-2693-all.patch, HIVE-2693-fix.patch, 
 HIVE-2693.patch, HIVE-2693-take3.patch, HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3815) hive table rename fails if filesystem cache is disabled

2012-12-18 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-3815:


Attachment: HIVE-3815.1.patch

HIVE-3815.1.patch - initial patch. Will submit another one that includes test 
case.

 hive table rename fails if filesystem cache is disabled
 ---

 Key: HIVE-3815
 URL: https://issues.apache.org/jira/browse/HIVE-3815
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.10.0

 Attachments: HIVE-3815.1.patch


 If fs.<filesystem>.impl.disable.cache (e.g. fs.hdfs.impl.disable.cache) is set 
 to true, then table rename fails.
 The exception that gets thrown (though not logged!) is 
 {quote}
 Caused by: InvalidOperationException(message:table new location 
 hdfs://host1:8020/apps/hive/warehouse/t2 is on a different file system than 
 the old location hdfs://host1:8020/apps/hive/warehouse/t1. This operation is 
 not supported)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_result$alter_table_resultStandardScheme.read(ThriftHiveMetastore.java:28825)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_result$alter_table_resultStandardScheme.read(ThriftHiveMetastore.java:28811)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$alter_table_result.read(ThriftHiveMetastore.java:28753)
 at 
 org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table(ThriftHiveMetastore.java:977)
 at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table(ThriftHiveMetastore.java:962)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:208)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
 at $Proxy7.alter_table(Unknown Source)
 at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:373)
 ... 18 more
 {quote}
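
 For context, a minimal sketch of what disabling the filesystem cache changes (shown for the local scheme so it runs without a cluster; the assumption is that the rename path ends up comparing FileSystem instances, which are no longer shared once the cache is off):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class FsCacheDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // With the cache on (default), repeated lookups share one instance.
            FileSystem cachedA = FileSystem.get(new URI("file:///"), conf);
            FileSystem cachedB = FileSystem.get(new URI("file:///"), conf);
            System.out.println(cachedA == cachedB);   // true

            // With the cache disabled, every lookup builds a fresh instance.
            conf.setBoolean("fs.file.impl.disable.cache", true);
            FileSystem freshA = FileSystem.get(new URI("file:///"), conf);
            FileSystem freshB = FileSystem.get(new URI("file:///"), conf);
            System.out.println(freshA == freshB);     // false
        }
    }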

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Mark Grover (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Grover updated HIVE-2693:
--

Attachment: HIVE-2693-13.patch

Based on top of patch 12, but has more tests and some more interesting test data. 
Also takes out an extraneous file, Hive.g.orig, which shouldn't be present in 
the patch.

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3792) hive pom file has missing conf and scope mapping for compile configuration.

2012-12-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535135#comment-13535135
 ] 

Hudson commented on HIVE-3792:
--

Integrated in Hive-0.10.0-SNAPSHOT-h0.20.1 #6 (See 
[https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/6/])
HIVE-3792 : hive pom file has missing conf and scope mapping for compile 
configuration. (Ashish Singh via Ashutosh Chauhan) (Revision 1423469)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1423469
Files : 
* /hive/branches/branch-0.10/build-common.xml


 hive pom file has missing conf and scope mapping for compile configuration. 
 

 Key: HIVE-3792
 URL: https://issues.apache.org/jira/browse/HIVE-3792
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.10.0
Reporter: Ashish Singh
Assignee: Ashish Singh
 Fix For: 0.10.0

 Attachments: HIVE-3792.patch


 hive-0.10.0 pom file has missing conf and scope mapping for compile 
 configuration. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #6

2012-12-18 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/6/changes

Changes:

[hashutosh] HIVE-3792 : hive pom file has missing conf and scope mapping for 
compile configuration. (Ashish Singh via Ashutosh Chauhan)

--
[...truncated 51964 lines...]
[junit] 2012-12-18 10:35:42,279 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] Execution completed successfully
[junit] Mapred Local Task Succeeded . Convert the Join into MapJoin
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/6/artifact/hive/build/service/localscratchdir/hive_2012-12-18_10-35-38_946_2063500973527311965/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/6/artifact/hive/build/service/tmp/hive_job_log_jenkins_201212181035_152808521.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Copying file: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] Table default.testhivedrivertable stats: [num_partitions: 0, 
num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0]
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/6/artifact/hive/build/service/localscratchdir/hive_2012-12-18_10-35-43_798_2479025044602480479/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/6/artifact/hive/build/service/localscratchdir/hive_2012-12-18_10-35-43_798_2479025044602480479/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/6/artifact/hive/build/service/tmp/hive_job_log_jenkins_201212181035_836974642.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] 

[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535191#comment-13535191
 ] 

Mark Grover commented on HIVE-2693:
---

Ok, took a better look at patch 12. Looks good overall.
On second thought, radix 10 is good for now. I am not going to argue for 
unnecessary optimization unless there is evidence that we need it. 

[~hagleitn]
1. I understand the intent of the +/- 2 you are doing to the sign bit. Is that 
really necessary? Sure, you can have a sign that is the same as the terminator, but 
the terminator is not considered until you get to the variable-length part. 
What do you think?
2. Were you able to test out your serialization and deserialization code? We 
should make sure it works the way we expect. Do you need some help with testing 
that?

So, once we can fix the issues I mentioned in the previous comment (where 
clauses, deterministic order by), we should be good to review and post.
Thanks again!

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9

2012-12-18 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3789:
---

Fix Version/s: 0.9.0
   0.10.0
 Assignee: Arup Malakar
Affects Version/s: 0.10.0
 Release Note: [HIVE-3789] Added resolvePath method in ProxyFileSystem, 
so that the underlying filesystem resolvePath is not called. Fixed checkPath as 
well, since it was ignoring the schema and authority of the path being passed.
   Status: Patch Available  (was: Open)

Trash.moveToAppropriateTrash calls resolvePath, whose implementation is in the 
actual FileSystem behind ProxyFileSystem. resolvePath checks whether the path being 
moved belongs to that filesystem or not. This check fails since it sees the 
proxy scheme (pfile) in the path instead of its own scheme (file). 
Overriding resolvePath to call the checkPath in ProxyFileSystem fixed the 
problem.

Also, the old implementation of checkPath was incorrect, as it throws away the 
scheme/authority being passed before calling super. It should check whether they 
match the proxy scheme/authority.

The problem here was that ProxyFileSystem contains the FileSystem as a class 
member and does not extend it. Because of this, if a method in 
FileSystem calls another method on itself, the method in FileSystem gets called, 
not the overridden method in ProxyFileSystem. In this case resolvePath internally 
calls checkPath(), but the checkPath() of RawFileSystem gets called instead of 
the overridden checkPath() in ProxyFileSystem. 
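
A stripped-down illustration of that dispatch problem and of why overriding resolvePath fixes it (hypothetical classes, not the actual Hadoop/Hive code):

    class BaseFs {
        void checkPath(String p)   { System.out.println("base checkPath: " + p); }
        void resolvePath(String p) { checkPath(p); }   // virtual call, resolved on 'this'
    }

    // The proxy overrides checkPath (it understands the pfile scheme) but used to
    // delegate resolvePath to a separate wrapped instance, so its override was
    // never reached. Overriding resolvePath routes the call back through the proxy.
    class ProxyFs extends BaseFs {
        private final BaseFs wrapped = new BaseFs();

        @Override void checkPath(String p) { System.out.println("proxy checkPath: " + p); }

        void resolveDelegating(String p) { wrapped.resolvePath(p); }  // base checkPath runs
        void resolveOverridden(String p) { checkPath(p); }            // proxy checkPath runs

        public static void main(String[] args) {
            ProxyFs fs = new ProxyFs();
            fs.resolveDelegating("pfile:/tmp/t1");   // prints "base checkPath: pfile:/tmp/t1"
            fs.resolveOverridden("pfile:/tmp/t1");   // prints "proxy checkPath: pfile:/tmp/t1"
        }
    }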

 Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
 

 Key: HIVE-3789
 URL: https://issues.apache.org/jira/browse/HIVE-3789
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Tests
Affects Versions: 0.9.0, 0.10.0
 Environment: Hadoop 0.23.5, JDK 1.6.0_31
Reporter: Chris Drome
Assignee: Arup Malakar
 Fix For: 0.10.0, 0.9.0


 Rolling back to before this patch shows that the unit tests are passing, 
 after the patch, the majority of the unit tests are failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9

2012-12-18 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3789:
---

Attachment: HIVE-3789.trunk.1.patch
HIVE-3789.branch-0.9_1.patch

Trunk review: https://reviews.facebook.net/D7467

Branch-0.9 review: https://reviews.facebook.net/D7473

 Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
 

 Key: HIVE-3789
 URL: https://issues.apache.org/jira/browse/HIVE-3789
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Tests
Affects Versions: 0.9.0, 0.10.0
 Environment: Hadoop 0.23.5, JDK 1.6.0_31
Reporter: Chris Drome
Assignee: Arup Malakar
 Fix For: 0.9.0, 0.10.0

 Attachments: HIVE-3789.branch-0.9_1.patch, HIVE-3789.trunk.1.patch


 Rolling back to before this patch shows that the unit tests are passing, 
 after the patch, the majority of the unit tests are failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535252#comment-13535252
 ] 

Gunther Hagleitner commented on HIVE-2693:
--

Mark, thanks for the additional tests, I'll take a closer look this afternoon.

To answer the questions:

1: I introduced +/-2 when I was at a point in the debugging stage where 
paranoia took over. I can remove that, it'll make the code more readable.
2: It is only implicitly tested in all the queries that use a reduce stage. I 
agree that a test of just that code would be good. Is there a place in the 
current unit tests that already does that/that I could use as a model?

Sign bit: I introduced a value for zero to avoid a factor of negative 
infinity. If you lump 0 into either the positive or negative bucket it would 
become the number that has an infinite number of zeros before the first 
non-zero digit (after the decimal point). MIN_INT might have been an option, 
but it seems cleaner to just make the sign have three states (-1,0,1). 
The BigDecimal class in Java itself, for instance, more or less arbitrarily defines the 
precision of 0 (i.e. the number of unscaled digits) as 1. 

Non-deterministic order: 3.14 and 3.140 are indeed equal. Their representation 
should be exactly the same (1, 1, 314). Given that, I'm not sure how to 
enforce a deterministic order or even what that would be. Are you suggesting 
3.14 should always appear before 3.140?

I am worried about your comments about the where clause. I'll take a look at 
the tests. But you say it's not working right?


 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3817) Adding the name space for the maven task for the maven-publish target.

2012-12-18 Thread Ashish Singh (JIRA)
Ashish Singh created HIVE-3817:
--

 Summary: Adding the name space for the maven task for the 
maven-publish target.
 Key: HIVE-3817
 URL: https://issues.apache.org/jira/browse/HIVE-3817
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.10.0
Reporter: Ashish Singh
Assignee: Ashish Singh
 Fix For: 0.10.0




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3817) Adding the name space for the maven task for the maven-publish target.

2012-12-18 Thread Ashish Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singh updated HIVE-3817:
---

Description: 
maven task for the maven-publish target is missing from the build.xml.
This is causing maven deploy issues.

 Adding the name space for the maven task for the maven-publish target.
 --

 Key: HIVE-3817
 URL: https://issues.apache.org/jira/browse/HIVE-3817
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.10.0
Reporter: Ashish Singh
Assignee: Ashish Singh
 Fix For: 0.10.0


 maven task for the maven-publish target is missing from the build.xml.
 This is causing maven deploy issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3817) Adding the name space for the maven task for the maven-publish target.

2012-12-18 Thread Ashish Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singh updated HIVE-3817:
---

Attachment: HIVE-3817.patch

Adding the patch; it works with hive trunk and the hive-0.10.0 branch.


 Adding the name space for the maven task for the maven-publish target.
 --

 Key: HIVE-3817
 URL: https://issues.apache.org/jira/browse/HIVE-3817
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.10.0
Reporter: Ashish Singh
Assignee: Ashish Singh
 Fix For: 0.10.0

 Attachments: HIVE-3817.patch


 maven task for the maven-publish target is missing from the build.xml.
 This is causing maven deploy issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3817) Adding the name space for the maven task for the maven-publish target.

2012-12-18 Thread Ashish Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singh updated HIVE-3817:
---

Status: Patch Available  (was: Open)

Patch is available for trunk and 0.10.0 branch.

 Adding the name space for the maven task for the maven-publish target.
 --

 Key: HIVE-3817
 URL: https://issues.apache.org/jira/browse/HIVE-3817
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.10.0
Reporter: Ashish Singh
Assignee: Ashish Singh
 Fix For: 0.10.0

 Attachments: HIVE-3817.patch


 maven task for the maven-publish target is missing from the build.xml.
 This is causing maven deploy issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [DISCUSS] HCatalog becoming a subproject of Hive

2012-12-18 Thread Alan Gates
Carl, speaking just for myself and not as a representative of the HCat PPMC at 
this point, I am coming to agree with you that HCat integrating with Hive fully 
makes more sense.  

However, this makes the committer question even thornier.  Travis and Namit, I 
think the shepherd proposal needs to lay out a clear and time-bounded path to 
committership for HCat committers.  Having HCat committers as second-class Hive 
citizens for the long run will not be healthy.  I propose the following as a 
starting point for discussion:

All active HCat committers (those who have contributed or committed a patch in 
the last 6 months) will be made committers on only the HCat portion of Hive.  
In addition, those committers will be assigned a particular shepherd who is a 
current Hive committer and who will be responsible for mentoring them towards 
full Hive committership.  As part of this mentorship the HCat committer will 
review patches of other contributors, contribute patches to Hive (both inside 
and outside of HCatalog), respond to user issues on the mailing lists, etc.  It 
is intended that as a result of this mentorship program HCat committers can 
become full Hive committers in 6-9 months.  No new HCat-only committers will be 
elected in Hive after this.  All Hive committers will automatically also have 
commit rights on HCatalog.

Alan.

On Dec 14, 2012, at 10:05 AM, Carl Steinbach wrote:

 On a functional level I don't think there is going to be much of a
 difference between the subproject option proposed by Travis and the other
 option where HCatalog becomes a TLP. In both cases HCatalog and Hive will
 have separate committers, separate code repositories, separate release
 cycles, and separate project roadmaps. Aside from ASF bureaucracy, I think
 the only major difference between the two options is that the subproject
 route will give the rest of the community the false impression that the two
 projects have coordinated roadmaps and a process to prevent overlapping
 functionality from appearing in both projects. Consequently, if these are
 the only two options, then I would prefer that HCatalog become a TLP.
 
 On the other hand, I also agree with many of the sentiments that have
 already been expressed in this thread, namely that the two projects are
 closely related and that it would benefit the community at large if the two
 projects could be brought closer together. Up to this point the major
 source of pain for the HCatalog team has been the frequent necessity of
 making changes on both the Hive and HCatalog sides when implementing new
 features in HCatalog. This situation is compounded by the ASF requirement
 that release artifacts may not depend on snapshot artifacts from other ASF
 projects. Furthermore, if Hive adds a dependency on HCatalog then it will
 be subject to these same problems (in addition to the gross circular
 dependency!).
 
 I think the best way to avoid these problems is for HCatalog to become a
 Hive submodule. In this scenario HCatalog would exist as a subdirectory in
 the Hive repository and would be distributed as a Hive artifact in future
 Hive releases. In addition to solving the problems I mentioned earlier, I
 think this would also help to assuage the concerns of many Hive committers
 who don't want to see the MetaStore split out into a separate project.
 
 Thanks.
 
 Carl
 
 On Thu, Dec 13, 2012 at 7:59 PM, Namit Jain nj...@fb.com wrote:
 
 I am fine with this. Any hive committers who wants to volunteer to be
 a hcat shepherd is welcome.
 
 
 
 On 12/14/12 7:01 AM, Travis Crawford traviscrawf...@gmail.com wrote:
 
 Thanks for reviving this thread. Reviewing the comments, everyone seems
 to agree that HCatalog makes sense as a Hive subproject. I think that's
 great news for the Hadoop community.
 
 The discussion seems to have turned to one of committer permissions. I
 agree with the Hive folks' sentiment that it's something that must be
 earned. That said, I've found it challenging at times to get patches
 into Hive that would help earn the responsibility of being a hive
 committer.
 
 Proposal: if a couple of hive committers can volunteer to be hcat
 shepherds, we can work with the shepherds when making hive changes in
 a timely manner. Conversely, we can help shepherd any hive committers
 who are interested in working more with hcat. There are certainly
 benefits to cross-committership, and this approach could help each
 group build a history of meaningful contributions and earn the
 privilege & responsibility of being committers.
 
 Thoughts?
 
 --travis
 
 
 
 On Thu, Dec 13, 2012 at 11:59 AM, Edward Capriolo edlinuxg...@gmail.com
 wrote:
 I was initially hesitant about hcatalog, mostly because I imagined we
 would end up in a spot very similar to this.
 
 Namely, the hcatalog folks are interested in making a metastore to support
 pig, hive, and map reduce. However, I get the impression that many in
 hive do not care much to have a metastore that caters to everyone. Their
 needs are 

[jira] [Updated] (HIVE-3731) Ant target to create a Debian package

2012-12-18 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3731:
--

Attachment: D6879.2.patch

mbautin updated the revision [jira] [HIVE-3731] Ant target to create a Debian 
package.
Reviewers: njain, heyongqiang, raghotham, cwsteinbach, ashutoshc, JIRA, zshao, 
nzhang, jsichi, pauly, amareshwarisr

  Installing Hive into /usr/share/hive instead of /usr/share/hive-${version} in 
the Debian package.

REVISION DETAIL
  https://reviews.facebook.net/D6879

AFFECTED FILES
  build.xml
  deb/control
  ivy.xml
  ivy/libraries.properties

To: njain, heyongqiang, raghotham, cwsteinbach, ashutoshc, JIRA, zshao, nzhang, 
jsichi, pauly, amareshwarisr, mbautin


 Ant target to create a Debian package
 -

 Key: HIVE-3731
 URL: https://issues.apache.org/jira/browse/HIVE-3731
 Project: Hive
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D6879.1.patch, D6879.2.patch


 We need an Ant target to generate a Debian package with Hive binary 
 distribution.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3818) Identify VOID datatype in generated table and notify the user accordingly

2012-12-18 Thread Gary Colman (JIRA)
Gary Colman created HIVE-3818:
-

 Summary: Identify VOID datatype in generated table and notify the 
user accordingly
 Key: HIVE-3818
 URL: https://issues.apache.org/jira/browse/HIVE-3818
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Gary Colman
Priority: Trivial


When using rcfile as a datastore, generating a table from a select statement 
with a null field results in a void data type, and throws an exception 
(Internal error: no LazyObject for VOID).

eg.
  set hive.default.fileformat=RCFILE;
  CREATE TABLE test_table AS SELECT NULL, key FROM src;




Make the message to the user a little more intuitive.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3731) Ant target to create a Debian package

2012-12-18 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3731:
--

Attachment: D6879.3.patch

mbautin updated the revision [jira] [HIVE-3731] Ant target to create a Debian 
package.
Reviewers: njain, heyongqiang, raghotham, cwsteinbach, ashutoshc, JIRA, zshao, 
nzhang, jsichi, pauly, amareshwarisr

  Setting the .deb package name to be hive-bin_version_all, where "all" denotes 
the architecture.

REVISION DETAIL
  https://reviews.facebook.net/D6879

AFFECTED FILES
  build.xml
  deb/control
  ivy.xml
  ivy/libraries.properties

To: njain, heyongqiang, raghotham, cwsteinbach, ashutoshc, JIRA, zshao, nzhang, 
jsichi, pauly, amareshwarisr, mbautin


 Ant target to create a Debian package
 -

 Key: HIVE-3731
 URL: https://issues.apache.org/jira/browse/HIVE-3731
 Project: Hive
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D6879.1.patch, D6879.2.patch, D6879.3.patch


 We need an Ant target to generate a Debian package with Hive binary 
 distribution.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3819) Creating a table on Hive without Hadoop daemons running returns a misleading error

2012-12-18 Thread Mark Grover (JIRA)
Mark Grover created HIVE-3819:
-

 Summary: Creating a table on Hive without Hadoop daemons running 
returns a misleading error
 Key: HIVE-3819
 URL: https://issues.apache.org/jira/browse/HIVE-3819
 Project: Hive
  Issue Type: Bug
  Components: CLI, Metastore
Reporter: Mark Grover


I was running Hive without the underlying Hadoop daemons running. Hadoop was 
configured to run in pseudo-distributed mode. However, when I tried to create a 
hive table, I got this rather misleading error:
{code}
FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask
{code}

We should look into making this error message less misleading (more about 
hadoop daemons not running instead of metastore client not being instantiable).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535407#comment-13535407
 ] 

Mark Grover commented on HIVE-2693:
---

1. Yeah, let's take the +/- 2 out please.
2. I am not aware of any such unit tests off the top of my head. I can try to 
poke around the code and see if I find something; I'll post if I do. It 
probably wouldn't be until tomorrow.

I didn't realize that about the sign bit. Yeah, (-1,0,1) is good then.

Non-deterministic order: You are right that they are equal (and they should 
be). However, if you diff patch12 and patch13, you will find that the order in 
which ORDER BY displayed 1.0 and 1 got switched. The only thing that changed 
was me adding some data. All I would expect is for the order to remain 
consistent and deterministic, which doesn't seem to be the case presently.

One thing that did stand out was in serialize()
{code}
BigDecimal dec = boi.getPrimitiveJavaObject(o).stripTrailingZeros();
{code}
stripTrailingZeros() seems interesting. I am just handwaving right now and need 
to look more before I can assert further, but could this be part of the problem?
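
For reference, here is a quick stand-alone check (plain java.math, nothing
Hive-specific) of what stripTrailingZeros() does and does not change:

{code}
import java.math.BigDecimal;

public class StripZerosCheck {
  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("3.14");
    BigDecimal b = new BigDecimal("3.140");

    System.out.println(a.compareTo(b));                    // 0     -> numerically equal
    System.out.println(a.equals(b));                       // false -> equals() is scale-sensitive
    System.out.println(b.stripTrailingZeros());            // 3.14  -> same value, normalized scale
    System.out.println(a.equals(b.stripTrailingZeros()));  // true
  }
}
{code}

So on its own it only normalizes the scale before serialization; it does not
change the value that gets written.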

Yeah, the where clause is not working. The tests didn't give the expected 
output, so I also tested using the Hive CLI (which I built after applying the 
patch), and it doesn't work there either. You're welcome to take a look; I 
will try to find some time tonight or tomorrow morning to look into this as 
well.
These two problems might be related, but look at this:
{code}
hive> select cast(3.14 as decimal) from decimal_3 limit 1;
3.140124344978758017532527446746826171875
{code}
That doesn't look right to me :-)

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3401) Diversify grammar for split sampling

2012-12-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535414#comment-13535414
 ] 

Hudson commented on HIVE-3401:
--

Integrated in Hive-trunk-h0.21 #1863 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1863/])
HIVE-3787 Regression introduced from HIVE-3401
(Navis via namit) (Revision 1423289)

 Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1423289
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientnegative/split_sample_wrong_format2.q
* /hive/trunk/ql/src/test/queries/clientpositive/split_sample.q
* /hive/trunk/ql/src/test/results/clientnegative/split_sample_out_of_range.q.out
* /hive/trunk/ql/src/test/results/clientnegative/split_sample_wrong_format.q.out
* 
/hive/trunk/ql/src/test/results/clientnegative/split_sample_wrong_format2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/split_sample.q.out


 Diversify grammar for split sampling
 

 Key: HIVE-3401
 URL: https://issues.apache.org/jira/browse/HIVE-3401
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch, 
 HIVE-3401.D4821.4.patch, HIVE-3401.D4821.5.patch, HIVE-3401.D4821.6.patch, 
 HIVE-3401.D4821.7.patch


 Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But 
 some users want to specify just the size of the input. It can be easily 
 calculated with a few commands, but it seemed good to support more grammars, 
 something like TABLESAMPLE(500M).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3787) Regression introduced from HIVE-3401

2012-12-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535417#comment-13535417
 ] 

Hudson commented on HIVE-3787:
--

Integrated in Hive-trunk-h0.21 #1863 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1863/])
HIVE-3787 Regression introduced from HIVE-3401
(Navis via namit) (Revision 1423289)

 Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1423289
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientnegative/split_sample_wrong_format2.q
* /hive/trunk/ql/src/test/queries/clientpositive/split_sample.q
* /hive/trunk/ql/src/test/results/clientnegative/split_sample_out_of_range.q.out
* /hive/trunk/ql/src/test/results/clientnegative/split_sample_wrong_format.q.out
* 
/hive/trunk/ql/src/test/results/clientnegative/split_sample_wrong_format2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/split_sample.q.out


 Regression introduced from HIVE-3401
 

 Key: HIVE-3787
 URL: https://issues.apache.org/jira/browse/HIVE-3787
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.11

 Attachments: HIVE-3787.D7275.1.patch


 By HIVE-3562, split_sample_out_of_range.q and split_sample_wrong_format.q are 
 not showing valid 'line:loc' information for error messages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3792) hive pom file has missing conf and scope mapping for compile configuration.

2012-12-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535415#comment-13535415
 ] 

Hudson commented on HIVE-3792:
--

Integrated in Hive-trunk-h0.21 #1863 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1863/])
HIVE-3792 : hive pom file has missing conf and scope mapping for compile 
configuration. (Ashish Singh via Ashutosh Chauhan) (Revision 1423468)

 Result = SUCCESS
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1423468
Files : 
* /hive/trunk/build-common.xml


 hive pom file has missing conf and scope mapping for compile configuration. 
 

 Key: HIVE-3792
 URL: https://issues.apache.org/jira/browse/HIVE-3792
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.10.0
Reporter: Ashish Singh
Assignee: Ashish Singh
 Fix For: 0.10.0

 Attachments: HIVE-3792.patch


 hive-0.10.0 pom file has missing conf and scope mapping for compile 
 configuration. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3795) NPE in SELECT when WHERE-clause is an and/or/not operation involving null

2012-12-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535416#comment-13535416
 ] 

Hudson commented on HIVE-3795:
--

Integrated in Hive-trunk-h0.21 #1863 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1863/])
HIVE-3795 NPE in SELECT when WHERE-clause is an and/or/not operation 
involving null
(Xiao Jiang via namit) (Revision 1423285)

 Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1423285
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrExprProcFactory.java
* /hive/trunk/ql/src/test/queries/clientpositive/select_unquote_and.q
* /hive/trunk/ql/src/test/queries/clientpositive/select_unquote_not.q
* /hive/trunk/ql/src/test/queries/clientpositive/select_unquote_or.q
* /hive/trunk/ql/src/test/results/clientpositive/select_unquote_and.q.out
* /hive/trunk/ql/src/test/results/clientpositive/select_unquote_not.q.out
* /hive/trunk/ql/src/test/results/clientpositive/select_unquote_or.q.out


 NPE in SELECT when WHERE-clause is an and/or/not operation involving null
 -

 Key: HIVE-3795
 URL: https://issues.apache.org/jira/browse/HIVE-3795
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Xiao Jiang
Assignee: Xiao Jiang
Priority: Trivial
 Fix For: 0.11

 Attachments: HIVE-3795.1.patch.txt, HIVE-3795.2.patch.txt, 
 HIVE-3795.3.patch.txt, hive.3795.4.patch


 Sometimes users forget to quote date constants in queries. For example, 
 SELECT * FROM some_table WHERE ds >= 2012-12-10 and ds <= 2012-12-12; . In 
 such cases, if the WHERE-clause contains an and/or/not operation, it would 
 throw an NPE. That's because PcrExprProcFactory in ql/optimizer forgot to 
 check for null. 
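
For context on why the unquoted constants misbehave: without quotes the
"dates" are parsed as arithmetic, so the predicate ends up comparing ds with a
number, and that comparison can evaluate to NULL for non-numeric partition
values, which is how the and/or/not-over-null case arises. A trivial check, in
plain Java only to make the arithmetic concrete:

{code}
public class UnquotedDateDemo {
  public static void main(String[] args) {
    // Unquoted, 2012-12-10 is just integer subtraction, not a date literal.
    System.out.println(2012 - 12 - 10);  // 1990
  }
}
{code}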

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Hive-trunk-h0.21 - Build # 1863 - Fixed

2012-12-18 Thread Apache Jenkins Server
Changes for Build #1861
[namit] HIVE-3646 Add 'IGNORE PROTECTION' predicate for dropping partitions
(Andrew Chalfant via namit)


Changes for Build #1863
[hashutosh] HIVE-3792 : hive pom file has missing conf and scope mapping for 
compile configuration. (Ashish Singh via Ashutosh Chauhan)

[namit] HIVE-3787 Regression introduced from HIVE-3401
(Navis via namit)

[namit] HIVE-3795 NPE in SELECT when WHERE-clause is an and/or/not operation 
involving null
(Xiao Jiang via namit)




All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1863)

Status: Fixed

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1863/ to 
view the results.

Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #233

2012-12-18 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/233/



[jira] [Updated] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-12-18 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3648:
---

Labels: namenode_federation  (was: )

 HiveMetaStoreFsImpl is not compatible with hadoop viewfs
 

 Key: HIVE-3648
 URL: https://issues.apache.org/jira/browse/HIVE-3648
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.9.0, 0.10.0
Reporter: Kihwal Lee
Assignee: Arup Malakar
  Labels: namenode_federation
 Fix For: 0.11

 Attachments: HIVE_3648_branch_0.patch, HIVE_3648_branch_1.patch, 
 HIVE-3648-trunk-0.patch, HIVE_3648_trunk_1.patch, HIVE-3648-trunk-1.patch


 HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
 not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
 instead.  Please note that this method is not available in hadoop versions 
 earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3645) RCFileWriter does not implement the right function to support Federation

2012-12-18 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3645:
---

Labels: namenode_federation  (was: )

 RCFileWriter does not implement the right function to support Federation
 

 Key: HIVE-3645
 URL: https://issues.apache.org/jira/browse/HIVE-3645
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.9.0, 0.10.0
 Environment: Hadoop 0.23.3 federation, Hive 0.9 and Pig 0.10
Reporter: Viraj Bhat
Assignee: Arup Malakar
  Labels: namenode_federation
 Fix For: 0.11

 Attachments: HIVE_3645_branch_0.patch, HIVE_3645_trunk_0.patch


 Create a table using Hive DDL
 {code}
 CREATE TABLE tmp_hcat_federated_numbers_part_1 (
   id   int,  
   intnum   int,
   floatnum float
 )partitioned by (
   part1 string,
   part2 string
 )
 STORED AS rcfile
 LOCATION 'viewfs:///database/tmp_hcat_federated_numbers_part_1';
 {code}
 Populate it using Pig:
 {code}
 A = load 'default.numbers_pig' using org.apache.hcatalog.pig.HCatLoader();
 B = filter A by id =  500;
 C = foreach B generate (int)id, (int)intnum, (float)floatnum;
 store C into
 'default.tmp_hcat_federated_numbers_part_1'
 using org.apache.hcatalog.pig.HCatStorer
('part1=pig, part2=hcat_pig_insert',
 'id: int,intnum: int,floatnum: float');
 {code}
 Generates the following error when running on a Federated Cluster:
 {quote}
 2012-10-29 20:40:25,011 [main] ERROR
 org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate
 exception from backed error: AttemptID:attempt_1348522594824_0846_m_00_3
 Info:Error: org.apache.hadoop.fs.viewfs.NotInMountpointException:
 getDefaultReplication on empty path is invalid
 at
 org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:479)
 at org.apache.hadoop.hive.ql.io.RCFile$Writer.init(RCFile.java:723)
 at org.apache.hadoop.hive.ql.io.RCFile$Writer.init(RCFile.java:705)
 at
 org.apache.hadoop.hive.ql.io.RCFileOutputFormat.getRecordWriter(RCFileOutputFormat.java:86)
 at
 org.apache.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:100)
 at
 org.apache.hcatalog.mapreduce.HCatOutputFormat.getRecordWriter(HCatOutputFormat.java:228)
 at
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
 at
 org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.init(MapTask.java:587)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:706)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
 {quote}
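
The trace bottoms out in RCFile$Writer asking ViewFileSystem for a default
replication with no path, which viewfs cannot answer because different mount
points may map to different filesystems. A minimal sketch of the path-aware
call that federation-safe code would need (illustrative only, not the actual
HIVE-3645 patch; assumes Hadoop 0.23+, where FileSystem.getDefaultReplication(Path)
exists):

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ViewFsDefaultsDemo {
  // On viewfs there is no single default replication for the whole namespace,
  // so the file being written must be passed in and resolved via the mount table.
  static short replicationFor(Configuration conf, Path file) throws IOException {
    FileSystem fs = file.getFileSystem(conf);
    return fs.getDefaultReplication(file);  // path-aware overload, viewfs-safe
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical file under the table location from the report above.
    Path p = new Path("viewfs:///database/tmp_hcat_federated_numbers_part_1/part-00000");
    System.out.println(replicationFor(new Configuration(), p));
  }
}
{code}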

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535453#comment-13535453
 ] 

Gunther Hagleitner commented on HIVE-2693:
--

Ordering: stripTrailingZeros() does not change the value of the BigDecimal and 
therefore doesn't change the serialized output. I don't think this is the 
culprit. The order of the output depends on what hadoop does in the 
shuffle/sort phase. My guess is that since you changed the input, the results 
that mapred came up with were different. Hadoop doesn't guarantee a stable sort 
afaik.

Where clause: Yes, it's one and the same problem. We end up comparing 3.14 with 
3.14...1243... Still debugging, but it seems we're going through double on 
the way to BigDecimal, which changes the number.

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3752) Add a non-sql API in hive to access data.

2012-12-18 Thread Nitay Joffe (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535475#comment-13535475
 ] 

Nitay Joffe commented on HIVE-3752:
---

[~namitjain] I created an account (name: nitay). Can you give me edit 
permissions?

 Add a non-sql API in hive to access data.
 -

 Key: HIVE-3752
 URL: https://issues.apache.org/jira/browse/HIVE-3752
 Project: Hive
  Issue Type: Improvement
Reporter: Nitay Joffe
Assignee: Nitay Joffe

 We would like to add an input/output format for accessing Hive data in Hadoop 
 directly without having to use e.g. a transform. Using a transform
 means having to do a whole map-reduce step with its own disk accesses and its 
 imposed structure. It also means needing to have Hive be the base 
 infrastructure for the entire system being developed which is not the right 
 fit as we only need a small part of it (access to the data).
 So we propose adding an API level InputFormat and OutputFormat to Hive that 
 will make it trivially easy to select a table with partition spec and read 
 from / write to it. We chose this design to make it compatible with Hadoop so 
 that existing systems that work with Hadoop's IO API will just work out of 
 the box.
 We need this system for the Giraph graph processing system 
 (http://giraph.apache.org/) as running graph jobs which read/write from Hive 
 is a common use case.
 [~namitjain] [~aching] [~kevinwilfong] [~apresta]
 Input-side (HiveApiInputFormat) review: https://reviews.facebook.net/D7401

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3796) Multi-insert involving bucketed/sorted table turns off merging on all outputs

2012-12-18 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535508#comment-13535508
 ] 

Kevin Wilfong commented on HIVE-3796:
-

The full test run passed.

 Multi-insert involving bucketed/sorted table turns off merging on all outputs
 -

 Key: HIVE-3796
 URL: https://issues.apache.org/jira/browse/HIVE-3796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3796.1.patch.txt, HIVE-3796.2.patch.txt, 
 HIVE-3796.3.patch.txt, HIVE-3796.4.patch.txt


 When a multi-insert query has at least one output that is bucketed, merging 
 is turned off for all outputs, rather than just the bucketed ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535519#comment-13535519
 ] 

Gunther Hagleitner commented on HIVE-2693:
--

Ok. I think I've convinced myself that the cast statement above is executed 
correctly, but the result is definitely not very intuitive.

When you say cast(3.14 as decimal), you ask hive to convert the double 3.14 
to a BigDecimal in an unlimited context, i.e. no rounding. Thus 3.141234... is 
correct. If you ask for cast('3.14' as decimal) you get 3.14, which is what you 
want. Rounding would be another option.
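
To make the unlimited-context point concrete, here is a small stand-alone
comparison (plain java.math; Hive's actual cast code path may differ in
detail):

{code}
import java.math.BigDecimal;
import java.math.MathContext;

public class CastSemanticsDemo {
  public static void main(String[] args) {
    // cast(3.14 as decimal): the double's exact binary expansion, no rounding.
    System.out.println(new BigDecimal(3.14));
    // prints something like 3.1400000000000001243449787580175...

    // cast('3.14' as decimal): the string is already exact.
    System.out.println(new BigDecimal("3.14"));  // 3.14

    // "Rounding would be another option": cap the precision instead.
    System.out.println(new BigDecimal(3.14, MathContext.DECIMAL64));  // 3.140000000000000
  }
}
{code}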

There was a similar issue with big integer: HIVE-2509. They 'solved' the 
problem by at least having a shorthand for big int literals (i.e.: 0L). 
BigDecimal should at the very least allow something like that too. E.g.: 3.14D 
or 3.14BD.

So, should we just introduce a literal shorthand and leave the behavior as is?

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535523#comment-13535523
 ] 

Gunther Hagleitner commented on HIVE-2693:
--

One more thing. It would be great to stop auto-converting double to decimal. 
WHERE decimal_column = 3.14 should preferably fail in the semantic analysis, 
so that any user would know that they have to cast from a string or use a 
shorthand.

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535526#comment-13535526
 ] 

Mark Grover commented on HIVE-2693:
---

Fair enough. I would say let's not worry about the literal in this JIRA. We can 
create another JIRA and deal with it separately. I personally don't think it's 
a big deal, because 3.14, had it been read from a file in HDFS (when doing, 
say, LOAD DATA INPATH), would have been correctly interpreted as a BigDecimal.

Good point about the hadoop shuffle/sort not being stable; I was hoping it was, 
but I can see why it's not.

So, that leaves us with a hurdle and a half:
1: where clause
0.5: rigorous testing of serialization and deserialization code for Decimal 
type.

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535532#comment-13535532
 ] 

Mark Grover commented on HIVE-2693:
---

That actually explains why where clauses aren't working: we are comparing a 
BigDecimal with a Double.
I tried the same where clause but with a string, and that worked:
{code}
SELECT * FROM DECIMAL_3 WHERE key='3.14';
3.14    3
3.14    3
3.14    3
3.140   4
{code}

So, here is the million-dollar question. Our users (understandably so) are 
going to forget about the quotes. Even if we introduce a new literal as part 
of this JIRA, they are going to forget about the literal. I can think of two 
options:
1. Somehow promote the double 3.14 to be a BigDecimal 3.14 (maintaining 
precision). Maybe via a double -> string -> BigDecimal conversion (sketched 
below)?
2. Throw an error like you suggested.

What do you think [~hagleitn]?
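
A sketch of what option 1 would amount to at the JDK level (hedged: this only
shows the Double.toString route, not a worked-out Hive change):

{code}
import java.math.BigDecimal;

public class PromoteDoubleDemo {
  public static void main(String[] args) {
    double literal = 3.14;

    System.out.println(new BigDecimal(literal));
    // binary expansion, roughly 3.1400000000000001243... (the surprising value)

    System.out.println(new BigDecimal(Double.toString(literal)));  // 3.14
    System.out.println(BigDecimal.valueOf(literal));               // 3.14 (same shortcut)
  }
}
{code}

The caveat raised elsewhere in the thread still applies: if the literal has
already been widened to double somewhere else in the plan, the lost digits
cannot be recovered afterwards.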

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3820) Consider creating a literal like D or BD for representing Decimal type constants

2012-12-18 Thread Mark Grover (JIRA)
Mark Grover created HIVE-3820:
-

 Summary: Consider creating a literal like D or BD for 
representing Decimal type constants
 Key: HIVE-3820
 URL: https://issues.apache.org/jira/browse/HIVE-3820
 Project: Hive
  Issue Type: Bug
Reporter: Mark Grover


When HIVE-2693 gets committed, users are going to see this behavior:
{code}
hive> select cast(3.14 as decimal) from decimal_3 limit 1;
3.140124344978758017532527446746826171875
{code}

That looks intuitively incorrect, but it happens because 3.14 (a double) is 
converted to BigDecimal, which introduces a precision mismatch.

We should consider creating a new literal for expressing constants of Decimal 
type, as Gunther suggested in HIVE-2693.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3633) sort-merge join does not work with sub-queries

2012-12-18 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3633:


   Resolution: Fixed
Fix Version/s: 0.11
   Status: Resolved  (was: Patch Available)

Committed, thanks Namit.

 sort-merge join does not work with sub-queries
 --

 Key: HIVE-3633
 URL: https://issues.apache.org/jira/browse/HIVE-3633
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.11

 Attachments: hive.3633.10.patch, hive.3633.11.patch, 
 hive.3633.1.patch, hive.3633.2.patch, hive.3633.3.patch, hive.3633.4.patch, 
 hive.3633.5.patch, hive.3633.6.patch, hive.3633.7.patch, hive.3633.8.patch, 
 hive.3633.9.patch


 Consider the following query:
 create table smb_bucket_1(key int, value string) CLUSTERED BY (key) SORTED BY 
 (key) INTO 6 BUCKETS STORED AS TEXTFILE;
 create table smb_bucket_2(key int, value string) CLUSTERED BY (key) SORTED BY 
 (key) INTO 6 BUCKETS STORED AS TEXTFILE;
 -- load the above tables
 set hive.optimize.bucketmapjoin = true;
 set hive.optimize.bucketmapjoin.sortedmerge = true;
 set hive.input.format = 
 org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
 explain
 select count(*) from
 (
 select /*+mapjoin(a)*/ a.key as key1, b.key as key2, a.value as value1, 
 b.value as value2
 from smb_bucket_1 a join smb_bucket_2 b on a.key = b.key)
 subq;
 The above query does not use sort-merge join. This would be very useful as we 
 automatically convert the queries to use sorting and bucketing properties for 
 join.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-12-18 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535548#comment-13535548
 ] 

Gunther Hagleitner commented on HIVE-2693:
--

Solution Number 1 would be nice, but I don't think we'll be able to do that. 
Any time we convert to double in an intermediate step we lose precision - and I 
don't think it can be recovered by using string or anything else afterwards. 
That seems to be just in the nature of floating point operations. We could 
round to fewer digits in the conversion for instance, but that will probably 
just cause issues elsewhere.

Not converting to double also seems like a bad choice. That would change some 
very fundamental behavior in hive. All of a sudden, select 3.14 * 2 ... would 
return a BigDecimal instead of a double. Not good.

So, I am in favor of throwing an error. At least we'll fail early and won't 
send users on a wild goose chase. Although it's not ideal either: the where 
clause issue wouldn't have happened, for instance. But we probably don't want to 
bar explicit casts (cast(3.14 as decimal)), even though they probably won't 
do what anyone thinks they do. Or do we want to disallow them?


 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.10.0
Reporter: Carl Steinbach
Assignee: Prasad Mujumdar
 Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, 
 HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, 
 HIVE-2693-13.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, 
 HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, 
 HIVE-2693-take4.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3728) make optimizing multi-group by configurable

2012-12-18 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535585#comment-13535585
 ] 

Kevin Wilfong commented on HIVE-3728:
-

+1

 make optimizing multi-group by configurable
 ---

 Key: HIVE-3728
 URL: https://issues.apache.org/jira/browse/HIVE-3728
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3728.2.patch, hive.3728.3.patch


 This was done as part of https://issues.apache.org/jira/browse/HIVE-609.
 This should be configurable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3794) Oracle upgrade script for Hive is broken

2012-12-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535612#comment-13535612
 ] 

Hudson commented on HIVE-3794:
--

Integrated in Hive-0.10.0-SNAPSHOT-h0.20.1 #7 (See 
[https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/7/])
HIVE-3794 : Oracle upgrade script for Hive is broken (Deepesh Khandelwal 
via Ashutosh Chauhan) (Revision 1423482)

 Result = ABORTED
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1423482
Files : 
* 
/hive/branches/branch-0.10/metastore/scripts/upgrade/derby/010-HIVE-3072.derby.sql
* 
/hive/branches/branch-0.10/metastore/scripts/upgrade/derby/hive-schema-0.10.0.derby.sql
* 
/hive/branches/branch-0.10/metastore/scripts/upgrade/mysql/010-HIVE-3072.mysql.sql
* 
/hive/branches/branch-0.10/metastore/scripts/upgrade/mysql/hive-schema-0.10.0.mysql.sql
* 
/hive/branches/branch-0.10/metastore/scripts/upgrade/oracle/010-HIVE-3072.oracle.sql
* 
/hive/branches/branch-0.10/metastore/scripts/upgrade/oracle/hive-schema-0.10.0.oracle.sql
* 
/hive/branches/branch-0.10/metastore/scripts/upgrade/postgres/010-HIVE-3072.postgres.sql
* 
/hive/branches/branch-0.10/metastore/scripts/upgrade/postgres/hive-schema-0.10.0.postgres.sql
* /hive/branches/branch-0.10/metastore/src/model/package.jdo


 Oracle upgrade script for Hive is broken
 

 Key: HIVE-3794
 URL: https://issues.apache.org/jira/browse/HIVE-3794
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.10.0
 Environment: Oracle 11g r2
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
Priority: Critical
 Fix For: 0.10.0

 Attachments: HIVE-3794.patch


 As part of Hive configuration for Oracle I ran the schema creation script for 
 Oracle. Here is what I observed when I ran the script:
 % sqlplus hive/hive@xe
 SQL*Plus: Release 11.2.0.2.0 Production on Mon Dec 10 18:47:11 2012
 Copyright (c) 1982, 2011, Oracle.  All rights reserved.
 Connected to:
 Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production
 SQL> @scripts/metastore/upgrade/oracle/hive-schema-0.10.0.oracle.sql;
 .
 ALTER TABLE SKEWED_STRING_LIST_VALUES ADD CONSTRAINT 
 SKEWED_STRING_LIST_VALUES_FK1 FOREIGN KEY (STRING_LIST_ID) REFERENCES 
 SKEWED_STRING_LIST (STRING_LIST_ID) INITIALLY DEFERRED
   
  *
 ERROR at line 1:
 {color:red}ORA-00904: STRING_LIST_ID: invalid identifier{color}
 .
 ALTER TABLE SKEWED_STRING_LIST_VALUES ADD CONSTRAINT 
 SKEWED_STRING_LIST_VALUES_FK1 FOREIGN KEY (STRING_LIST_ID) REFERENCES 
 SKEWED_STRING_LIST (STRING_LIST_ID) INITIALLY DEFERRED
   
  *
 ERROR at line 1:
 {color:red}ORA-00904: STRING_LIST_ID: invalid identifier{color}
 Table created.
 Table altered.
 Table altered.
 CREATE TABLE SKEWED_COL_VALUE_LOCATION_MAPPING
  *
 ERROR at line 1:
 {color:red}ORA-00972: identifier is too long{color}
 Table created.
 Table created.
 ALTER TABLE SKEWED_COL_VALUE_LOCATION_MAPPING ADD CONSTRAINT 
 SKEWED_COL_VALUE_LOCATION_MAPPING_PK PRIMARY KEY (SD_ID,STRING_LIST_ID_KID)
 *
 ERROR at line 1:
 {color:red}ORA-00972: identifier is too long{color}
 ALTER TABLE SKEWED_COL_VALUE_LOCATION_MAPPING ADD CONSTRAINT 
 SKEWED_COL_VALUE_LOCATION_MAPPING_FK1 FOREIGN KEY (STRING_LIST_ID_KID) 
 REFERENCES SKEWED_STRING_LIST (STRING_LIST_ID) INITIALLY DEFERRED
 *
 ERROR at line 1:
 {color:red}ORA-00972: identifier is too long{color}
 ALTER TABLE SKEWED_COL_VALUE_LOCATION_MAPPING ADD CONSTRAINT 
 SKEWED_COL_VALUE_LOCATION_MAPPING_FK2 FOREIGN KEY (SD_ID) REFERENCES SDS 
 (SD_ID) INITIALLY DEFERRED
 *
 ERROR at line 1:
 {color:red}ORA-00972: identifier is too long{color}
 Table created.
 Table altered.
 ALTER TABLE SKEWED_VALUES ADD CONSTRAINT SKEWED_VALUES_FK1 FOREIGN KEY 
 (STRING_LIST_ID_EID) REFERENCES SKEWED_STRING_LIST (STRING_LIST_ID) INITIALLY 
 DEFERRED
   
  *
 ERROR at line 1:
 {color:red}ORA-00904: STRING_LIST_ID: invalid identifier{color}
 Basically there are two issues here with the Oracle sql script:
 (1) Table SKEWED_STRING_LIST is created with the column SD_ID. Later the 
 script tries to reference the STRING_LIST_ID column in SKEWED_STRING_LIST, 
 which is obviously not there. Comparing the sql with that for the other 
 flavors, it seems the column should be STRING_LIST_ID.
 (2) Table name SKEWED_COL_VALUE_LOCATION_MAPPING is too long for Oracle, 
 which limits 
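
The truncated sentence is pointing at Oracle 11g's 30-byte identifier limit; a
trivial check, plain Java and illustration only, makes the arithmetic concrete:

{code}
public class IdentifierLengthCheck {
  public static void main(String[] args) {
    // Oracle 11g caps identifiers at 30 bytes; both names below exceed it,
    // which is what ORA-00972 ("identifier is too long") complains about.
    System.out.println("SKEWED_COL_VALUE_LOCATION_MAPPING".length());     // 33
    System.out.println("SKEWED_COL_VALUE_LOCATION_MAPPING_PK".length());  // 36
  }
}
{code}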

[jira] [Assigned] (HIVE-3818) Identify VOID datatype in generated table and notify the user accordingly

2012-12-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-3818:


Assignee: Gary Colman

 Identify VOID datatype in generated table and notify the user accordingly
 -

 Key: HIVE-3818
 URL: https://issues.apache.org/jira/browse/HIVE-3818
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Gary Colman
Assignee: Gary Colman
Priority: Trivial
   Original Estimate: 1h
  Remaining Estimate: 1h

 When using rcfile as a datastore, generating a table from a select statement 
 with a null field results in a void data type, and throws an exception 
 (Internal error: no LazyObject for VOID).
 eg.
   set hive.default.fileformat=RCFILE;
   CREATE TABLE test_table AS SELECT NULL, key FROM src;
 Make the message to the user a little more intuitive.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3752) Add a non-sql API in hive to access data.

2012-12-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535626#comment-13535626
 ] 

Namit Jain commented on HIVE-3752:
--

[~nitay], can you try now ?

 Add a non-sql API in hive to access data.
 -

 Key: HIVE-3752
 URL: https://issues.apache.org/jira/browse/HIVE-3752
 Project: Hive
  Issue Type: Improvement
Reporter: Nitay Joffe
Assignee: Nitay Joffe

 We would like to add an input/output format for accessing Hive data in Hadoop 
 directly without having to use e.g. a transform. Using a transform
 means having to do a whole map-reduce step with its own disk accesses and its 
 imposed structure. It also means needing to have Hive be the base 
 infrastructure for the entire system being developed which is not the right 
 fit as we only need a small part of it (access to the data).
 So we propose adding an API level InputFormat and OutputFormat to Hive that 
 will make it trivially easy to select a table with partition spec and read 
 from / write to it. We chose this design to make it compatible with Hadoop so 
 that existing systems that work with Hadoop's IO API will just work out of 
 the box.
 We need this system for the Giraph graph processing system 
 (http://giraph.apache.org/) as running graph jobs which read/write from Hive 
 is a common use case.
 [~namitjain] [~aching] [~kevinwilfong] [~apresta]
 Input-side (HiveApiInputFormat) review: https://reviews.facebook.net/D7401

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3537) release locks at the end of move tasks

2012-12-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3537:
-

Attachment: hive.3537.7.patch

 release locks at the end of move tasks
 --

 Key: HIVE-3537
 URL: https://issues.apache.org/jira/browse/HIVE-3537
 Project: Hive
  Issue Type: Bug
  Components: Locking, Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3537.1.patch, hive.3537.2.patch, hive.3537.3.patch, 
 hive.3537.4.patch, hive.3537.5.patch, hive.3537.6.patch, hive.3537.7.patch


 Look at HIVE-3106 for details.
 In order to make sure that concurrency is not an issue for multi-table 
 inserts, the current option is to introduce a dependency task, which thereby
 delays the creation of all partitions. It would be desirable to release the
 locks for the outputs as soon as the move task is completed. That way, for
 multi-table inserts, the concurrency can be enabled without delaying any 
 table.
 Currently, the movetask contains an input/output, but they do not seem to be
 populated correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views

2012-12-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3803:
-

Attachment: hive.3803.5.patch

 explain dependency should show the dependencies hierarchically in presence of 
 views
 ---

 Key: HIVE-3803
 URL: https://issues.apache.org/jira/browse/HIVE-3803
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3803.1.patch, hive.3803.2.patch, hive.3803.3.patch, 
 hive.3803.4.patch, hive.3803.5.patch


 It should also include tables whose partitions are being accessed

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views

2012-12-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535628#comment-13535628
 ] 

Namit Jain commented on HIVE-3803:
--

I have refreshed the above entry https://reviews.facebook.net/D7377 with the 
code-only changes and explain_dependency.out. These are more important to 
review for correctness; the other files in the patch are log-file changes only.

 explain dependency should show the dependencies hierarchically in presence of 
 views
 ---

 Key: HIVE-3803
 URL: https://issues.apache.org/jira/browse/HIVE-3803
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3803.1.patch, hive.3803.2.patch, hive.3803.3.patch, 
 hive.3803.4.patch, hive.3803.5.patch


 It should also include tables whose partitions are being accessed

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3796) Multi-insert involving bucketed/sorted table turns off merging on all outputs

2012-12-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3796:
-

Attachment: hive.3796.5.patch

Refreshed the patch, adding a new patch file
(for the record).

 Multi-insert involving bucketed/sorted table turns off merging on all outputs
 -

 Key: HIVE-3796
 URL: https://issues.apache.org/jira/browse/HIVE-3796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3796.1.patch.txt, HIVE-3796.2.patch.txt, 
 HIVE-3796.3.patch.txt, HIVE-3796.4.patch.txt, hive.3796.5.patch


 When a multi-insert query has at least one output that is bucketed, merging 
 is turned off for all outputs, rather than just the bucketed ones.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3814) Cannot drop partitions on table when using Oracle metastore

2012-12-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535640#comment-13535640
 ] 

Hudson commented on HIVE-3814:
--

Integrated in Hive-trunk-h0.21 #1864 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1864/])
HIVE-3814 : Cannot drop partitions on table when using Oracle metastore 
(Deepesh Khandelwal via Ashutosh Chauhan) (Revision 1423488)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1423488
Files : 
* /hive/trunk/metastore/scripts/upgrade/oracle/012-HIVE-1362.oracle.sql
* /hive/trunk/metastore/scripts/upgrade/oracle/hive-schema-0.10.0.oracle.sql
* /hive/trunk/metastore/scripts/upgrade/postgres/012-HIVE-1362.postgres.sql
* /hive/trunk/metastore/scripts/upgrade/postgres/hive-schema-0.10.0.postgres.sql


 Cannot drop partitions on table when using Oracle metastore
 ---

 Key: HIVE-3814
 URL: https://issues.apache.org/jira/browse/HIVE-3814
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.10.0
 Environment: Oracle 11g r2
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
Priority: Critical
 Fix For: 0.10.0

 Attachments: HIVE-3814.patch


 Create a table with a partition. Try to drop the partition or the table 
 containing the partition. The following error is seen:
 FAILED: Error in metadata: 
 MetaException(message:javax.jdo.JDODataStoreException: Error executing JDOQL 
 query SELECT 
 'org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics' AS 
 NUCLEUS_TYPE,THIS.AVG_COL_LEN,THIS.COLUMN_NAME,THIS.COLUMN_TYPE,THIS.DB_NAME,THIS.DOUBLE_HIGH_VALUE,THIS.DOUBLE_LOW_VALUE,THIS.LAST_ANALYZED,THIS.LONG_HIGH_VALUE,THIS.LONG_LOW_VALUE,THIS.MAX_COL_LEN,THIS.NUM_DISTINCTS,THIS.NUM_FALSES,THIS.NUM_NULLS,THIS.NUM_TRUES,THIS.PARTITION_NAME,THIS.TABLE_NAME,THIS.CS_ID
  FROM PART_COL_STATS THIS LEFT OUTER JOIN PARTITIONS 
 THIS_PARTITION_PARTITION_NAME ON THIS.PART_ID = 
 THIS_PARTITION_PARTITION_NAME.PART_ID WHERE 
 THIS_PARTITION_PARTITION_NAME.PART_NAME = ? AND THIS.DB_NAME = ? AND 
 THIS.TABLE_NAME = ? : ORA-00904: "THIS"."PARTITION_NAME": invalid 
 identifier
 The problem here is that the column PARTITION_NAME that the query is 
 referring to in table PART_COL_STATS is non-existent. Looking at the hive 
 schema scripts for mysql & derby, this should be PARTITION_NAME. Postgres 
 also suffers from the same problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3212) ODBC build framework on Linux and windows

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3212:
---

Fix Version/s: (was: 0.10.0)

 ODBC build framework on Linux and windows
 -

 Key: HIVE-3212
 URL: https://issues.apache.org/jira/browse/HIVE-3212
 Project: Hive
  Issue Type: Sub-task
  Components: ODBC
Affects Versions: 0.10.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3397) PartitionPruner should log why it is not pushing the filter down to JDO

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3397:
---

Assignee: Navis

 PartitionPruner should log why it is not pushing the filter down to JDO
 ---

 Key: HIVE-3397
 URL: https://issues.apache.org/jira/browse/HIVE-3397
 Project: Hive
  Issue Type: Sub-task
  Components: Diagnosability, Logging, Query Processor
Reporter: Carl Steinbach
Assignee: Navis
  Labels: PartitionPruner, QueryOptimizer
 Fix For: 0.10.0

 Attachments: HIVE-3397.D5691.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3764) Support metastore version consistency check

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3764:
---

Fix Version/s: (was: 0.10.0)

 Support metastore version consistency check
 ---

 Key: HIVE-3764
 URL: https://issues.apache.org/jira/browse/HIVE-3764
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-3764-1.patch


 Today there's no version/compatibility information stored in the hive 
 metastore. Also, the datanucleus configuration property to automatically 
 create missing tables is enabled by default. If you happen to start an older 
 or newer hive, or don't run the correct upgrade scripts during migration, the 
 metastore would end up corrupted. The autoCreate schema support is not always 
 sufficient to upgrade the metastore when migrating to a newer release; it's 
 not supported with all databases, and the migration often involves altering 
 existing tables, changing or moving data, etc.
 Hence it's very useful to have a consistency check to make sure that hive is 
 using the correct metastore and, for production systems, that the schema is 
 not created or altered automatically just by running hive.
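
 A minimal sketch of the kind of consistency check being proposed, assuming a 
 hypothetical VERSION table in the metastore database (the table and column 
 names here are illustrative, not the actual Hive schema):

  import java.sql.Connection;
  import java.sql.ResultSet;
  import java.sql.Statement;

  public class MetastoreVersionCheck {
    // Throws if the schema version recorded in the metastore differs from
    // what this Hive build expects.
    public static void verifySchemaVersion(Connection metastoreDb, String expectedVersion)
        throws Exception {
      try (Statement stmt = metastoreDb.createStatement();
           ResultSet rs = stmt.executeQuery("SELECT SCHEMA_VERSION FROM VERSION")) {
        if (!rs.next()) {
          throw new IllegalStateException("Metastore has no recorded schema version");
        }
        String actual = rs.getString(1);
        if (!expectedVersion.equals(actual)) {
          throw new IllegalStateException("Metastore schema version " + actual
              + " does not match expected " + expectedVersion
              + "; run the upgrade scripts rather than relying on autoCreate");
        }
      }
    }
  }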

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3725) Add support for pulling HBase columns with prefixes

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3725:
---

Fix Version/s: (was: 0.10.0)

 Add support for pulling HBase columns with prefixes
 ---

 Key: HIVE-3725
 URL: https://issues.apache.org/jira/browse/HIVE-3725
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 0.9.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
 Attachments: HIVE-3725.1.patch.txt


 The current HBase-Hive integration supports reading many values from the same 
 row by specifying a column family, and specifying just the column family 
 pulls in all qualifiers within that family.
 We should add support for specifying a prefix for the qualifier, so that all 
 columns whose qualifiers start with the prefix are automatically pulled in. 
 Wildcard support would be ideal.
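
 The HBase client API already exposes this capability directly; a minimal 
 sketch of scanning only the qualifiers in a family that start with a given 
 prefix (the table, family, and prefix names are illustrative), which is the 
 kind of filter the Hive storage handler could push down:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
  import org.apache.hadoop.hbase.util.Bytes;

  public class PrefixScan {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "mytable");      // illustrative table name
      Scan scan = new Scan();
      scan.addFamily(Bytes.toBytes("cf"));             // restrict to one column family
      // Only qualifiers starting with "evt_" are returned.
      scan.setFilter(new ColumnPrefixFilter(Bytes.toBytes("evt_")));
      ResultScanner scanner = table.getScanner(scan);
      for (Result row : scanner) {
        System.out.println(row);
      }
      scanner.close();
      table.close();
    }
  }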

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3672) Support altering partition column type in Hive

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3672:
---

Fix Version/s: (was: 0.10.0)

 Support altering partition column type in Hive
 --

 Key: HIVE-3672
 URL: https://issues.apache.org/jira/browse/HIVE-3672
 Project: Hive
  Issue Type: Improvement
  Components: CLI, SQL
Affects Versions: 0.10.0
Reporter: Jingwei Lu
Assignee: Jingwei Lu
 Attachments: HIVE-3672.1.patch.txt, HIVE-3672.2.patch.txt

   Original Estimate: 72h
  Remaining Estimate: 72h

 Currently, Hive does not allow altering partition column types.  As we've 
 discouraged users from using non-string partition column types, this presents 
 a problem for users who want to change their partition columns to strings: 
 they have to rename their table, create a new table, and copy all the data 
 over.
 To support this via the CLI, we would add a command like ALTER TABLE 
 table_name PARTITION COLUMN (column_name new type);

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3666) implement a udf to keep hive session alive for certain amount of time

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3666:
---

Fix Version/s: (was: 0.10.0)

 implement a udf to keep hive session alive for certain amount of time
 -

 Key: HIVE-3666
 URL: https://issues.apache.org/jira/browse/HIVE-3666
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.10.0
Reporter: Johnny Zhang
Assignee: Johnny Zhang
 Attachments: HIVE-3666.patch, HIVE-3666.patch


 To make testing issues like HIVE-3590 convenient, we can implement a UDF to 
 keep the hive session alive for a given time. The patch introduces a new UDF 
 sleep() which does this without introducing any data/load to the cluster.
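
 A minimal sketch of such a keep-alive UDF against Hive's old-style UDF API 
 (illustrative only, not the attached patch); it would be registered with 
 CREATE TEMPORARY FUNCTION and invoked from a query:

  import org.apache.hadoop.hive.ql.exec.UDF;
  import org.apache.hadoop.io.BooleanWritable;
  import org.apache.hadoop.io.LongWritable;

  public class UDFSleep extends UDF {
    private final BooleanWritable result = new BooleanWritable(true);

    // Sleeps for the given number of seconds and returns true; touches no table data.
    public BooleanWritable evaluate(LongWritable seconds) {
      if (seconds == null) {
        return null;
      }
      try {
        Thread.sleep(seconds.get() * 1000L);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return result;
    }
  }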

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3635) allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3635:
---

Fix Version/s: (was: 0.10.0)

  allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for 
 the boolean hive type
 ---

 Key: HIVE-3635
 URL: https://issues.apache.org/jira/browse/HIVE-3635
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.9.0
Reporter: Alexander Alten-Lorenz
Assignee: Alexander Alten-Lorenz
 Attachments: HIVE-3635.patch


 Interpret 't' as true and 'f' as false for boolean types; PostgreSQL exports 
 represent booleans that way.
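
 A minimal sketch of the relaxed parsing being requested, accepting 't', 'T', 
 '1' as true and 'f', 'F', '0' as false alongside the usual true/false 
 spellings (illustrative helper, not the attached patch):

  public final class BooleanParse {
    private BooleanParse() {}

    // Returns Boolean.TRUE, Boolean.FALSE, or null when the token is not recognized.
    public static Boolean parse(String token) {
      if (token == null) {
        return null;
      }
      String t = token.trim();
      if (t.equalsIgnoreCase("true") || t.equals("t") || t.equals("T") || t.equals("1")) {
        return Boolean.TRUE;
      }
      if (t.equalsIgnoreCase("false") || t.equals("f") || t.equals("F") || t.equals("0")) {
        return Boolean.FALSE;
      }
      return null;
    }
  }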

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3558) UDF LEFT(string,position) to HIVE

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3558:
---

Fix Version/s: (was: 0.10.0)

 UDF  LEFT(string,position) to HIVE
 --

 Key: HIVE-3558
 URL: https://issues.apache.org/jira/browse/HIVE-3558
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.9.0
Reporter: Aruna Babu
Priority: Minor

 Introduction
   A UDF (User Defined Function) to obtain the leftmost 'n' characters from 
 a string in HIVE.
 Relevance
   Current releases of Hive lack a function which returns the leftmost 
 'len' characters from the string 'str', or NULL if any argument is NULL.
 The function LEFT(string,length) would return the leftmost 'n' characters 
 from the string, or NULL if any argument is NULL, which would be useful in 
 HiveQL wherever strings are manipulated.
 Functionality :-
 Function Name: LEFT(string,length)

 Returns the leftmost 'length' characters from the string, or NULL if any 
 argument is NULL.
 Example: hive>SELECT LEFT('https://www.irctc.co.in',5);
   -> 'https'
 Usage :-
 Case 1: To query a table to find details based on an https request
 Table :-Transaction
 Request_id|date|period_id|url_name
 0001|01/07/2012|110001|https://www.irctc.co.in
 0002|02/07/2012|110001|https://nextstep.tcs.com
 0003|03/07/2012|110001|https://www.hdfcbank.com
 0005|01/07/2012|110001|http://www.lmnm.co.in
 0006|08/07/2012|110001|http://nextstart.com
 0007|10/07/2012|110001|https://netbanking.icicibank.com
 0012|21/07/2012|110001|http://www.people.co.in
 0026|08/07/2012|110001|http://nextprobs.com
 00023|25/07/2012|110001|https://netbanking.canarabank.com
 Query : select * from transaction where LEFT(url_name,5)='https';
 Result :-
 0001|01/07/2012|110001|https://www.irctc.com
 0002|02/07/2012|110001|https://nextstep.tcs.com  
 0003|03/07/2012|110001|https://www.hdfcbank.com
 0007|10/07/2012|110001|https://netbanking.icicibank.com
 00023|25/07/2012|110001|https://netbanking.canarabank.com
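
 A minimal sketch of the proposed LEFT(string,length) function using Hive's 
 old-style UDF API (illustrative, not an attached patch); it returns NULL when 
 either argument is NULL, as described above:

  import org.apache.hadoop.hive.ql.exec.UDF;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;

  public class UDFLeft extends UDF {
    private final Text result = new Text();

    public Text evaluate(Text str, IntWritable length) {
      if (str == null || length == null) {
        return null;                       // NULL in, NULL out
      }
      String s = str.toString();
      int n = Math.max(0, Math.min(length.get(), s.length()));
      result.set(s.substring(0, n));       // leftmost n characters
      return result;
    }
  }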

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3821) RCFile does not work with lazyBinarySerDe

2012-12-18 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3821:


 Summary: RCFile does not work with lazyBinarySerDe
 Key: HIVE-3821
 URL: https://issues.apache.org/jira/browse/HIVE-3821
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Namit Jain
Assignee: Namit Jain


create table tst(key string, value string) row format serde 
'org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe' stored as rcfile;  

insert overwrite table tst select * from src;

gets an error:

Caused by: java.lang.UnsupportedOperationException: Currently the writer can 
only accept BytesRefArrayWritable
at org.apache.hadoop.hive.ql.io.RCFile$Writer.append(RCFile.java:882)
at 
org.apache.hadoop.hive.ql.io.RCFileOutputFormat$2.write(RCFileOutputFormat.java:140)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:637)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:547)
... 5 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3542) Can not use DB Qualified Name in Order By, Sort By, Distribute By, and Cluster By

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3542:
---

Fix Version/s: (was: 0.10.0)

 Can not use DB Qualified Name in Order By, Sort By, Distribute By, and 
 Cluster By
 -

 Key: HIVE-3542
 URL: https://issues.apache.org/jira/browse/HIVE-3542
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Zhenxiao Luo
Assignee: Zhenxiao Luo

 CREATE DATABASE db1;
 CREATE TABLE db1.t(a INT, b INT);
 SELECT * FROM db1.t ORDER BY db1.t.a;
 FAILED: SemanticException [Error 10004]: Line 3:29 Invalid table alias or 
 column reference 'db1': (possible column names are: a, b)
 SELECT * FROM db1.t SORT BY db1.t.a;
 FAILED: SemanticException [Error 10004]: Line 3:28 Invalid table alias or 
 column reference 'db1': (possible column names are: a, b)
 SELECT * FROM db1.t CLUSTER BY db1.t.a;
 FAILED: SemanticException [Error 10004]: Line 3:31 Invalid table alias or 
 column reference 'db1': (possible column names are: a, b)
 SELECT * FROM db1.t DISTRIBUTE BY db1.t.a;
 FAILED: SemanticException [Error 10004]: Line 3:34 Invalid table alias or 
 column reference 'db1': (possible column names are: a, b)
 alias is working OK:
 SELECT * FROM db1.t t ORDER BY t.a;
 OK
 SELECT * FROM db1.t t SORT BY t.a;
 OK
 SELECT * FROM db1.t t CLUSTER BY t.a;
 OK
 SELECT * FROM db1.t t DISTRIBUTE BY t.a;
 OK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3236) allow column names to be prefixed by table alias in select all queries

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3236:
---

Fix Version/s: (was: 0.10.0)

 allow column names to be prefixed by table alias in select all queries
 --

 Key: HIVE-3236
 URL: https://issues.apache.org/jira/browse/HIVE-3236
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.10.0, 0.9.1
Reporter: Keegan Mosley
Priority: Minor
 Attachments: HIVE-3236.1.patch.txt


 When using CREATE TABLE x AS SELECT ... where the select joins tables with 
 hundreds of columns it is not a simple task to resolve duplicate column name 
 exceptions (particularly with self-joins). The user must either manually 
 specify aliases for all duplicate columns (potentially hundreds) or write a 
 script to generate the data set in a separate select query, then create the 
 table and load the data in.
 There should be some conf flag that would allow queries like
 create table joined as select one.*, two.* from mytable one join mytable 
 two on (one.duplicate_field = two.duplicate_field1);
 to create a table with columns one_duplicate_field and two_duplicate_field.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3309) drop partition does not work for mixture of string and non-string columns for non-equality operator

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3309:
---

Fix Version/s: (was: 0.10.0)

 drop partition does not work for mixture of string and non-string columns 
 for non-equality operator
 ---

 Key: HIVE-3309
 URL: https://issues.apache.org/jira/browse/HIVE-3309
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Query Processor
Affects Versions: 0.10.0, 0.9.1
 Environment: SuSE 11 SP 1
 Hadoop Cluster + Hive
Reporter: rohithsharma
Priority: Minor
  Labels: patch
 Attachments: HIVE-3309.patch, HIVE-3309.patch


 There is still a problem in dropping partitions if the partition columns 
 are a mixture of string and non-string types.
 There is a behavioural change after fixing HIVE-3063.
 Before fix
 ==
 create table ptestfilter (a string, b int) partitioned by (c string, d int);
 alter table ptestfilter add partition (c='1', d=2);
 alter table ptestFilter add partition (c='2', d=1);
 alter table ptestfilter drop partition (c<'2'); //this will execute fine
 After fix
 ==
 create table ptestfilter (a string, b int) partitioned by (c string, d int);
 alter table ptestfilter add partition (c='1', d=2);
 alter table ptestFilter add partition (c='2', d=1);
 alter table ptestfilter drop partition (c<'2'); //this will fail to execute.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3166) The Hive JDBC driver should accept hive conf and hive variables via connection URL

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3166:
---

Fix Version/s: (was: 0.10.0)

 The Hive JDBC driver should accept hive conf and hive variables via 
 connection URL
 --

 Key: HIVE-3166
 URL: https://issues.apache.org/jira/browse/HIVE-3166
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.9.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
  Labels: api-addition
 Attachments: HIVE-3166-3.patch


 The JDBC driver supports running embedded hive. The Hive CLI can accept 
 configuration and hive settings on the command line that are passed down, but 
 the JDBC driver currently doesn't support this.
 It's also required for SQLLine CLI support, since that is a JDBC application. 
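
 A sketch of how a client might pass conf values on the connection URL once 
 this is supported; the query-parameter syntax shown here is hypothetical, not 
 an existing driver feature:

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.Statement;

  public class JdbcWithConf {
    public static void main(String[] args) throws Exception {
      Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
      // Hypothetical URL: hive conf settings appended as query parameters
      // instead of being settable only on the CLI command line.
      String url = "jdbc:hive://localhost:10000/default?hive.mapred.mode=strict";
      try (Connection conn = DriverManager.getConnection(url, "", "");
           Statement stmt = conn.createStatement()) {
        stmt.execute("SELECT count(1) FROM src");   // illustrative query
      }
    }
  }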

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-3036) hive should support BigDecimal datatype

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-3036.


   Resolution: Duplicate
Fix Version/s: (was: 0.10.0)

Dupe of HIVE-2936

 hive should support BigDecimal datatype
 ---

 Key: HIVE-3036
 URL: https://issues.apache.org/jira/browse/HIVE-3036
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Affects Versions: 0.7.1, 0.8.0, 0.8.1
Reporter: Anurag Tangri

 Hive has support for bigint, but people have use cases where they need 
 decimal precision beyond what float/double can provide.
 The values in question are of the form decimal(x,y),
 e.g. a decimal of the form (17,6), which cannot be represented exactly by 
 float/double.
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3104) Predicate pushdown doesn't work with multi-insert statements using LATERAL VIEW

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3104:
---

Fix Version/s: (was: 0.10.0)

 Predicate pushdown doesn't work with multi-insert statements using LATERAL 
 VIEW
 ---

 Key: HIVE-3104
 URL: https://issues.apache.org/jira/browse/HIVE-3104
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.9.0
 Environment: Apache Hive 0.9.0, Apache Hadoop 0.20.205.0
Reporter: Mark Grover

 Predicate pushdown seems to work for single-insert queries using LATERAL 
 VIEW. It also seems to work for multi-insert queries *not* using LATERAL 
 VIEW. However, it doesn't work for multi-insert queries using LATERAL VIEW.
 Here are some examples. In the examples below, I make use of the fact that a 
 query with no partition filtering fails when run under 
 hive.mapred.mode=strict.
 --Table creation and population
 DROP TABLE IF EXISTS test;
 CREATE TABLE test (col1 array<int>, col2 int)  PARTITIONED BY (part_col int);
 INSERT OVERWRITE TABLE test PARTITION (part_col=1) SELECT array(1,2), 
 count(*) FROM test;
 INSERT OVERWRITE TABLE test PARTITION (part_col=2) SELECT array(2,4,6), 
 count(*) FROM test;
 -- Query 1
 -- This succeeds (using LATERAL VIEW with single insert)
 set hive.mapred.mode=strict;
 FROM partition_test
 LATERAL VIEW explode(col1) tmp AS exp_col1
 INSERT OVERWRITE DIRECTORY '/test/1'
 SELECT exp_col1
 WHERE (part_col=2);
 -- Query 2
 -- This succeeds (NOT using LATERAL VIEW with multi-insert)
 set hive.mapred.mode=strict;
 FROM partition_test
 INSERT OVERWRITE DIRECTORY '/test/1'
 SELECT col1
 WHERE (part_col=2)
 INSERT OVERWRITE DIRECTORY '/test/2'
 SELECT col1
 WHERE (part_col=2);
 -- Query 3
 -- This fails (using LATERAL VIEW with multi-insert)
 set hive.mapred.mode=strict;
 FROM partition_test
 LATERAL VIEW explode(col1) tmp AS exp_col1
 INSERT OVERWRITE DIRECTORY '/test/1'
 SELECT exp_col1
 WHERE (part_col=2)
 INSERT OVERWRITE DIRECTORY '/test/2'
 SELECT exp_col1
 WHERE (part_col=2);

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3806) Ptest failing due to Argument list too long errors

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3806:
---

Fix Version/s: 0.11

 Ptest failing due to Argument list too long errors
 

 Key: HIVE-3806
 URL: https://issues.apache.org/jira/browse/HIVE-3806
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor
 Fix For: 0.11

 Attachments: HIVE-3806.1.patch.txt


 ptest creates a really huge shell command to delete, from each test host, the 
 .q files that it should not be running. For TestCliDriver, the command has 
 become long enough that it is over the threshold allowed by the shell. We 
 should rewrite it so that the same semantics are captured in a shorter 
 command.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3783) stats19.q is failing on trunk

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3783:
---

Fix Version/s: 0.10.0

 stats19.q is failing on trunk
 -

 Key: HIVE-3783
 URL: https://issues.apache.org/jira/browse/HIVE-3783
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Ashutosh Chauhan
Assignee: Kevin Wilfong
 Fix For: 0.10.0

 Attachments: HIVE-3783.1.patch.txt


 This test case was introduced in HIVE-3750 and has been failing ever since 
 it was introduced. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3771) HIVE-3750 broke TestParse

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3771:
---

Fix Version/s: 0.11

 HIVE-3750 broke TestParse
 -

 Key: HIVE-3771
 URL: https://issues.apache.org/jira/browse/HIVE-3771
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.11
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11

 Attachments: HIVE-3771.1.patch.txt


 see title

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3770) Test cases's broken in TestParse

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3770:
---

Fix Version/s: 0.11

 Test cases's broken in TestParse
 

 Key: HIVE-3770
 URL: https://issues.apache.org/jira/browse/HIVE-3770
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
 Fix For: 0.11


 20 TestParse test cases are broken. The breakage was introduced by D7017 
 (HIVE-3750):
 =
 testParse_case_sensitivity
 testParse_groupby1
 testParse_input1
 testParse_input2
 testParse_input3
 testParse_input4
 testParse_input5
 testParse_input6
 testParse_input7
 testParse_input9
 testParse_input_testsequencefile
 testParse_join1
 testParse_join2
 testParse_join3
 testParse_sample2
 testParse_sample3
 testParse_sample4
 testParse_sample5
 testParse_sample6
 testParse_sample7
 sample error
 
 {quote}
 ant test -Dtestcase=TestParse -Dqfile=groupby1.q
 [junit] diff -a ../build/ql/test/logs/positive/groupby1.q.out 
 ../ql/src/test/results/compiler/parse/groupby1.q.out
 [junit] diff -a -b ../build/ql/test/logs/positive/groupby1.q.xml 
 ../ql/src/test/results/compiler/plan/groupby1.q.xml
 [junit] 1224,1226d1223
 [junit]  <void property="maxStatsKeyPrefixLength"> 
 [junit]   <int>200</int> 
 [junit]  </void> 
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3783) stats19.q is failing on trunk

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3783:
---

Fix Version/s: (was: 0.10.0)
   0.11

 stats19.q is failing on trunk
 -

 Key: HIVE-3783
 URL: https://issues.apache.org/jira/browse/HIVE-3783
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11
Reporter: Ashutosh Chauhan
Assignee: Kevin Wilfong
 Fix For: 0.11

 Attachments: HIVE-3783.1.patch.txt


 This test case was introduced in HIVE-3750 and has been failing ever since 
 it was introduced. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3766) Enable adding hooks to hive meta store init

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3766:
---

Fix Version/s: 0.11

 Enable adding hooks to hive meta store init
 ---

 Key: HIVE-3766
 URL: https://issues.apache.org/jira/browse/HIVE-3766
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Jean Xu
Assignee: Jean Xu
 Fix For: 0.11

 Attachments: jira3766.txt


 We will enable hooks to be added to init HMSHandler

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3762) Minor fix for 'tableName' in Hive.g

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3762:
---

Fix Version/s: 0.11

 Minor fix for 'tableName' in Hive.g
 ---

 Key: HIVE-3762
 URL: https://issues.apache.org/jira/browse/HIVE-3762
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11

 Attachments: HIVE-3762.D7143.1.patch


 The current definition for 'tableName' is (db=Identifier DOT)? tab=Identifier. 
 If the user specifies the value "default." for it, the hive parser accepts 
 "default" as the table name and reserves "." for the next token, but that is 
 not valid.
 Really trivial, but it is a small part needed for improving query 
 auto-completion (which I'm working on).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3747) Provide hive operation name for hookContext

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3747:
---

Fix Version/s: 0.11

 Provide hive operation name for hookContext
 ---

 Key: HIVE-3747
 URL: https://issues.apache.org/jira/browse/HIVE-3747
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Sudhanshu Arora
Assignee: Shreepadma Venugopalan
 Fix For: 0.11

 Attachments: HIVE-3747.1.patch.txt


 The hookContext exposed through ExecuteWithHookContext does not provide the 
 name of the Hive operation.
 The following public API should be added in HookContext:
 public String getOperationName() {
   return SessionState.get().getHiveOperation().name();
 }
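
 A sketch of a hook that would consume this accessor, assuming the 
 getOperationName() method above is added to HookContext (the hook class name 
 is illustrative):

  import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
  import org.apache.hadoop.hive.ql.hooks.HookContext;

  public class OperationNameLoggerHook implements ExecuteWithHookContext {
    @Override
    public void run(HookContext hookContext) throws Exception {
      // getOperationName() is the API proposed in this issue, not yet part
      // of HookContext.
      System.out.println("Hive operation: " + hookContext.getOperationName());
    }
  }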

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3552) HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys

2012-12-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3552:
-

Attachment: hive.3552.8.patch

 HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a 
 high number of grouping set keys
 -

 Key: HIVE-3552
 URL: https://issues.apache.org/jira/browse/HIVE-3552
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3552.1.patch, hive.3552.2.patch, hive.3552.3.patch, 
 hive.3552.4.patch, hive.3552.5.patch, hive.3552.6.patch, hive.3552.7.patch, 
 hive.3552.8.patch


 This is a follow-up for HIVE-3433.
 Had an offline discussion with Sambavi - she pointed out a scenario where the
 implementation in HIVE-3433 will not scale. Assume that the user is performing
 a cube on many columns, say 8 columns. Each input row would then generate 256
 rows for the hash table, which may kill the current group by implementation.
 A better implementation would be to add an additional mr job: in the first mr
 job, perform the group by assuming there was no cube, and then add another mr
 job where the cube is performed. The assumption is that the group by would
 have decreased the output data significantly, and the rows would arrive in
 the order of the grouping keys, which gives a higher probability of hitting
 the hash table.
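
 A small illustration of the blow-up described above: a cube over n grouping 
 columns emits one row per subset of those columns, i.e. 2^n rows per input 
 row, so 8 columns give 256 (illustrative snippet, not Hive code):

  public class CubeBlowup {
    public static void main(String[] args) {
      int n = 8;                      // number of cube/grouping columns
      int rowsPerInputRow = 0;
      for (int mask = 0; mask < (1 << n); mask++) {
        rowsPerInputRow++;            // one grouping set (one output row) per bitmask
      }
      System.out.println(n + " cube columns -> " + rowsPerInputRow + " rows per input row");
    }
  }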

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3703) Hive Query Explain Plan JSON not being created properly

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3703:
---

Fix Version/s: 0.11

 Hive Query Explain Plan JSON not being created properly
 ---

 Key: HIVE-3703
 URL: https://issues.apache.org/jira/browse/HIVE-3703
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Mayank Garg
Assignee: Mayank Garg
Priority: Minor
 Fix For: 0.11

 Attachments: HIVE-3703.2.patch.txt

   Original Estimate: 12h
  Remaining Estimate: 12h

 There is an option to generate a JSON query plan for a hive query; however, 
 the JSON being created is invalid and JSON decoders are unable to decode it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3702) Renaming table changes table location scheme/authority

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3702:
---

Fix Version/s: 0.11

 Renaming table changes table location scheme/authority
 --

 Key: HIVE-3702
 URL: https://issues.apache.org/jira/browse/HIVE-3702
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.9.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11

 Attachments: HIVE-3702.1.patch.txt, HIVE-3702.2.patch.txt


 Renaming a table changes the location of the table to the default location of 
 the database, followed by the table name.  This means that if the default 
 location of the database uses a different scheme/authority, an exception will 
 get thrown attempting to move the data.
 Instead, the table's location should be made the default location of the 
 database followed by the table name, but using the original location's scheme 
 and authority.
 This only applies for managed tables, and there is already a check to ensure 
 the new location doesn't already exist.
 This is analogous to what was done for partitions in HIVE-2875
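
 A minimal sketch of the location computation described above: keep the 
 renamed managed table under the database's default directory, but reuse the 
 scheme and authority of the table's original location (paths are 
 illustrative, and this is not the attached patch):

  import java.net.URI;
  import org.apache.hadoop.fs.Path;

  public class RenameLocation {
    public static Path newTableLocation(Path oldTableLocation, Path dbDefaultLocation,
                                        String newTableName) {
      URI oldUri = oldTableLocation.toUri();
      // Default database dir + new table name, with the original scheme/authority.
      Path candidate = new Path(dbDefaultLocation, newTableName);
      return new Path(oldUri.getScheme(), oldUri.getAuthority(), candidate.toUri().getPath());
    }

    public static void main(String[] args) {
      Path oldLoc = new Path("hdfs://clusterA:8020/user/hive/warehouse/old_tbl");
      Path dbLoc = new Path("hdfs://clusterB:8020/user/hive/warehouse");
      // Prints hdfs://clusterA:8020/user/hive/warehouse/new_tbl
      System.out.println(newTableLocation(oldLoc, dbLoc, "new_tbl"));
    }
  }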

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3695) TestParse breaks due to HIVE-3675

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3695:
---

Fix Version/s: 0.10.0

 TestParse breaks due to HIVE-3675
 -

 Key: HIVE-3695
 URL: https://issues.apache.org/jira/browse/HIVE-3695
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.10.0

 Attachments: hive.3695.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3687) smb_mapjoin_13.q is nondeterministic

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3687:
---

Fix Version/s: 0.10.0

 smb_mapjoin_13.q is nondeterministic
 

 Key: HIVE-3687
 URL: https://issues.apache.org/jira/browse/HIVE-3687
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Fix For: 0.10.0

 Attachments: HIVE-3687.1.patch.txt


 smb_mapjoin_13.q is missing an ORDER BY clause it in its queries

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3647) map-side groupby wrongly due to HIVE-3432

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3647:
---

Fix Version/s: 0.10.0

 map-side groupby wrongly due to HIVE-3432
 -

 Key: HIVE-3647
 URL: https://issues.apache.org/jira/browse/HIVE-3647
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.10.0

 Attachments: hive.3647.1.patch, hive.3647.2.patch, hive.3647.3.patch, 
 hive.3647.4.patch, hive.3647.5.patch, hive.3647.6.patch, hive.3647.7.patch, 
 hive.3647.8.patch


 There seems to be a bug due to HIVE-3432.
 We are converting the group by to a map side group by after only looking at
 sorting columns. This can give wrong results if the data is sorted and
 bucketed by different columns.
 Add some tests for that scenario, verify and fix any issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3594) When Group by Partition Column Type is Timestamp or STRING Which Format contains HH:MM:SS, a URISyntaxException will occur

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3594:
---

Fix Version/s: 0.11

 When Group by Partition Column Type is Timestamp or STRING Which Format 
 contains HH:MM:SS, a URISyntaxException will occur
 -

 Key: HIVE-3594
 URL: https://issues.apache.org/jira/browse/HIVE-3594
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Daisy.Yuan
Assignee: Navis
 Fix For: 0.11

 Attachments: HIVE-3594.D6081.1.patch, HIVE-3594.D6081.2.patch


 create table test (no int, name string) partitioned by (pts string) row 
 format delimited fields terminated by ' '; 
 load data local inpath '/opt/files/groupbyts1.txt' into table test 
 partition(pts='12:11:30');
 load data local inpath '/opt/files/groupbyts2.txt' into table test 
 partition(pts='21:25:12');
 load data local inpath '/opt/files/groupbyts3.txt' into table test 
 partition(pts='12:11:30');
 load data local inpath '/opt/files/groupbyts4.txt' into table test 
 partition(pts='21:25:12');
 When I execute “select * from test group by pts;”, the following exception 
 occurs.
  at org.apache.hadoop.fs.Path.initialize(Path.java:157)
 at org.apache.hadoop.fs.Path.<init>(Path.java:135)
 at 
 org.apache.hadoop.hive.ql.exec.Utilities.getInputSummary(Utilities.java:1667)
 at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.estimateNumberOfReducers(MapRedTask.java:432)
 at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.setNumberOfReducers(MapRedTask.java:400)
 at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:93)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:135)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1329)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1121)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:954)
 at 
 org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
 at 
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:630)
 at 
 org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:618)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
 fake-path-metadata-only-query-default.test{pts=12:11:30%7D
 at java.net.URI.checkPath(URI.java:1788)
 at java.net.URI.<init>(URI.java:734)
 at org.apache.hadoop.fs.Path.initialize(Path.java:154)
 ... 19 more
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask
 When the PhysicalOptimizer optimizes the GroupByOperator, the 
 MetadataOnlyOptimizer is enabled because of the default setting 
 hive.optimize.metadataonly=true. The MetadataOnlyOptimizer changes the 
 partition alias desc: the partition alias 
 hdfs://ip:9000/user/hive/warehouse/test/pts=12%3A11%3A30 is changed into 
 fake-path-metadata-only-query-default.test{pts=12:11:30}. When a uri is 
 constructed from the new partition alias, a java.net.URISyntaxException 
 necessarily occurs.
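
 A minimal reproduction of this failure mode, assuming Hadoop's Path treats 
 everything before the first ':' as a URI scheme whenever that colon precedes 
 any '/', which leaves a relative remainder and triggers the "Relative path in 
 absolute URI" URISyntaxException seen above:

  import org.apache.hadoop.fs.Path;

  public class ColonPathRepro {
    public static void main(String[] args) {
      // Works: a '/' appears before the ':', so no scheme is inferred.
      System.out.println(new Path("/user/hive/warehouse/test/pts=12:11:30"));
      try {
        // Fails: no '/', so the text before the first ':' is parsed as a scheme
        // and the rest becomes a relative path inside an "absolute" URI.
        System.out.println(new Path("fake-path-metadata-only-query-default.test{pts=12:11:30}"));
      } catch (IllegalArgumentException e) {
        System.out.println("Fails as in the stack trace above: " + e.getMessage());
      }
    }
  }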
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3581) get_json_object and json_tuple return null in the presence of new line characters

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3581:
---

Fix Version/s: 0.10.0

 get_json_object and json_tuple return null in the presence of new line 
 characters
 -

 Key: HIVE-3581
 URL: https://issues.apache.org/jira/browse/HIVE-3581
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.10.0

 Attachments: HIVE-3581.1.patch.txt


 This was introduced when these functions were updated to use Jackson.
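
 A hedged sketch of the symptom and one plausible remedy, using the 
 org.codehaus.jackson API Hive used at the time: with default settings Jackson 
 rejects a raw newline inside a string value, while enabling 
 ALLOW_UNQUOTED_CONTROL_CHARS lets such input through (whether the attached 
 patch takes exactly this approach is not confirmed here):

  import java.util.Map;
  import org.codehaus.jackson.JsonParser;
  import org.codehaus.jackson.map.ObjectMapper;

  public class NewlineJson {
    public static void main(String[] args) throws Exception {
      String json = "{\"a\":\"line1\nline2\"}";    // raw newline inside the value

      ObjectMapper strict = new ObjectMapper();
      try {
        strict.readValue(json, Map.class);
      } catch (Exception e) {
        System.out.println("default parser rejects it: " + e.getMessage());
      }

      ObjectMapper lenient = new ObjectMapper();
      lenient.configure(JsonParser.Feature.ALLOW_UNQUOTED_CONTROL_CHARS, true);
      Map<?, ?> m = lenient.readValue(json, Map.class);
      System.out.println(m.get("a"));              // prints both lines
    }
  }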

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3573) Revert HIVE-3268

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3573:
---

Fix Version/s: 0.10.0

 Revert HIVE-3268
 

 Key: HIVE-3573
 URL: https://issues.apache.org/jira/browse/HIVE-3573
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Namit Jain
 Fix For: 0.10.0

 Attachments: hive.3573.1.patch, hive.3573.2.patch


 This patch introduces some code which can break 
 distribute/order/cluster/sort by.  We should revert this code until it can be 
 fixed (HIVE-3572).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3564) hivetest.py: revision number and applied patch

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3564:
---

Fix Version/s: 0.11

 hivetest.py: revision number and applied patch
 --

 Key: HIVE-3564
 URL: https://issues.apache.org/jira/browse/HIVE-3564
 Project: Hive
  Issue Type: Improvement
  Components: Testing Infrastructure
Reporter: Ivan Gorbachev
Assignee: Ivan Gorbachev
 Fix For: 0.11

 Attachments: hive-3564.0.patch.txt


 A new option should be added to hivetest.py that shows the base revision 
 number and the applied patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3557) Access to external URLs in hivetest.py

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3557:
---

Fix Version/s: 0.10.0

 Access to external URLs in hivetest.py 
 ---

 Key: HIVE-3557
 URL: https://issues.apache.org/jira/browse/HIVE-3557
 Project: Hive
  Issue Type: Improvement
Reporter: Ivan Gorbachev
Assignee: Ivan Gorbachev
 Fix For: 0.10.0

 Attachments: jira-3557.0.patch, jira-3557.1.patch


 1. Migrate all non-HTTP urls to HTTP.
 2. Add HTTP_PROXY support

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3556) Test Path -> Alias for explain extended

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3556:
---

Fix Version/s: 0.10.0

 Test Path -> Alias for explain extended
 -

 Key: HIVE-3556
 URL: https://issues.apache.org/jira/browse/HIVE-3556
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
 Fix For: 0.10.0

 Attachments: HIVE-3556.patch.1, HIVE-3556.patch.2


 The test framework masks the output of Path -> Alias for explain extended. 
 This makes it impossible to verify that the output is right.
 The design is to add a new entry Truncated Path -> Alias to MapredWork. It 
 has the same content as Path -> Alias except that the prefix, including the 
 file schema and temp dir, is removed. The following config will be used for 
 prefix removal:
 METASTOREWAREHOUSE(hive.metastore.warehouse.dir, /user/hive/warehouse),
 This will keep Path -> Alias intact and also test that its result is right.
 The first use case is to verify that a list bucketing query's result is right.
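
 A minimal sketch of the prefix removal being described: strip the scheme, 
 authority, and the warehouse (or temp dir) prefix so that only the part that 
 is stable across test runs remains (names are illustrative, not the patch 
 itself):

  import java.net.URI;

  public class TruncatePath {
    // e.g. hdfs://host:9000/user/hive/warehouse/t1/ds=1 -> /t1/ds=1
    public static String truncate(String fullPath, String warehouseDir) {
      URI uri = URI.create(fullPath);
      String path = uri.getPath();                // drops scheme and authority
      if (path.startsWith(warehouseDir)) {
        path = path.substring(warehouseDir.length());
      }
      return path;
    }

    public static void main(String[] args) {
      System.out.println(truncate("hdfs://host:9000/user/hive/warehouse/t1/ds=1",
                                  "/user/hive/warehouse"));   // -> /t1/ds=1
    }
  }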

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3554) Hive List Bucketing - Query logic

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3554:
---

Fix Version/s: 0.10.0

 Hive List Bucketing - Query logic
 -

 Key: HIVE-3554
 URL: https://issues.apache.org/jira/browse/HIVE-3554
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
 Fix For: 0.10.0

 Attachments: HIVE-3554.patch.1, HIVE-3554.patch.10, 
 HIVE-3554.patch.11, HIVE-3554.patch.12, HIVE-3554.patch.2, HIVE-3554.patch.3, 
 HIVE-3554.patch.4, HIVE-3554.patch.5, HIVE-3554.patch.7, HIVE-3554.patch.8, 
 HIVE-3554.patch.9


 This is part of the effort for the list bucketing feature: 
 https://cwiki.apache.org/Hive/listbucketing.html
 This patch includes:
 1. Query logic: hive chooses the right sub-directory instead of the partition 
 directory.
 2. The alter table grammar which is required to support the query logic.
 This patch doesn't include list bucketing DML. Main reasons:
 1. Risk: w/o DML, this patch won't impact any existing hive regression 
 features since it doesn't touch any data manipulation, so the risk is very low.
 2. Manageability: w/ DML, the patch gets bigger and harder to review; 
 removing DML makes it easy to review.
 We still disable the feature by default since DML is not in yet.
 DML will be in a follow-up patch. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-2732) Reduce Sink deduplication fails if the child reduce sink is followed by a join

2012-12-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2732:


  Component/s: Query Processor
Affects Version/s: 0.10.0
Fix Version/s: 0.10.0

 Reduce Sink deduplication fails if the child reduce sink is followed by a join
 --

 Key: HIVE-2732
 URL: https://issues.apache.org/jira/browse/HIVE-2732
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Navis
 Fix For: 0.10.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2732.D1809.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2732.D1809.2.patch


 set hive.optimize.reducededuplication=true;
 set hive.auto.convert.join=true;
 explain select * from (select * from src distribute by key sort by key) a 
 join src b on a.key = b.key;
 fails with the following exception
 java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.SelectOperator 
 cannot be cast to org.apache.hadoop.hive.ql.exec.ReduceSinkOperator
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:313)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.genMapJoinOpAndLocalWork(MapJoinProcessor.java:226)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.processCurrentTask(CommonJoinResolver.java:174)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.dispatch(CommonJoinResolver.java:287)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:68)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:72)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:7019)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7312)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:48)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 If hive.auto.convert.join is set to false, it produces an incorrect plan 
 where the two halves of the join are processed in two separate map reduce 
 tasks, and the reducers of these two tasks both contain the join operator 
 resulting in an exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3533) ZooKeeperHiveLockManager does not respect the option to keep locks alive even after the current session has closed

2012-12-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3533:
---

Assignee: Matt Martin

 ZooKeeperHiveLockManager does not respect the option to keep locks alive even 
 after the current session has closed
 --

 Key: HIVE-3533
 URL: https://issues.apache.org/jira/browse/HIVE-3533
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.9.0
Reporter: Matt Martin
Assignee: Matt Martin
Priority: Minor
 Attachments: HIVE-3533.1.patch.txt


 The HiveLockManager interface defines the following method:
 public List<HiveLock> lock(List<HiveLockObj> objs,
   boolean keepAlive) throws LockException;
 ZooKeeperHiveLockManager implements HiveLockManager, but the current 
 implementation of the lock method never actually references the keepAlive 
 parameter.  As a result, all of the locks acquired by the lock method are 
 ephemeral.  In other words, Zookeeper-based locks only exist as long as the 
 underlying Zookeeper session exists.  As soon as the Zookeeper session ends, 
 any Zookeeper-based locks are automatically released.
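
 A minimal sketch of how the keepAlive flag could be honored with ZooKeeper's 
 own primitives: ephemeral znodes disappear with the session, persistent ones 
 survive it. This illustrates the underlying API, not the actual 
 ZooKeeperHiveLockManager code:

  import org.apache.zookeeper.CreateMode;
  import org.apache.zookeeper.ZooDefs.Ids;
  import org.apache.zookeeper.ZooKeeper;

  public class LockNodes {
    public static String createLockNode(ZooKeeper zk, String lockPath, byte[] data,
                                        boolean keepAlive) throws Exception {
      CreateMode mode = keepAlive
          ? CreateMode.PERSISTENT_SEQUENTIAL    // survives the session, as keepAlive asks
          : CreateMode.EPHEMERAL_SEQUENTIAL;    // released automatically when the session ends
      return zk.create(lockPath, data, Ids.OPEN_ACL_UNSAFE, mode);
    }
  }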

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

