[jira] [Commented] (HIVE-6298) Add config flag to turn off fetching partition stats

2014-01-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880798#comment-13880798
 ] 

Gunther Hagleitner commented on HIVE-6298:
--

Thanks for taking up adding this particular flag as well, [~prasanth_j]. 
[~leftylev] thanks for the reminder. I wasn't sure that this particular flag 
needed to be documented (it's mostly for developers when debugging), but I think 
you're right. It should get added.

 Add config flag to turn off fetching partition stats
 

 Key: HIVE-6298
 URL: https://issues.apache.org/jira/browse/HIVE-6298
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6298.1.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HIVE-5883) Plan is deserialized more often than necessary on Tez (in container reuse case)

2014-01-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner resolved HIVE-5883.
--

Resolution: Fixed

This had been committed to branch before merge. Missed updating the ticket. My 
apologies.

 Plan is deserialized more often than necessary on Tez (in container reuse 
 case)
 ---

 Key: HIVE-5883
 URL: https://issues.apache.org/jira/browse/HIVE-5883
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: tez-branch

 Attachments: HIVE-5883.1.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HIVE-5861) Fix exception in multi insert statement on Tez

2014-01-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner resolved HIVE-5861.
--

Resolution: Fixed

This had been committed to branch before merge. Missed updating the ticket. My 
apologies.

 Fix exception in multi insert statement on Tez
 --

 Key: HIVE-5861
 URL: https://issues.apache.org/jira/browse/HIVE-5861
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: tez-branch

 Attachments: HIVE-5861.1.patch


 Multi insert statements that have multiple group by clauses aren't handled 
 properly in tez.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6218) Stats for row-count not getting updated with Tez insert + dbclass=counter

2014-01-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880805#comment-13880805
 ] 

Gunther Hagleitner commented on HIVE-6218:
--

Looked into this. Turns out that the problem is that we're running analyze as 
MR via Tez's yarn runner. That one drops the required counters on the floor. 
The best fix is probably to just do the stats computation directly in Tez. I'll 
get on that.

 Stats for row-count not getting updated with Tez insert + dbclass=counter
 -

 Key: HIVE-6218
 URL: https://issues.apache.org/jira/browse/HIVE-6218
 Project: Hive
  Issue Type: Bug
  Components: Statistics, Tez
Affects Versions: 0.13.0
Reporter: Gopal V
Assignee: Gunther Hagleitner
Priority: Minor

 Inserting data into hive with Tez,  the stats on row-count is not getting 
 updated when using the counter dbclass.
 To reproduce, run ANALYZE TABLE store_sales COMPUTE STATISTICS; with tez as 
 the execution engine.
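The counter-based stats flow involved here can be modeled roughly as follows. This is a hypothetical sketch of counter aggregation, not Hive's actual StatsPublisher/StatsAggregator API; all names are illustrative. It shows why dropped counters silently yield a zero row count rather than an error:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of counter-based row-count stats (dbclass=counter):
// tasks publish per-partition counters and the client aggregates them after
// the job. If the execution layer drops the counters on the floor, the
// aggregate silently becomes 0 and the stored row count is never updated.
public class CounterStats {
    private final Map<String, Long> counters = new HashMap<>();

    // A task increments its per-partition row counter.
    public void increment(String counterName, long rows) {
        counters.merge(counterName, rows, Long::sum);
    }

    // The client sums every counter under the row-count prefix.
    public long aggregateRowCount(String prefix) {
        long total = 0;
        for (Map.Entry<String, Long> e : counters.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                total += e.getValue();
            }
        }
        return total;
    }
}
```

If no counter with the expected prefix ever arrives, the aggregate is 0, which matches the observed symptom: stats simply fail to update instead of failing loudly.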



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6268) Network resource leak with HiveClientCache when using HCatInputFormat

2014-01-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880842#comment-13880842
 ] 

Lefty Leverenz commented on HIVE-6268:
--

Where should *hcatalog.hive.client.cache.disabled* be documented, besides the 
release notes?

AFAIK, none of the configuration properties in HCatConstants.java are mentioned 
in the wiki.  HCatConstants itself is only mentioned a couple of times in 
Notification for a New Partition:  
https://cwiki.apache.org/confluence/display/Hive/HCatalog+Notification#HCatalogNotification-NotificationforaNewPartition.

 Network resource leak with HiveClientCache when using HCatInputFormat
 -

 Key: HIVE-6268
 URL: https://issues.apache.org/jira/browse/HIVE-6268
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6268.patch


 HCatInputFormat has a cache feature that allows HCat to cache hive client 
 connections to the metastore, so as to not keep reinstantiating a new hive 
 server every single time. This uses a guava cache of hive clients, which only 
 evicts entries from cache on the next write, or by manually managing the 
 cache.
 So, in a single threaded case, where we reuse the hive client, the cache 
 works well, but in a massively multithreaded case, where each thread might 
 perform one action, and then is never used, there are no more writes to the 
 cache, and all the clients stay alive, thus keeping ports open.
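The eviction behavior described above can be illustrated with a small stdlib-only sketch. This is hypothetical: the real cache is a Guava cache, not a LinkedHashMap, and the class names below are made up, but the evict-only-on-write dynamic is the same:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the failure mode: a size-bounded cache whose eviction hook
// (removeEldestEntry) only runs inside put(). Reads never evict, and there
// is no background timer, so idle entries keep their connections open.
public class EvictOnWriteCache {
    private final Map<String, AutoCloseable> cache;

    public EvictOnWriteCache(final int maxEntries) {
        this.cache = new LinkedHashMap<String, AutoCloseable>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, AutoCloseable> eldest) {
                if (size() > maxEntries) {
                    // Eviction (and the close() that frees the port) only
                    // happens here, i.e. only when someone writes.
                    try { eldest.getValue().close(); } catch (Exception ignored) { }
                    return true;
                }
                return false;
            }
        };
    }

    public synchronized void put(String key, AutoCloseable client) {
        cache.put(key, client);
    }

    public synchronized int openClients() {
        return cache.size();
    }
}
```

A burst of threads that each write once and then go idle leaves every live entry, and its socket, open until some later write happens; time-based expiry plus an explicit cleanup call avoids this.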



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5783) Native Parquet Support in Hive

2014-01-24 Thread Justin Coffey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Coffey updated HIVE-5783:


Attachment: HIVE-5783.patch

The updated patch.  This fixes incorrect behavior when using HiveInputSplits.  
Regression tests have been added as a qtest (parquet_partitioned.q).

 Native Parquet Support in Hive
 --

 Key: HIVE-5783
 URL: https://issues.apache.org/jira/browse/HIVE-5783
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Justin Coffey
Assignee: Justin Coffey
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, 
 HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch


 Problem Statement:
 Hive would be easier to use if it had native Parquet support. Our 
 organization, Criteo, uses Hive extensively. Therefore we built the Parquet 
 Hive integration and would like to now contribute that integration to Hive.
 About Parquet:
 Parquet is a columnar storage format for Hadoop and integrates with many 
 Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
 Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
 Parquet integration.
 Changes Details:
 Parquet was built with dependency management in mind and therefore only a 
 single Parquet jar will be added as a dependency.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17262: HIVE-6246: Sign(a) UDF is not supported for decimal type

2014-01-24 Thread Xuefu Zhang


 On Jan. 23, 2014, 10:57 p.m., Mohammad Islam wrote:
  Overall looks good.
  Is it possible to add a .q test or append to an existing .q test?

Actually, a unit test such as the one provided is preferable. An additional .q 
test only prolongs the build process while not providing much value for this 
case.


 On Jan. 23, 2014, 10:57 p.m., Mohammad Islam wrote:
  ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFSign.java, line 31
  https://reviews.apache.org/r/17262/diff/1/?file=436440#file436440line31
 
  Minor issue: is testByte() a good name?

A result of CP. Will change it.


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17262/#review32677
---


On Jan. 23, 2014, 8:57 p.m., Xuefu Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17262/
 ---
 
 (Updated Jan. 23, 2014, 8:57 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6246
 https://issues.apache.org/jira/browse/HIVE-6246
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Please see the JIRA description. It's believed that this has never worked. 
 Added a method in the UDFSign class to handle the Decimal data type and make 
 it work. This method returns INT instead of DOUBLE to be in line with other 
 databases.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 729908a 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSign.java 0fef283 
   ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFSign.java PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/17262/diff/
 
 
 Testing
 ---
 
 Unit test is added.
 
 
 Thanks,
 
 Xuefu Zhang
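The behavior discussed in this review can be sketched as follows. This is illustrative only, not the actual UDFSign source; it delegates to BigDecimal.signum(), which already returns an int in {-1, 0, 1}, in line with SIGN() in other databases:

```java
import java.math.BigDecimal;

// Hypothetical sketch of a sign function for decimals that returns INT.
public class DecimalSign {
    public static Integer evaluate(BigDecimal d) {
        if (d == null) {
            return null; // SQL semantics: SIGN(NULL) is NULL
        }
        return d.signum(); // -1 for negative, 0 for zero, 1 for positive
    }
}
```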
 




Could anyone take a look at the testing issue described in HIVE-6293?

2014-01-24 Thread Xuefu Zhang
Thanks,
Xuefu


[jira] [Commented] (HIVE-6293) Not all minimr tests are executed or reported in precommit test run

2014-01-24 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881078#comment-13881078
 ] 

Brock Noland commented on HIVE-6293:


I just created 
https://cwiki.apache.org/confluence/display/Hive/MiniMR+and+PTest2 to explain 
this. I linked it from the Precommit page and the FAQ.

 Not all minimr tests are executed or reported in precommit test run
 ---

 Key: HIVE-6293
 URL: https://issues.apache.org/jira/browse/HIVE-6293
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.13.0
Reporter: Xuefu Zhang

 It seems that not all q file tests for minimr are executed or reported in the 
 pre-commit test run. Here is an example:
 http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/987/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/
 This might be due to ptest, because manually running TestMinimrCliDriver 
 seems to execute all tests. My last run shows 38 tests run, with 8 test 
 failures.
 This is identified in HIVE-5446. It needs to be fixed to have broader 
 coverage.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6293) Not all minimr tests are executed or reported in precommit test run

2014-01-24 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881083#comment-13881083
 ] 

Brock Noland commented on HIVE-6293:


If we wanted to improve this, we would either:

1. Make a change here 
https://github.com/apache/hive/blob/trunk/testutils/ptest2/src/main/java/org/apache/hive/ptest/execution/conf/TestParser.java#L99
 to parse the property which specifies the minimr tests out of the pom.xml, or
2. Move the minimr tests to a different directory.

I'd be in favor of the file move.

 Not all minimr tests are executed or reported in precommit test run
 ---

 Key: HIVE-6293
 URL: https://issues.apache.org/jira/browse/HIVE-6293
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.13.0
Reporter: Xuefu Zhang

 It seems that not all q file tests for minimr are executed or reported in the 
 pre-commit test run. Here is an example:
 http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/987/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/
 This might be due to ptest, because manually running TestMinimrCliDriver 
 seems to execute all tests. My last run shows 38 tests run, with 8 test 
 failures.
 This is identified in HIVE-5446. It needs to be fixed to have broader 
 coverage.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Could anyone take a look at the testing issue described in HIVE-6293?

2014-01-24 Thread Brock Noland
Done


On Fri, Jan 24, 2014 at 9:22 AM, Xuefu Zhang xzh...@cloudera.com wrote:

 Thanks,
 Xuefu




-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


[jira] [Updated] (HIVE-3872) MAP JOIN for VIEW throws NULL pointer exception error

2014-01-24 Thread Yoni Ben-Meshulam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yoni Ben-Meshulam updated HIVE-3872:


Summary: MAP JOIN  for VIEW throws NULL pointer exception error  (was: MAP 
JOIN  for VIEW thorws NULL pointer exception error)

 MAP JOIN  for VIEW throws NULL pointer exception error
 --

 Key: HIVE-3872
 URL: https://issues.apache.org/jira/browse/HIVE-3872
 Project: Hive
  Issue Type: Bug
  Components: Views
Reporter: Santosh Achhra
Assignee: Navis
Priority: Critical
  Labels: HINTS, MAPJOIN
 Fix For: 0.11.0

 Attachments: HIVE-3872.D7965.1.patch


 I have created a view  as shown below. 
 CREATE VIEW V1 AS
 select /*+ MAPJOIN(t1) ,MAPJOIN(t2)  */ t1.f1, t1.f2, t1.f3, t1.f4, t2.f1, 
 t2.f2, t2.f3 from TABLE1 t1 join TABLE t2 on ( t1.f2= t2.f2 and t1.f3 = t2.f3 
 and t1.f4 = t2.f4 ) group by t1.f1, t1.f2, t1.f3, t1.f4, t2.f1, t2.f2, t2.f3
 The view gets created successfully; however, when I execute the SQL mentioned 
 below, or any SQL on the view, I get a NullPointerException error:
 hive> select count (*) from V1;
 FAILED: NullPointerException null
 hive>
 Is there anything wrong with the view creation ?
 Next I created view without MAPJOIN hints 
 CREATE VIEW V1 AS
 select  t1.f1, t1.f2, t1.f3, t1.f4, t2.f1, t2.f2, t2.f3 from TABLE1 t1 join 
 TABLE t2 on ( t1.f2= t2.f2 and t1.f3 = t2.f3 and t1.f4 = t2.f4 ) group by 
 t1.f1, t1.f2, t1.f3, t1.f4, t2.f1, t2.f2, t2.f3
 Before executing the select SQL I execute set hive.auto.convert.join=true; 
 I am getting the below mentioned warnings:
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
 Continuing ...
 java.lang.RuntimeException: failed to evaluate: unbound=Class.new();
 Continuing ...
 And I see from the log that a total of 5 mapreduce jobs are started; however, 
 when I don't set auto.convert.join to true, I see only 3 mapreduce jobs 
 getting invoked.
 Total MapReduce jobs = 5
 Ended Job = 1116112419, job is filtered out (removed at runtime).
 Ended Job = -33256989, job is filtered out (removed at runtime).
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use 
 org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5843) Transaction manager for Hive

2014-01-24 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881201#comment-13881201
 ] 

Brock Noland commented on HIVE-5843:


Great to hear!

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-5843-src-only.patch, HIVE-5843.2.patch, 
 HIVE-5843.3-src.path, HIVE-5843.3.patch, HIVE-5843.4-src.patch, 
 HIVE-5843.4.patch, HIVE-5843.patch, HiveTransactionManagerDetailedDesign 
 (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17061: HIVE-5783 - Native Parquet Support in Hive

2014-01-24 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17061/
---

(Updated Jan. 24, 2014, 5:55 p.m.)


Review request for hive.


Changes
---

Latest patch rebased.


Bugs: HIVE-5783
https://issues.apache.org/jira/browse/HIVE-5783


Repository: hive-git


Description
---

Adds native Parquet support to Hive.


Diffs (updated)
-

  data/files/parquet_create.txt PRE-CREATION 
  data/files/parquet_partitioned.txt PRE-CREATION 
  pom.xml 41f5337 
  ql/pom.xml 7087a4c 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableGroupConverter.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableRecordConverter.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveGroupConverter.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/DeepParquetHiveMapInspector.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveArrayInspector.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/StandardParquetHiveMapInspector.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetByteInspector.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetPrimitiveInspectorFactory.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetShortInspector.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetStringInspector.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/writable/BigDecimalWritable.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/writable/BinaryWritable.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 13d0a56 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f83c15d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g c15c4b5 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 4147503 
  ql/src/java/parquet/hive/DeprecatedParquetInputFormat.java PRE-CREATION 
  ql/src/java/parquet/hive/DeprecatedParquetOutputFormat.java PRE-CREATION 
  ql/src/java/parquet/hive/MapredParquetInputFormat.java PRE-CREATION 
  ql/src/java/parquet/hive/MapredParquetOutputFormat.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java 
PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetInputFormat.java
 PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java
 PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestAbstractParquetMapInspector.java
 PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestDeepParquetHiveMapInspector.java
 PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetHiveArrayInspector.java
 PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestStandardParquetHiveMapInspector.java
 PRE-CREATION 
  ql/src/test/queries/clientpositive/parquet_create.q PRE-CREATION 
  ql/src/test/queries/clientpositive/parquet_partitioned.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_create.q.out PRE-CREATION 
  

[jira] [Updated] (HIVE-5783) Native Parquet Support in Hive

2014-01-24 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-5783:
---

Attachment: HIVE-5783.patch

 Native Parquet Support in Hive
 --

 Key: HIVE-5783
 URL: https://issues.apache.org/jira/browse/HIVE-5783
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Justin Coffey
Assignee: Justin Coffey
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, 
 HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch


 Problem Statement:
 Hive would be easier to use if it had native Parquet support. Our 
 organization, Criteo, uses Hive extensively. Therefore we built the Parquet 
 Hive integration and would like to now contribute that integration to Hive.
 About Parquet:
 Parquet is a columnar storage format for Hadoop and integrates with many 
 Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
 Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
 Parquet integration.
 Changes Details:
 Parquet was built with dependency management in mind and therefore only a 
 single Parquet jar will be added as a dependency.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive

2014-01-24 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881208#comment-13881208
 ] 

Brock Noland commented on HIVE-5783:


Uploaded the latest patch rebased on trunk.

 Native Parquet Support in Hive
 --

 Key: HIVE-5783
 URL: https://issues.apache.org/jira/browse/HIVE-5783
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Justin Coffey
Assignee: Justin Coffey
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, 
 HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch


 Problem Statement:
 Hive would be easier to use if it had native Parquet support. Our 
 organization, Criteo, uses Hive extensively. Therefore we built the Parquet 
 Hive integration and would like to now contribute that integration to Hive.
 About Parquet:
 Parquet is a columnar storage format for Hadoop and integrates with many 
 Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
 Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
 Parquet integration.
 Changes Details:
 Parquet was built with dependency management in mind and therefore only a 
 single Parquet jar will be added as a dependency.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6287) batchSize computation in Vectorized ORC reader can cause BufferUnderFlowException when PPD is enabled

2014-01-24 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6287:
--

Description: 
nextBatch() method that computes the batchSize is only aware of stripe 
boundaries. This will not work when predicate pushdown (PPD) in ORC is enabled, 
as PPD works at the row group level (a stripe contains multiple row groups). By 
default, the row group stride is 10,000 rows. When PPD is enabled, some row 
groups may get eliminated. After row group elimination, disk ranges are computed 
based on the selected row groups. If the batchSize computation is not aware of 
this, it will lead to a BufferUnderFlowException (reading beyond the disk 
range). The following scenario should illustrate it more clearly:

{code}
|----------------------------------- STRIPE 1 ---------------------------------|
|-- row grp 1 --|-- row grp 2 --|-- row grp 3 --|-- row grp 4 --|-- row grp 5 --|
|-------- diskrange 1 ---------|               |- diskrange 2 -|
                               ^
                           (marker)
{code}

diskrange 1 will have 20,000 rows and diskrange 2 will have 10,000 rows. Since 
nextBatch() was not aware of row groups, and hence of the disk ranges, it tries 
to read 1024 values from the end of diskrange 1 where it should only read 
20,000 % 1024 = 544 values. This will result in a BufferUnderFlowException.

To fix this, a marker is placed at the end of each range and batchSize is 
computed accordingly: {code}batchSize = 
Math.min(VectorizedRowBatch.DEFAULT_SIZE, (markerPosition - rowInStripe));{code}

  was:
nextBatch() method that computes the batchSize is only aware of stripe 
boundaries. This will not work when PPD in ORC is enabled, as PPD works at the 
row group level (a stripe contains multiple row groups). By default, the row 
group stride is 10,000 rows. When PPD is enabled, some row groups may get 
eliminated. After row group elimination, disk ranges are computed based on the 
selected row groups. If the batchSize computation is not aware of this, it will 
lead to a BufferUnderFlowException (reading beyond the disk range). The 
following scenario should illustrate it more clearly:

{code}
|----------------------------------- STRIPE 1 ---------------------------------|
|-- row grp 1 --|-- row grp 2 --|-- row grp 3 --|-- row grp 4 --|-- row grp 5 --|
|-------- diskrange 1 ---------|               |- diskrange 2 -|
                               ^
                           (marker)
{code}

diskrange 1 will have 20,000 rows and diskrange 2 will have 10,000 rows. Since 
nextBatch() was not aware of row groups, and hence of the disk ranges, it tries 
to read 1024 values from the end of diskrange 1 where it should only read 
20,000 % 1024 = 544 values. This will result in a BufferUnderFlowException.

To fix this, a marker is placed at the end of each range and batchSize is 
computed accordingly: {code}batchSize = 
Math.min(VectorizedRowBatch.DEFAULT_SIZE, (markerPosition - rowInStripe));{code}


 batchSize computation in Vectorized ORC reader can cause 
 BufferUnderFlowException when PPD is enabled
 -

 Key: HIVE-6287
 URL: https://issues.apache.org/jira/browse/HIVE-6287
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile, vectorization
 Attachments: HIVE-6287.1.patch, HIVE-6287.WIP.patch


 nextBatch() method that computes the batchSize is only aware of stripe 
 boundaries. This will not work when predicate pushdown (PPD) in ORC is 
 enabled, as PPD works at the row group level (a stripe contains multiple row 
 groups). By default, the row group stride is 10,000 rows. When PPD is enabled, 
 some row groups may get eliminated. After row group elimination, disk ranges 
 are computed based on the selected row groups. If the batchSize computation is 
 not aware of this, it will lead to a BufferUnderFlowException (reading beyond 
 the disk range). The following scenario should illustrate it more clearly:
 {code}
 |----------------------------------- STRIPE 1 ---------------------------------|
 |-- row grp 1 --|-- row grp 2 --|-- row grp 3 --|-- row grp 4 --|-- row grp 5 --|
 |-------- diskrange 1 ---------|               |- diskrange 2 -|
                                ^
                            (marker)
 {code}
 diskrange 1 will have 20,000 rows and diskrange 2 will have 10,000 rows. Since 
 nextBatch() was not aware of row groups, and hence of the disk ranges, it 
 tries to read 1024 values from the end of diskrange 1 where it should only 
 read 20,000 % 1024 = 544 values. This will result in a 
 BufferUnderFlowException.
 To fix this, a marker is placed at the end of each range and batchSize is 
 computed accordingly: {code}batchSize = 
 Math.min(VectorizedRowBatch.DEFAULT_SIZE, (markerPosition - 
 rowInStripe));{code}

[jira] [Commented] (HIVE-6287) batchSize computation in Vectorized ORC reader can cause BufferUnderFlowException when PPD is enabled

2014-01-24 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881225#comment-13881225
 ] 

Eric Hanson commented on HIVE-6287:
---

I think that by PPD you mean predicate pushdown. This was not immediately 
obvious to me. I edited it into the description. It's a good idea to define 
acronyms on first use. Thanks!

 batchSize computation in Vectorized ORC reader can cause 
 BufferUnderFlowException when PPD is enabled
 -

 Key: HIVE-6287
 URL: https://issues.apache.org/jira/browse/HIVE-6287
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile, vectorization
 Attachments: HIVE-6287.1.patch, HIVE-6287.WIP.patch


 nextBatch() method that computes the batchSize is only aware of stripe 
 boundaries. This will not work when predicate pushdown (PPD) in ORC is 
 enabled, as PPD works at the row group level (a stripe contains multiple row 
 groups). By default, the row group stride is 10,000 rows. When PPD is enabled, 
 some row groups may get eliminated. After row group elimination, disk ranges 
 are computed based on the selected row groups. If the batchSize computation is 
 not aware of this, it will lead to a BufferUnderFlowException (reading beyond 
 the disk range). The following scenario should illustrate it more clearly:
 {code}
 |----------------------------------- STRIPE 1 ---------------------------------|
 |-- row grp 1 --|-- row grp 2 --|-- row grp 3 --|-- row grp 4 --|-- row grp 5 --|
 |-------- diskrange 1 ---------|               |- diskrange 2 -|
                                ^
                            (marker)
 {code}
 diskrange 1 will have 20,000 rows and diskrange 2 will have 10,000 rows. Since 
 nextBatch() was not aware of row groups, and hence of the disk ranges, it 
 tries to read 1024 values from the end of diskrange 1 where it should only 
 read 20,000 % 1024 = 544 values. This will result in a 
 BufferUnderFlowException.
 To fix this, a marker is placed at the end of each range and batchSize is 
 computed accordingly: {code}batchSize = 
 Math.min(VectorizedRowBatch.DEFAULT_SIZE, (markerPosition - 
 rowInStripe));{code}
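The marker-bounded computation quoted above can be sketched as a standalone snippet. DEFAULT_SIZE, markerPosition, and rowInStripe mirror the names in the description; the surrounding class is hypothetical, not the ORC reader code:

```java
// Sketch of the marker-bounded batch size computation.
public class BatchSizer {
    static final int DEFAULT_SIZE = 1024; // VectorizedRowBatch.DEFAULT_SIZE

    // Cap the next batch at the marker so the reader never crosses the
    // boundary of the current disk range.
    static int nextBatchSize(long rowInStripe, long markerPosition) {
        return (int) Math.min(DEFAULT_SIZE, markerPosition - rowInStripe);
    }
}
```

For example, with a disk range of 20,000 rows, a reader positioned at row 19,456 (= 19 * 1024) gets a final batch of 544 rows instead of overrunning the range.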



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6298) Add config flag to turn off fetching partition stats

2014-01-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881244#comment-13881244
 ] 

Sergey Shelukhin commented on HIVE-6298:


lgtm

 Add config flag to turn off fetching partition stats
 

 Key: HIVE-6298
 URL: https://issues.apache.org/jira/browse/HIVE-6298
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6298.1.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6157) Fetching column stats slower than the 101 during rush hour

2014-01-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6157:
---

Status: Open  (was: Patch Available)

 Fetching column stats slower than the 101 during rush hour
 --

 Key: HIVE-6157
 URL: https://issues.apache.org/jira/browse/HIVE-6157
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Sergey Shelukhin
 Attachments: HIVE-6157.01.patch, HIVE-6157.01.patch, 
 HIVE-6157.03.patch, HIVE-6157.nogen.patch, HIVE-6157.nogen.patch, 
 HIVE-6157.prelim.patch


 hive.stats.fetch.column.stats controls whether the column stats for a table 
 are fetched during explain (in Tez: during query planning). On my setup (1 
 table, 4,000 partitions, 24 columns) the time spent in semantic analysis goes 
 from ~1 second to ~66 seconds when turning the flag on. 65 seconds spent 
 fetching column stats...
 The reason is probably that the APIs force you to make separate metastore 
 calls for each column in each partition. That's probably the first thing that 
 has to change. The question is if in addition to that we need to cache this 
 in the client or store the stats as a single blob in the database to further 
 cut down on the time. However, the way it stands right now column stats seem 
 unusable.
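A back-of-the-envelope model of why the per-column API hurts (purely illustrative arithmetic; the method names are made up, not the metastore API):

```java
// Compare the RPC volume of per-(partition, column) metastore calls against
// one batched call per partition that fetches all column stats at once.
public class StatsCallCount {
    static long perColumnCalls(int partitions, int columns) {
        return (long) partitions * columns; // one RPC per partition per column
    }

    static long batchedCalls(int partitions) {
        return partitions; // one RPC per partition, all columns at once
    }
}
```

For the setup in the description (4,000 partitions, 24 columns) that is 96,000 round trips versus 4,000, which would plausibly account for roughly 65 seconds at well under a millisecond per call.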



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6157) Fetching column stats slower than the 101 during rush hour

2014-01-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6157:
---

Attachment: HIVE-6157.03.patch

exact same patch, HiveQA won't run

 Fetching column stats slower than the 101 during rush hour
 --

 Key: HIVE-6157
 URL: https://issues.apache.org/jira/browse/HIVE-6157
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Sergey Shelukhin
 Attachments: HIVE-6157.01.patch, HIVE-6157.01.patch, 
 HIVE-6157.03.patch, HIVE-6157.03.patch, HIVE-6157.nogen.patch, 
 HIVE-6157.nogen.patch, HIVE-6157.prelim.patch


 hive.stats.fetch.column.stats controls whether the column stats for a table 
 are fetched during explain (in Tez: during query planning). On my setup (1 
 table 4000 partitions, 24 columns) the time spent in semantic analyze goes 
 from ~1 second to ~66 seconds when turning the flag on. 65 seconds spent 
 fetching column stats...
 The reason is probably that the APIs force you to make separate metastore 
 calls for each column in each partition. That's probably the first thing that 
 has to change. The question is if in addition to that we need to cache this 
 in the client or store the stats as a single blob in the database to further 
 cut down on the time. However, the way it stands right now column stats seem 
 unusable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6157) Fetching column stats slower than the 101 during rush hour

2014-01-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6157:
---

Status: Patch Available  (was: Open)

 Fetching column stats slower than the 101 during rush hour
 --

 Key: HIVE-6157
 URL: https://issues.apache.org/jira/browse/HIVE-6157
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Sergey Shelukhin
 Attachments: HIVE-6157.01.patch, HIVE-6157.01.patch, 
 HIVE-6157.03.patch, HIVE-6157.03.patch, HIVE-6157.nogen.patch, 
 HIVE-6157.nogen.patch, HIVE-6157.prelim.patch


 hive.stats.fetch.column.stats controls whether the column stats for a table 
 are fetched during explain (in Tez: during query planning). On my setup (1 
 table 4000 partitions, 24 columns) the time spent in semantic analyze goes 
 from ~1 second to ~66 seconds when turning the flag on. 65 seconds spent 
 fetching column stats...
 The reason is probably that the APIs force you to make separate metastore 
 calls for each column in each partition. That's probably the first thing that 
 has to change. The question is if in addition to that we need to cache this 
 in the client or store the stats as a single blob in the database to further 
 cut down on the time. However, the way it stands right now column stats seem 
 unusable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


precommit builds backed up

2014-01-24 Thread Eric Hanson (BIG DATA)

Precommit builds for patches submitted yesterday didn't start. Is there an ETA 
for when they will start, and do we need to take action to make sure the builds 
start?

Thanks,
Eric


Re: precommit builds backed up

2014-01-24 Thread Brock Noland
Hi,

Apache Jenkins (which starts our precommit jobs via
https://builds.apache.org/job/PreCommit-Admin/) is having issues.

However, Apache Jenkins can be bypassed with the following command:

curl 
"http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/buildWithParameters?token=$TOKEN&ISSUE_NUM=$JIRA"

For example:

curl 
"http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/buildWithParameters?token=X&ISSUE_NUM=5783"

I can share the token with committers privately.

Brock


On Fri, Jan 24, 2014 at 12:30 PM, Eric Hanson (BIG DATA) 
eric.n.han...@microsoft.com wrote:


 Precommit builds for patches submitted yesterday didn't start. Is there an
 ETA for when they will start, and do we need to take action to make sure
 the builds start?

 Thanks,
 Eric




-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


[jira] [Commented] (HIVE-6248) HCatReader/Writer should hide Hadoop and Hive classes

2014-01-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881293#comment-13881293
 ] 

Ashutosh Chauhan commented on HIVE-6248:


+1 
I think we should also create a wiki doc for it to document usage of this API. 
If an overall page for this already exists somewhere, we need to update 
it with the changes.

 HCatReader/Writer should hide Hadoop and Hive classes
 -

 Key: HIVE-6248
 URL: https://issues.apache.org/jira/browse/HIVE-6248
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6248.patch


 HCat's HCatReader and HCatWriter interfaces expose Hadoop classes 
 Configuration and InputSplit, as well as HCatInputSplit.  This exposes users 
 to changes over Hadoop or HCatalog versions.  It also makes it harder to some 
 day move this interface to use WebHCat, which we'd like to do.  The eventual 
 goal is for this interface to not require any other jars (no Hadoop, Hive, 
 etc.). As a first step toward this, the references to Hadoop and HCat classes in 
 the interface should be hidden.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6287) batchSize computation in Vectorized ORC reader can cause BufferUnderFlowException when PPD is enabled

2014-01-24 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-6287:
-

Attachment: HIVE-6287.2.patch

Reuploading the same patch for HIVE QA to pick up.

Thanks [~ehans] for the update to description.

 batchSize computation in Vectorized ORC reader can cause 
 BufferUnderFlowException when PPD is enabled
 -

 Key: HIVE-6287
 URL: https://issues.apache.org/jira/browse/HIVE-6287
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile, vectorization
 Attachments: HIVE-6287.1.patch, HIVE-6287.2.patch, HIVE-6287.WIP.patch


 The nextBatch() method that computes the batchSize is only aware of stripe 
 boundaries. This will not work when predicate pushdown (PPD) in ORC is 
 enabled, as PPD works at the row group level (a stripe contains multiple row 
 groups). By default, the row group stride is 10,000 rows. When PPD is enabled, 
 some row groups may get eliminated. After row group elimination, disk ranges 
 are computed based on the selected row groups. If the batchSize computation is 
 not aware of this, it will lead to a BufferUnderFlowException (reading beyond 
 the disk range). The following scenario illustrates this more clearly:
 {code}
 |------------------------------------ STRIPE 1 --------------------------------|
 |-- row grp 1 --|-- row grp 2 --|-- row grp 3 --|-- row grp 4 --|-- row grp 5 --|
 |--------- diskrange 1 --------|                |- diskrange 2 -|
                                ^
                             (marker)
 {code}
 diskrange 1 will have 20,000 rows and diskrange 2 will have 10,000 rows. Since 
 nextBatch() was not aware of row groups, and hence of the disk ranges, it tried 
 to read 1024 values at the end of diskrange 1 where it should only read 20000 
 % 1024 = 544 values. This results in a BufferUnderFlowException.
 To fix this, a marker is placed at the end of each range and the batchSize is 
 computed accordingly: {code}batchSize = 
 Math.min(VectorizedRowBatch.DEFAULT_SIZE, (markerPosition - 
 rowInStripe));{code}
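A minimal sketch of the marker-aware computation described above (names simplified; the real logic lives in Hive's vectorized ORC reader):

```java
public class BatchSizeSketch {
    static final int DEFAULT_SIZE = 1024; // VectorizedRowBatch.DEFAULT_SIZE

    // markerPosition: row offset of the end of the current disk range;
    // rowInStripe: rows already consumed within the stripe.
    static long nextBatchSize(long markerPosition, long rowInStripe) {
        // Never read past the marker, i.e. past the selected disk range.
        return Math.min(DEFAULT_SIZE, markerPosition - rowInStripe);
    }

    public static void main(String[] args) {
        // diskrange 1 holds 20,000 rows: the last batch is 20000 % 1024 = 544.
        System.out.println(nextBatchSize(20000, 19456)); // 544
        System.out.println(nextBatchSize(20000, 0));     // 1024
    }
}
```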



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-2558) Timestamp comparisons don't work

2014-01-24 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881331#comment-13881331
 ] 

Jason Dere commented on HIVE-2558:
--

This has changed as of HIVE-5204 - the comparison is done as a string.
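A quick illustration of why comparing timestamps as strings is order-preserving for the fixed-width 'yyyy-MM-dd HH:mm:ss' format (a sketch, not Hive's actual code path):

```java
public class TimestampStringOrder {
    // Fixed-width timestamp strings sort lexicographically in the same
    // order as the instants they denote, so a string comparison is safe.
    static boolean before(String a, String b) {
        return a.compareTo(b) < 0;
    }

    public static void main(String[] args) {
        System.out.println(before("1970-01-01 00:00:01", "1970-01-01 00:00:02")); // true
        System.out.println(before("1970-01-02 00:00:00", "1970-01-01 23:59:59")); // false
    }
}
```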

 Timestamp comparisons don't work
 

 Key: HIVE-2558
 URL: https://issues.apache.org/jira/browse/HIVE-2558
 Project: Hive
  Issue Type: Bug
Reporter: Robert Surówka

 I may be missing something, but:
 After performing:
 create table rrt (r timestamp);
 insert into table rrt select '1970-01-01 00:00:01' from src limit 1;
 The following queries give undesirable results:
 select * from rrt where r in ('1970-01-01 00:00:01');
 select * from rrt where r in (0); 
 select * from rrt where r = 0; 
 select * from rrt where r = '1970-01-01 00:00:01';
 At least for the first two, the reason may be the lack of timestamp in 
 numericTypes Map from FunctionRegistry.java (591) . Yet whether we really 
 want to have a linear hierarchy of primitive types in the end, is another 
 question.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HIVE-2558) Timestamp comparisons don't work

2014-01-24 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere resolved HIVE-2558.
--

   Resolution: Fixed
Fix Version/s: 0.12.0

 Timestamp comparisons don't work
 

 Key: HIVE-2558
 URL: https://issues.apache.org/jira/browse/HIVE-2558
 Project: Hive
  Issue Type: Bug
Reporter: Robert Surówka
 Fix For: 0.12.0


 I may be missing something, but:
 After performing:
 create table rrt (r timestamp);
 insert into table rrt select '1970-01-01 00:00:01' from src limit 1;
 The following queries give undesirable results:
 select * from rrt where r in ('1970-01-01 00:00:01');
 select * from rrt where r in (0); 
 select * from rrt where r = 0; 
 select * from rrt where r = '1970-01-01 00:00:01';
 At least for the first two, the reason may be the lack of timestamp in 
 numericTypes Map from FunctionRegistry.java (591) . Yet whether we really 
 want to have a linear hierarchy of primitive types in the end, is another 
 question.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: VOTE: Remove phabricator instructions from hive-development guide (wiki), officially only support Apache's review board.

2014-01-24 Thread Brock Noland
Good call, I made a very basic fix and noted on the Phabricator page that
it's no longer used.

Brock


On Thu, Jan 23, 2014 at 3:32 PM, Lefty Leverenz leftylever...@gmail.comwrote:

 The wiki still has Phabricator information, with nothing about Apache's
 review board.

 How to Contribute:  Review
 Process
 https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-ReviewProcess
 
 


 See Phabricator
 https://cwiki.apache.org/confluence/display/Hive/PhabricatorCodeReview
 for
  instructions.
 
 - Use Hadoop's code review checklist
 http://wiki.apache.org/hadoop/CodeReviewChecklist as
 a rough guide when doing reviews.
 
 
 - In JIRA, use 'Submit Patch' to get your review request into the
 queue.
 
 
 - If a committer requests changes, set the issue status to 'Resume
 Progress', then once you're ready, submit an updated patch with
 necessary
 fixes and then request another round of review with 'Submit Patch'
 again.
 
 
 - Once your patch is accepted, be sure to upload a final version which
 grants rights to the ASF.
 
 
 Would someone please update this section with the appropriate link to
 review board instructions?  I'm a review board newbie (or wanna-be) but
 can't even get registration to work so I won't volunteer.  Should the link
 go to http://www.reviewboard.org/docs/manual/1.7/?


 -- Lefty


 On Sat, Oct 19, 2013 at 12:10 PM, Prasad Mujumdar pras...@cloudera.com
 wrote:

  +1 (non-binding)
  It's good to use a common review tool, and one that has no third-party
  dependency.
 
  thanks
  Prasad
 
 
 
 
  On Fri, Oct 18, 2013 at 1:59 PM, Ashutosh Chauhan hashut...@apache.org
  wrote:
 
   0
  
    IMO the phabricator interface is better than review board, but the threat of
    losing comments and patches is also real.
    Actually, we already lost some in a few cases; ironically it was on RB. Try to
    read the very first review request posted on HIVE-1634.
  
   Ashutosh
  
  
   On Thu, Oct 17, 2013 at 6:55 PM, Yin Huai huaiyin@gmail.com
 wrote:
  
+1
   
   
On Thu, Oct 17, 2013 at 5:51 PM, Gunther Hagleitner 
ghagleit...@hortonworks.com wrote:
   
 +1

 Thanks,
 Gunther.


 On Thu, Oct 17, 2013 at 2:18 PM, Owen O'Malley omal...@apache.org
 
wrote:

  Ed,
I didn't remember being unable to see revisions without a
 login.
   That
 is
  uncool. I'll change my vote to +1.
 
  -- Owen
 
 
  On Wed, Oct 16, 2013 at 9:08 PM, Edward Capriolo 
edlinuxg...@gmail.com
  wrote:
 
   Owen,
   In your issues:
   https://issues.apache.org/jira/browse/HIVE-5567
  
   When I click this link:
   REVISION DETAIL
   https://reviews.facebook.net/D13479
  
   I am prompted for a password.
  
  
  
   On Wed, Oct 16, 2013 at 11:16 PM, Owen O'Malley 
 owen.omal...@gmail.com
   wrote:
  
-0
   
I like phabricator, but it is a pain to setup. It doesn't
   require a
 fb
account, but clearly it isn't managed or supported by Apache.
   
-- Owen
   
 On Oct 16, 2013, at 17:32, Edward Capriolo 
edlinuxg...@gmail.com
wrote:

 Our wiki has instructions for posting to phabricator for
 code
  reviews.

 
   https://cwiki.apache.org/confluence/display/Hive/PhabricatorCodeReview

 Phabricator now requires an external facebook account to
  review
   patches,
 and we have no technical support contact where phabricator
 is
 hosted.
   It
 also seems like some of the phabricator features are no
  longer
  working.

 Apache has a review board system many people are already
  using.

   https://reviews.apache.org/account/login/?next_page=/dashboard/

 This vote is to remove the phabricator instructions from
 the
wiki.
  The
 instructions will reference review board and that will be
 the
only
   system
 that Hive supports for patch review process.

 +1 is a vote for removing the phabricator instructions from
  the
 wiki.

 Thank you,
 Edward
   
  
 

 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or
   entity
to
 which it is addressed and may contain information that is
  confidential,
 privileged and exempt from disclosure under applicable law. If the
   reader
 of this message is not the intended recipient, you are hereby
  notified
that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you
 have
 received this communication in error, please contact the sender
immediately
 and delete it from your system. Thank You.

   
  
 




-- 
Apache MRUnit - Unit testing MapReduce - 

[jira] [Updated] (HIVE-6226) It should be possible to get hadoop, hive, and pig version being used by WebHCat

2014-01-24 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-6226:
-

Status: Open  (was: Patch Available)

 It should be possible to get hadoop, hive, and pig version being used by 
 WebHCat
 

 Key: HIVE-6226
 URL: https://issues.apache.org/jira/browse/HIVE-6226
 Project: Hive
  Issue Type: New Feature
  Components: WebHCat
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6226.patch


 Calling /version on WebHCat tells the caller the protocol version, but there 
 is no way to determine the versions of software being run by the applications 
 that WebHCat spawns.  
 I propose to add an end-point: /version/\{module\} where module could be pig, 
 hive, or hadoop.  The response will then be:
 {code}
 {
   module : _module_name_,
   version : _version_string_
 }
 {code}
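A tiny sketch of rendering the proposed response body (the module/version values below are made up for illustration):

```java
public class ModuleVersionJson {
    // Renders the proposed /version/{module} response described above.
    static String toJson(String module, String version) {
        return "{\n  \"module\" : \"" + module + "\",\n"
             + "  \"version\" : \"" + version + "\"\n}";
    }

    public static void main(String[] args) {
        System.out.println(toJson("pig", "0.12.0"));
    }
}
```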



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5843) Transaction manager for Hive

2014-01-24 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-5843:
-

Status: Patch Available  (was: Open)

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-5843-src-only.patch, HIVE-5843.2.patch, 
 HIVE-5843.3-src.path, HIVE-5843.3.patch, HIVE-5843.4-src.patch, 
 HIVE-5843.4.patch, HIVE-5843.patch, HiveTransactionManagerDetailedDesign 
 (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6226) It should be possible to get hadoop, hive, and pig version being used by WebHCat

2014-01-24 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-6226:
-

Status: Patch Available  (was: Open)

 It should be possible to get hadoop, hive, and pig version being used by 
 WebHCat
 

 Key: HIVE-6226
 URL: https://issues.apache.org/jira/browse/HIVE-6226
 Project: Hive
  Issue Type: New Feature
  Components: WebHCat
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6226.2.patch, HIVE-6226.patch


 Calling /version on WebHCat tells the caller the protocol version, but there 
 is no way to determine the versions of software being run by the applications 
 that WebHCat spawns.  
 I propose to add an end-point: /version/\{module\} where module could be pig, 
 hive, or hadoop.  The response will then be:
 {code}
 {
   module : _module_name_,
   version : _version_string_
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6226) It should be possible to get hadoop, hive, and pig version being used by WebHCat

2014-01-24 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-6226:
-

Attachment: HIVE-6226.2.patch

New version of the patch that has three separate URLs, per Eugene's feedback.

I don't think the .q test that failed on the last run of the build bot is 
related, as I didn't make any changes anywhere close to that code.

 It should be possible to get hadoop, hive, and pig version being used by 
 WebHCat
 

 Key: HIVE-6226
 URL: https://issues.apache.org/jira/browse/HIVE-6226
 Project: Hive
  Issue Type: New Feature
  Components: WebHCat
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6226.2.patch, HIVE-6226.patch


 Calling /version on WebHCat tells the caller the protocol version, but there 
 is no way to determine the versions of software being run by the applications 
 that WebHCat spawns.  
 I propose to add an end-point: /version/\{module\} where module could be pig, 
 hive, or hadoop.  The response will then be:
 {code}
 {
   module : _module_name_,
   version : _version_string_
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6304) Update HCatReader/Writer docs to reflect recent changes

2014-01-24 Thread Alan Gates (JIRA)
Alan Gates created HIVE-6304:


 Summary: Update HCatReader/Writer docs to reflect recent changes
 Key: HIVE-6304
 URL: https://issues.apache.org/jira/browse/HIVE-6304
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0


HIVE-6248 made changes to the HCatReader and HCatWriter classes. Those changes 
need to be reflected in the [HCatReader/Writer 
docs|https://cwiki.apache.org/confluence/display/Hive/HCatalog+ReaderWriter]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6248) HCatReader/Writer should hide Hadoop and Hive classes

2014-01-24 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-6248:
-

  Resolution: Fixed
Release Note: HCatReader and HCatWriter API changed.  See 
https://cwiki.apache.org/confluence/display/Hive/HCatalog+ReaderWriter for 
details.
  Status: Resolved  (was: Patch Available)

Patch committed.

Thanks Ashutosh for the review.  I agree we need to update the docs.  Filed 
HIVE-6304 to track those changes.

 HCatReader/Writer should hide Hadoop and Hive classes
 -

 Key: HIVE-6248
 URL: https://issues.apache.org/jira/browse/HIVE-6248
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6248.patch


 HCat's HCatReader and HCatWriter interfaces expose Hadoop classes 
 Configuration and InputSplit, as well as HCatInputSplit.  This exposes users 
 to changes over Hadoop or HCatalog versions.  It also makes it harder to some 
 day move this interface to use WebHCat, which we'd like to do.  The eventual 
 goal is for this interface to not require any other jars (no Hadoop, Hive, 
 etc.). As a first step toward this, the references to Hadoop and HCat classes in 
 the interface should be hidden.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive

2014-01-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881418#comment-13881418
 ] 

Hive QA commented on HIVE-5783:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12625077/HIVE-5783.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 4981 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1003/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1003/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12625077

 Native Parquet Support in Hive
 --

 Key: HIVE-5783
 URL: https://issues.apache.org/jira/browse/HIVE-5783
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Justin Coffey
Assignee: Justin Coffey
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, 
 HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch


 Problem Statement:
 Hive would be easier to use if it had native Parquet support. Our 
 organization, Criteo, uses Hive extensively. Therefore we built the Parquet 
 Hive integration and would like to now contribute that integration to Hive.
 About Parquet:
 Parquet is a columnar storage format for Hadoop and integrates with many 
 Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
 Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
 Parquet integration.
 Changes Details:
 Parquet was built with dependency management in mind and therefore only a 
 single Parquet jar will be added as a dependency.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive

2014-01-24 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881423#comment-13881423
 ] 

Brock Noland commented on HIVE-5728:


This patch is now causing everyone's precommit tests to show failing tests. 
Agreed with Navis, let's make sure that all commits have precommit tests.

 Make ORC InputFormat/OutputFormat usable outside Hive
 -

 Key: HIVE-5728
 URL: https://issues.apache.org/jira/browse/HIVE-5728
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.13.0

 Attachments: HIVE-5728-1.patch, HIVE-5728-2.patch, HIVE-5728-3.patch, 
 HIVE-5728-4.patch, HIVE-5728-5.patch, HIVE-5728-6.patch, HIVE-5728-7.patch, 
 HIVE-5728-8.patch


 ORC InputFormat/OutputFormat is currently not usable outside Hive. There are 
 several issues that need to be solved:
 1. Several classes are not public, e.g. OrcStruct
 2. There is no InputFormat/OutputFormat for the new API (some tools such as Pig 
 need the new API)
 3. There is no way to push WriteOption to the OutputFormat outside Hive



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive

2014-01-24 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881424#comment-13881424
 ] 

Brock Noland commented on HIVE-5783:


Those tests failed due to HIVE-5728 (which was committed without testing) and 
will be fixed via HIVE-6302.

 Native Parquet Support in Hive
 --

 Key: HIVE-5783
 URL: https://issues.apache.org/jira/browse/HIVE-5783
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Justin Coffey
Assignee: Justin Coffey
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, 
 HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch


 Problem Statement:
 Hive would be easier to use if it had native Parquet support. Our 
 organization, Criteo, uses Hive extensively. Therefore we built the Parquet 
 Hive integration and would like to now contribute that integration to Hive.
 About Parquet:
 Parquet is a columnar storage format for Hadoop and integrates with many 
 Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
 Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
 Parquet integration.
 Changes Details:
 Parquet was built with dependency management in mind and therefore only a 
 single Parquet jar will be added as a dependency.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6302) annotate_stats_*.q are failing on trunk

2014-01-24 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881425#comment-13881425
 ] 

Brock Noland commented on HIVE-6302:


FYI, Apache Jenkins is having trouble, so I kicked off the build for this manually.

 annotate_stats_*.q are failing on trunk
 ---

 Key: HIVE-6302
 URL: https://issues.apache.org/jira/browse/HIVE-6302
 Project: Hive
  Issue Type: Task
  Components: Tests
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6302.1.patch.txt


 I'm checking it out



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2014-01-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881435#comment-13881435
 ] 

Thejas M Nair commented on HIVE-6013:
-

Harish,
The doc seems to suggest that quoted identifiers are supported only for column 
names, but it seems to work when I try it with a user name in a grant statement. Is 
that not expected to work? E.g.:
{code}
 grant all on x to user `user-qa`; 
show grant user `user-qa` on table x; 
{code}


 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
 QuotedIdentifier.html


 Hive's current behavior on Quoted Identifiers is different from the normal 
 interpretation. Quoted Identifier (using backticks) has a special 
 interpretation for Select expressions(as Regular Expressions). Have 
 documented current behavior and proposed a solution in attached doc.
 Summary of solution is:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6293) Not all minimr tests are executed or reported in precommit test run

2014-01-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881444#comment-13881444
 ] 

Thejas M Nair commented on HIVE-6293:
-

+1 for the file move.


 Not all minimr tests are executed or reported in precommit test run
 ---

 Key: HIVE-6293
 URL: https://issues.apache.org/jira/browse/HIVE-6293
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.13.0
Reporter: Xuefu Zhang

 It seems that not all q file tests for minimr are executed or reported in the 
 pre-commit test run. Here is an example:
 http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/987/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/
 This might be due to ptest, because manually running TestMinimrCliDriver 
 seems to execute all tests. My last run shows 38 tests run, with 8 test 
 failures.
 This is identified in HIVE-5446. It needs to be fixed to have broader 
 coverage.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6257) Add more unit tests for high-precision Decimal128 arithmetic

2014-01-24 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6257:
--

Attachment: hive-6257.03.patch

Finished random tests for high-precision decimal add, subtract, multiply, and 
divide on Decimal128.

 Add more unit tests for high-precision Decimal128 arithmetic
 

 Key: HIVE-6257
 URL: https://issues.apache.org/jira/browse/HIVE-6257
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
Priority: Minor
 Attachments: HIVE-6257.02.patch, hive-6257.03.patch


 Add more unit tests for high-precision Decimal128 arithmetic, with arguments 
 close to or at 38 digit limit. Consider some random stress tests for broader 
 coverage. Coverage is pretty good now (after HIVE-6243) for precision up to 
 about 18. This is to go beyond that.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6257) Add more unit tests for high-precision Decimal128 arithmetic

2014-01-24 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6257:
--

Attachment: (was: hive-6257.03.patch)

 Add more unit tests for high-precision Decimal128 arithmetic
 

 Key: HIVE-6257
 URL: https://issues.apache.org/jira/browse/HIVE-6257
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
Priority: Minor
 Attachments: HIVE-6257.02.patch, HIVE-6257.03.patch


 Add more unit tests for high-precision Decimal128 arithmetic, with arguments 
 close to or at the 38-digit limit. Consider some random stress tests for 
 broader coverage. Coverage is pretty good now (after HIVE-6243) for precision 
 up to about 18. This is to go beyond that.



--


[jira] [Commented] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2014-01-24 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881467#comment-13881467
 ] 

Harish Butani commented on HIVE-6013:
-

At the language level, any identifier can be quoted; the change was made at 
the Lexer level.
Special characters are probably OK in usernames. I didn't want to make this 
assertion because there may be code in the metadata layer that doesn't like 
special characters. For example, we know this is an issue for table names. 
If you don't anticipate an issue, we can say that special characters are 
supported for usernames. Hopefully this can be extended to role/privilege 
names also.

 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
 QuotedIdentifier.html


 Hive's current behavior on Quoted Identifiers is different from the normal 
 interpretation. Quoted Identifier (using backticks) has a special 
 interpretation for Select expressions(as Regular Expressions). Have 
 documented current behavior and proposed a solution in attached doc.
 Summary of solution is:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.



--


[jira] [Updated] (HIVE-6257) Add more unit tests for high-precision Decimal128 arithmetic

2014-01-24 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6257:
--

Attachment: HIVE-6257.03.patch

fixed name capitalization

 Add more unit tests for high-precision Decimal128 arithmetic
 

 Key: HIVE-6257
 URL: https://issues.apache.org/jira/browse/HIVE-6257
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
Priority: Minor
 Attachments: HIVE-6257.02.patch, HIVE-6257.03.patch


 Add more unit tests for high-precision Decimal128 arithmetic, with arguments 
 close to or at the 38-digit limit. Consider some random stress tests for 
 broader coverage. Coverage is pretty good now (after HIVE-6243) for precision 
 up to about 18. This is to go beyond that.



--


[jira] [Created] (HIVE-6305) test use of quoted identifiers in user/role names

2014-01-24 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-6305:
---

 Summary: test use of quoted identifiers in user/role names
 Key: HIVE-6305
 URL: https://issues.apache.org/jira/browse/HIVE-6305
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Thejas M Nair


Tests need to be added to verify that quoted identifiers can be used with user 
and role names.

For example - 
{code}
 grant all on x to user `user-qa`; 
show grant user `user-qa` on table x; 
{code}




--


[jira] [Commented] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2014-01-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881486#comment-13881486
 ] 

Thejas M Nair commented on HIVE-6013:
-

I have created a jira to test it out with user and role names: HIVE-6305. I 
think it should work fine.
[~leftylev], is the documentation for this jira already part of any wiki page? 
I had trouble finding it.



 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
 QuotedIdentifier.html


 Hive's current behavior on Quoted Identifiers is different from the normal 
 interpretation. Quoted Identifier (using backticks) has a special 
 interpretation for Select expressions(as Regular Expressions). Have 
 documented current behavior and proposed a solution in attached doc.
 Summary of solution is:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.



--


Re: Review Request 17005: Vectorized reader for DECIMAL datatype for ORC format.

2014-01-24 Thread Jitendra Pandey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17005/
---

(Updated Jan. 24, 2014, 10:28 p.m.)


Review request for hive and Eric Hanson.


Bugs: HIVE-6178
https://issues.apache.org/jira/browse/HIVE-6178


Repository: hive-git


Description
---

vectorized reader for DECIMAL datatype for ORC format.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 3939511 
  common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java 
d71ebb3 
  common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java 
fbb2aa0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java 
23564bb 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java 0df82b9 
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestVectorizedORCReader.java 
0d5b7ff 

Diff: https://reviews.apache.org/r/17005/diff/


Testing
---


Thanks,

Jitendra Pandey



[jira] [Updated] (HIVE-6178) Implement vectorized reader for DECIMAL datatype for ORC format.

2014-01-24 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6178:
---

Attachment: HIVE-6178.2.patch

 Implement vectorized reader for DECIMAL datatype for ORC format.
 

 Key: HIVE-6178
 URL: https://issues.apache.org/jira/browse/HIVE-6178
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6178.1.patch, HIVE-6178.2.patch


 Implement vectorized reader for DECIMAL datatype for ORC format.



--


[jira] [Commented] (HIVE-6178) Implement vectorized reader for DECIMAL datatype for ORC format.

2014-01-24 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881492#comment-13881492
 ] 

Jitendra Nath Pandey commented on HIVE-6178:


Uploaded a new patch addressing a few comments. I have also posted an 
explanation for handling variable scales.

 Implement vectorized reader for DECIMAL datatype for ORC format.
 

 Key: HIVE-6178
 URL: https://issues.apache.org/jira/browse/HIVE-6178
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6178.1.patch, HIVE-6178.2.patch


 Implement vectorized reader for DECIMAL datatype for ORC format.



--


[jira] [Updated] (HIVE-6178) Implement vectorized reader for DECIMAL datatype for ORC format.

2014-01-24 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6178:
---

Status: Open  (was: Patch Available)

 Implement vectorized reader for DECIMAL datatype for ORC format.
 

 Key: HIVE-6178
 URL: https://issues.apache.org/jira/browse/HIVE-6178
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6178.1.patch, HIVE-6178.2.patch


 Implement vectorized reader for DECIMAL datatype for ORC format.



--


[jira] [Commented] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2014-01-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881495#comment-13881495
 ] 

Lefty Leverenz commented on HIVE-6013:
--

Not in the wiki yet.  I'll bump its priority to the top.

 Supporting Quoted Identifiers in Column Names
 -

 Key: HIVE-6013
 URL: https://issues.apache.org/jira/browse/HIVE-6013
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.13.0

 Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
 HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
 QuotedIdentifier.html


 Hive's current behavior on Quoted Identifiers is different from the normal 
 interpretation. Quoted Identifier (using backticks) has a special 
 interpretation for Select expressions(as Regular Expressions). Have 
 documented current behavior and proposed a solution in attached doc.
 Summary of solution is:
 - Introduce 'standard' quoted identifiers for columns only. 
 - At the language level this is turned on by a flag.
 - At the metadata level we relax the constraint on column names.



--


[jira] [Commented] (HIVE-6243) error in high-precision division for Decimal128

2014-01-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881502#comment-13881502
 ] 

Hive QA commented on HIVE-6243:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624837/HIVE-6243.02.patch

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 4952 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hive.hcatalog.hbase.TestHiveHBaseStorageHandler.testTableCreateDrop
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1006/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1006/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624837

 error in high-precision division for Decimal128
 ---

 Key: HIVE-6243
 URL: https://issues.apache.org/jira/browse/HIVE-6243
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: HIVE-6243.01.patch, HIVE-6243.02.patch, 
 divide-error.01.patch


 a = 213474114411690
 b = 5062120663
 a * b = 1080631725579042037750470
 (a * b) / b == 
   actual:   251599050984618
   expected: 213474114411690
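
The reference arithmetic in the report can be double-checked with Python's 
arbitrary-precision integers (a minimal sanity sketch only; it validates the 
expected values, not the Decimal128 code path itself):

```python
# Sanity check of the HIVE-6243 reference values using Python's
# arbitrary-precision ints (this does not exercise Decimal128 itself).
a = 213474114411690
b = 5062120663

product = a * b
assert product == 1080631725579042037750470

# A correct high-precision multiply/divide must round-trip:
# the buggy Decimal128 division returned 251599050984618 instead.
assert product // b == a
```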



--


[jira] [Commented] (HIVE-6226) It should be possible to get hadoop, hive, and pig version being used by WebHCat

2014-01-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881521#comment-13881521
 ] 

Eugene Koifman commented on HIVE-6226:
--

+1

 It should be possible to get hadoop, hive, and pig version being used by 
 WebHCat
 

 Key: HIVE-6226
 URL: https://issues.apache.org/jira/browse/HIVE-6226
 Project: Hive
  Issue Type: New Feature
  Components: WebHCat
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: HIVE-6226.2.patch, HIVE-6226.patch


 Calling /version on WebHCat tells the caller the protocol version, but there 
 is no way to determine the versions of software being run by the applications 
 that WebHCat spawns.  
 I propose to add an end-point: /version/\{module\} where module could be pig, 
 hive, or hadoop.  The response will then be:
 {code}
 {
   module : _module_name_,
   version : _version_string_
 }
 {code}
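
A client for the proposed endpoint could look like the sketch below 
(assumptions: the usual WebHCat base path /templeton/v1, and response keys 
exactly as in the proposal above; the committed implementation may differ):

```python
import json

# Sketch of a client for the proposed /version/{module} endpoint.
# Assumptions: the usual WebHCat base path /templeton/v1 and the exact
# response keys from the proposal above; adjust for the final patch.
def version_url(server, module):
    """Build the URL for module 'pig', 'hive', or 'hadoop'."""
    return "%s/templeton/v1/version/%s" % (server.rstrip("/"), module)

def parse_version(body):
    """Extract (module, version) from the proposed JSON response."""
    doc = json.loads(body)
    return doc["module"], doc["version"]

print(version_url("http://webhcat-host:50111", "hive"))
```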



--


[jira] [Updated] (HIVE-4764) support the authentication modes for thrift over http transport for HS2

2014-01-24 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-4764:
---

Description: 
This subtask covers support for following functionality for thrift over http 
transport in hive server2 
- Support for LDAP, Kerberos, and custom authentication modes

  was:
This subtask covers support for following functionality for thrift over http 
transport in hive server2 
- Support for LDAP, Kerberos, and custom authentication modes
- Support for doAs functionality.


 support the authentication modes for thrift over http transport for HS2
 ---

 Key: HIVE-4764
 URL: https://issues.apache.org/jira/browse/HIVE-4764
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


 This subtask covers support for following functionality for thrift over http 
 transport in hive server2 
 - Support for LDAP, Kerberos, and custom authentication modes



--


[jira] [Created] (HIVE-6306) HiveServer2 running in http mode should support for doAs functionality

2014-01-24 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-6306:
--

 Summary: HiveServer2 running in http mode should support for doAs 
functionality
 Key: HIVE-6306
 URL: https://issues.apache.org/jira/browse/HIVE-6306
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


Currently http mode does not support doAs.



--


[jira] [Updated] (HIVE-6306) HiveServer2 running in http mode should support for doAs functionality

2014-01-24 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6306:
---

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-4752

 HiveServer2 running in http mode should support for doAs functionality
 --

 Key: HIVE-6306
 URL: https://issues.apache.org/jira/browse/HIVE-6306
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


 Currently http mode does not support doAs.



--


[jira] [Resolved] (HIVE-4026) Add HTTP support to HiveServer2

2014-01-24 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta resolved HIVE-4026.


Resolution: Duplicate

Duplicate of HIVE-4752

 Add HTTP support to HiveServer2
 ---

 Key: HIVE-4026
 URL: https://issues.apache.org/jira/browse/HIVE-4026
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Server Infrastructure
Reporter: Mike Liddell
Assignee: Mike Liddell
 Attachments: HIVE-4026.patch


 Add HTTP as an endpoint option for HiveServer2.  This supports environments 
 where TCP connectivity is inconvenient or impossible.  One key scenario is 
 beeline connecting to an HTTPS proxy/gateway which forwards to HS2-HTTP.
 Because the proxy/gateway scenario is the most secure, support for HS2 HTTPS 
 has not been added.
 new behavior:
   new configuration options to use HTTP server mode rather than TCP
   http mode uses Jetty server/servlets
   new beeline client URI parsing and HTTP transport behavior.
 Usage:
 (1) TCP-mode:  beeline !connect jdbc:hive2://server:port/ user password
 (2) HTTP-mode: beeline !connect jdbc:hive2:http://server:port/path/../ 
 user password
 (3) via HTTPS proxy: beeline !connect 
 jdbc:hive2:https://server:port/path/../ user password
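
The three usage forms differ only in the URL scheme, which selects the 
transport. A rough sketch of that dispatch (illustrative only; the real 
parsing lives in the Java JDBC driver and its rules may differ):

```python
# Illustrative dispatch on the jdbc:hive2 URL forms listed above
# (not the actual beeline/JDBC parser, which is written in Java).
def transport_mode(jdbc_url):
    if jdbc_url.startswith("jdbc:hive2:https://"):
        return "https"  # HTTPS proxy/gateway in front of HS2-HTTP
    if jdbc_url.startswith("jdbc:hive2:http://"):
        return "http"   # HS2 running its Jetty HTTP endpoint
    if jdbc_url.startswith("jdbc:hive2://"):
        return "tcp"    # classic Thrift-over-TCP mode
    raise ValueError("not a hive2 JDBC URL: %s" % jdbc_url)

print(transport_mode("jdbc:hive2:http://server:10001/cliservice"))
```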



--


[jira] [Assigned] (HIVE-4752) Add support for hs2 api to use thrift over http

2014-01-24 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-4752:
--

Assignee: Vaibhav Gumashta  (was: Thejas M Nair)

 Add support for hs2 api to use thrift over http
 ---

 Key: HIVE-4752
 URL: https://issues.apache.org/jira/browse/HIVE-4752
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


 Hiveserver2 acts as a service on the cluster for external applications. One 
 way to implement access control for services on a hadoop cluster is to have 
 a gateway server authorize service requests before forwarding them to the 
 server. The [knox project | http://wiki.apache.org/incubator/knox] has taken 
 this approach to simplify cluster security management.
 Other services on the hadoop cluster, such as webhdfs and webhcat, already 
 use HTTP. Having hiveserver2 also support thrift over http transport will 
 enable securing hiveserver2 using the same approach.



--


[jira] [Updated] (HIVE-6234) Implement fast vectorized InputFormat extension for text files

2014-01-24 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6234:
--

Attachment: Vectorized Text InputFormat design.pdf
Vectorized Text InputFormat design.docx

Attaching version 01 of design specification for this feature.

 Implement fast vectorized InputFormat extension for text files
 --

 Key: HIVE-6234
 URL: https://issues.apache.org/jira/browse/HIVE-6234
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: Vectorized Text InputFormat design.docx, Vectorized Text 
 InputFormat design.pdf


 Implement support for vectorized scan input of text files (plain text with 
 configurable record and field separators). This should work for CSV files, 
 tab delimited files, etc. 
 The goal is to provide high-performance reading of these files using 
 vectorized scans, and also to do it as an extension of existing Hive. Then, 
 if vectorized query is enabled, existing tables based on text files will be 
 able to benefit immediately without the need to use a different input format. 
 After upgrading to new Hive bits that support this, faster, vectorized 
 processing over existing text tables should just work, when vectorization is 
 enabled.
 Another goal is to go beyond a simple layering of a vectorized row batch 
 iterator on top of the existing row iterator. It should be possible to, 
 say, read a chunk of data into a byte buffer (several thousand or even a 
 million rows), and then read data from it into vectorized row batches 
 directly. Object creation should be minimized to save allocation time and GC 
 overhead. If it is possible to save CPU for values like dates and numbers by 
 caching the translation from string to the final data type, that should 
 ideally be implemented.
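
The last idea, caching the string-to-typed-value translation, can be sketched 
with a memoized parser (hypothetical illustration; the attached design 
document, not this sketch, defines the actual approach):

```python
from datetime import date
from functools import lru_cache

# Sketch of the string->value caching idea: text files often repeat the
# same date/number strings, so memoizing the conversion saves CPU when
# filling column vectors. Hypothetical; not taken from the design doc.
@lru_cache(maxsize=4096)
def parse_date(field):
    y, m, d = field.split("-")
    return date(int(y), int(m), int(d))

batch = ["2014-01-24", "2014-01-24", "2014-01-25"]
parsed = [parse_date(f) for f in batch]
print(parse_date.cache_info().hits)  # repeated strings hit the cache
```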



--


[jira] [Commented] (HIVE-6263) Avoid sending input files multiple times on Tez

2014-01-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881554#comment-13881554
 ] 

Hive QA commented on HIVE-6263:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624934/HIVE-6263.3.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 4949 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1007/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1007/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624934

 Avoid sending input files multiple times on Tez
 ---

 Key: HIVE-6263
 URL: https://issues.apache.org/jira/browse/HIVE-6263
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6263.1.patch, HIVE-6263.2.patch, HIVE-6263.3.patch


 Input paths can be reconstructed from the plan. No need to send them in the 
 job conf as well.



--


[jira] [Created] (HIVE-6307) completed field description should be clarified.

2014-01-24 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-6307:


 Summary: completed field description should be clarified.
 Key: HIVE-6307
 URL: https://issues.apache.org/jira/browse/HIVE-6307
 Project: Hive
  Issue Type: Bug
  Components: Documentation, WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman


https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Job explains 
the fields in the JSON document which contains status information for a 
particular job.

The completed field is set once the process that the Launcher task launched 
returns.  For example, if a user submitted an M/R job via WebHCat, completed 
will be set to done once the hadoop jar command that the Launcher invokes 
exits.  If one is looking for the status of the job itself, the fields inside 
the status element should be consulted (e.g. jobComplete or runState).
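
In other words, a status consumer should read the job-level fields rather 
than the launcher-level flag. A hedged sketch (field names are assumed from 
the wiki description above; actual response shapes may differ):

```python
import json

# Sketch of reading WebHCat job status per the description above.
# Field names (completed, status.jobComplete, status.runState) are
# assumed from the wiki text; verify against an actual response.
def job_state(body):
    doc = json.loads(body)
    status = doc.get("status", {})
    return {
        "launcher_done": doc.get("completed") == "done",
        "job_complete": status.get("jobComplete"),
        "run_state": status.get("runState"),
    }

sample = '{"completed": "done", "status": {"jobComplete": false, "runState": 1}}'
print(job_state(sample))
```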



--


[jira] [Updated] (HIVE-6307) completed field description should be clarified.

2014-01-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6307:
-

Description: 
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Job explains 
the fields in the JSON document which contains status information for a 
particular job.

The completed field is set once the process that the Launcher task launched 
returns.  For example, if a user submitted an M/R job via WebHCat, completed 
will be set to done once the hadoop jar command that the Launcher invokes 
exits.  If one is looking for the status of the job itself, the fields inside 
the status element should be consulted (e.g. jobComplete or runState).

The current doc is not clear and may mislead WebHCat users into thinking 
completed is a property of the job itself.

  was:
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Job explains 
the fields in the JSON document which contains status information for a 
particular job.

completed field is set once the process that the Launcher task launched 
returns.  For example, if user submitted a M/R job via webhcat, completed 
will be set to done once the hadoop jar command that the Launcher invokes 
exits.  If one is looking for status of the job itself, the fields inside 
status element should be consulted (e.g. jobComplete or runState)


 completed field description should be clarified.
 --

 Key: HIVE-6307
 URL: https://issues.apache.org/jira/browse/HIVE-6307
 Project: Hive
  Issue Type: Bug
  Components: Documentation, WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman

 https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Job 
 explains the fields in the JSON document which contains status information 
 for a particular job.
 completed field is set once the process that the Launcher task launched 
 returns.  For example, if user submitted a M/R job via webhcat, completed 
 will be set to done once the hadoop jar command that the Launcher invokes 
 exits.  If one is looking for status of the job itself, the fields inside 
 status element should be consulted (e.g. jobComplete or runState).
 Current doc is not clear and may mislead WebHCat user into thinking 
 completed is a property of the job itself.



--


[jira] [Commented] (HIVE-6293) Not all minimr tests are executed or reported in precommit test run

2014-01-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881598#comment-13881598
 ] 

Xuefu Zhang commented on HIVE-6293:
---

Even if we copy/move the miniMR tests to a different directory, we still need 
to modify ptest so that it knows where to pick them up, right? What sort of 
change is required?

I have temporarily modified the test property file to be consistent with the 
pom file w.r.t. miniMR tests. Let's see how many of them are going to fail.

 Not all minimr tests are executed or reported in precommit test run
 ---

 Key: HIVE-6293
 URL: https://issues.apache.org/jira/browse/HIVE-6293
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.13.0
Reporter: Xuefu Zhang

 It seems that not all q file tests for minimr are executed or reported in the 
 pre-commit test run. Here is an example:
 http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/987/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/
  This might be due to ptest, because manually running TestMinimrCliDriver 
  seems to execute all tests. My last run shows 38 tests run, with 8 test 
  failures.
 This is identified in HIVE-5446. It needs to be fixed to have broader 
 coverage.



--


Re: Review Request 17005: Vectorized reader for DECIMAL datatype for ORC format.

2014-01-24 Thread Jitendra Pandey


 On Jan. 20, 2014, 6:56 p.m., Eric Hanson wrote:
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java, line 
  1119
  https://reviews.apache.org/r/17005/diff/1/?file=425358#file425358line1119
 
  It seems odd that we're reading from a scaleStream because the scale 
  should be the same for every value in the column. Is this necessary?
  
 

  The orc decimal encoding currently supports arbitrary scale. Although hive 
doesn't allow variable scales, the orc format allows them. We should have 
another decimal encoding in hive optimized for a specific precision and scale, 
and correspondingly we will have to add an additional vectorized reader for 
decimal as well. 
  Since the reader is part of the ORC code, I think it should also allow 
reading variable scales as per the encoding. If that doesn't match the scale 
in the schema, then we definitely have a data/schema corruption issue.


 On Jan. 20, 2014, 6:56 p.m., Eric Hanson wrote:
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java, line 
  1123
  https://reviews.apache.org/r/17005/diff/1/?file=425358#file425358line1123
 
  If any scale values are different inside a single DecimalColumnVector, 
  I think that could cause unpredictable or wrong results. 
  
  Later operations on DecimalColumnVector take the scale from the 
  columnvector sometimes, not each individual object.

If the scale in the data is different from the scale assumed in the vectorized 
reader, we would still have erroneous results. 
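
The hazard being discussed can be illustrated with plain integers standing in 
for Decimal128 unscaled values: a DecimalColumnVector carries a single scale, 
so a value stored with a different scale must be rescaled when it is loaded 
(a hypothetical sketch, not the Hive code):

```python
# Illustration of the per-vector-scale issue with plain ints standing
# in for Decimal128 unscaled values (hypothetical; not the Hive code).
def rescale(unscaled, value_scale, vector_scale):
    """Rescale a value's unscaled digits to the column vector's scale."""
    diff = vector_scale - value_scale
    if diff >= 0:
        return unscaled * 10 ** diff
    # Dropping digits: plain truncation here; a real reader would need
    # to round or treat the mismatch as data/schema corruption.
    return unscaled // 10 ** (-diff)

# 1.5 stored as (unscaled=15, scale=1) loaded into a scale-3 vector:
print(rescale(15, 1, 3))
```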


- Jitendra


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17005/#review32299
---


On Jan. 24, 2014, 10:28 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17005/
 ---
 
 (Updated Jan. 24, 2014, 10:28 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6178
 https://issues.apache.org/jira/browse/HIVE-6178
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 vectorized reader for DECIMAL datatype for ORC format.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 3939511 
   common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java 
 d71ebb3 
   common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java 
 fbb2aa0 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java 
 23564bb 
   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java 0df82b9 
   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestVectorizedORCReader.java 
 0d5b7ff 
 
 Diff: https://reviews.apache.org/r/17005/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Commented] (HIVE-6302) annotate_stats_*.q are failing on trunk

2014-01-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881611#comment-13881611
 ] 

Gunther Hagleitner commented on HIVE-6302:
--

[~navis] [~owen.omalley] wanted to keep HiveConf out of ORC to limit the 
dependencies to hadoop core. (That way you can use ORC outside hive, e.g. in 
pig.) It'd be good to have him weigh in at least. 

I think we should revert HIVE-5728 until we have a fix so that the trunk is 
healthy again.

 annotate_stats_*.q are failing on trunk
 ---

 Key: HIVE-6302
 URL: https://issues.apache.org/jira/browse/HIVE-6302
 Project: Hive
  Issue Type: Task
  Components: Tests
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6302.1.patch.txt


 I'm checking it out



--


[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive

2014-01-24 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881624#comment-13881624
 ] 

Carl Steinbach commented on HIVE-5783:
--

I noticed that this SerDe doesn't support several of Hive's types: binary, 
timestamp, date, and probably a couple of others as well. If there are other 
known limitations, it would be helpful to list them.

 Native Parquet Support in Hive
 --

 Key: HIVE-5783
 URL: https://issues.apache.org/jira/browse/HIVE-5783
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Justin Coffey
Assignee: Justin Coffey
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, 
 HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch


 Problem Statement:
 Hive would be easier to use if it had native Parquet support. Our 
 organization, Criteo, uses Hive extensively. Therefore we built the Parquet 
 Hive integration and would like to now contribute that integration to Hive.
 About Parquet:
 Parquet is a columnar storage format for Hadoop and integrates with many 
 Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
 Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
 Parquet integration.
 Changes Details:
 Parquet was built with dependency management in mind and therefore only a 
 single Parquet jar will be added as a dependency.





[jira] [Reopened] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive

2014-01-24 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reopened HIVE-5728:
-


 Make ORC InputFormat/OutputFormat usable outside Hive
 -

 Key: HIVE-5728
 URL: https://issues.apache.org/jira/browse/HIVE-5728
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.13.0

 Attachments: HIVE-5728-1.patch, HIVE-5728-2.patch, HIVE-5728-3.patch, 
 HIVE-5728-4.patch, HIVE-5728-5.patch, HIVE-5728-6.patch, HIVE-5728-7.patch, 
 HIVE-5728-8.patch


 ORC InputFormat/OutputFormat is currently not usable outside Hive. There are 
 several issues to solve:
 1. Several classes are not public, e.g. OrcStruct
 2. There is no InputFormat/OutputFormat for the new API (some tools such as 
 Pig need the new API)
 3. There is no way to push WriteOption to OutputFormat outside Hive





[jira] [Commented] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive

2014-01-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881628#comment-13881628
 ] 

Thejas M Nair commented on HIVE-5728:
-

Reverted this patch and re-opened the JIRA, as we need a different fix than the 
one in HIVE-6302.


 Make ORC InputFormat/OutputFormat usable outside Hive
 -

 Key: HIVE-5728
 URL: https://issues.apache.org/jira/browse/HIVE-5728
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.13.0

 Attachments: HIVE-5728-1.patch, HIVE-5728-2.patch, HIVE-5728-3.patch, 
 HIVE-5728-4.patch, HIVE-5728-5.patch, HIVE-5728-6.patch, HIVE-5728-7.patch, 
 HIVE-5728-8.patch


 ORC InputFormat/OutputFormat is currently not usable outside Hive. There are 
 several issues to solve:
 1. Several classes are not public, e.g. OrcStruct
 2. There is no InputFormat/OutputFormat for the new API (some tools such as 
 Pig need the new API)
 3. There is no way to push WriteOption to OutputFormat outside Hive





[jira] [Created] (HIVE-6308) COLUMNS_V2 Metastore table not populated for tables created without an explicit column list.

2014-01-24 Thread Alexander Behm (JIRA)
Alexander Behm created HIVE-6308:


 Summary: COLUMNS_V2 Metastore table not populated for tables 
created without an explicit column list.
 Key: HIVE-6308
 URL: https://issues.apache.org/jira/browse/HIVE-6308
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.10.0
Reporter: Alexander Behm


Consider this example table:

CREATE TABLE avro_test
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED as INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
'avro.schema.url'='file:///path/to/the/schema/test_serializer.avsc');

When I try to run ANALYZE TABLE to compute column stats on any of the columns, 
I get:

org.apache.hadoop.hive.ql.metadata.HiveException: 
NoSuchObjectException(message:Column o_orderpriority for which stats gathering 
is requested doesn't exist.)
at 
org.apache.hadoop.hive.ql.metadata.Hive.updateTableColumnStatistics(Hive.java:2280)
at 
org.apache.hadoop.hive.ql.exec.ColumnStatsTask.persistTableStats(ColumnStatsTask.java:331)
at 
org.apache.hadoop.hive.ql.exec.ColumnStatsTask.execute(ColumnStatsTask.java:343)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

The root cause appears to be that the COLUMNS_V2 table in the Metastore isn't 
populated properly during the table creation.
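One possible workaround (an untested sketch; the single column here is taken from the error message above and stands in for the full Avro schema) is to declare the column list explicitly in the DDL so the metastore records it, rather than relying on the Avro schema alone:

```sql
-- Hypothetical workaround sketch: declare columns explicitly so COLUMNS_V2
-- is populated. Names/types must match the Avro schema
-- (test_serializer.avsc); only one column is shown here.
CREATE TABLE avro_test (
  o_orderpriority STRING
)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
'avro.schema.url'='file:///path/to/the/schema/test_serializer.avsc');
```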





[jira] [Commented] (HIVE-5181) RetryingRawStore should not retry on logical failures (e.g. from commit)

2014-01-24 Thread Jayesh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881629#comment-13881629
 ] 

Jayesh commented on HIVE-5181:
--

Just wanted to report my finding, as I see this has been committed to Hive 0.13.

The patch provided here did not resolve the same issue I am having in Hive 0.12.
The problem looks like it lies in the way the DB connection pool handles 
transactions; switching to DBCP (HIVE-4996.patch) fixed it, which in some way 
confirmed that.

Thanks
Jay

 RetryingRawStore should not retry on logical failures (e.g. from commit)
 

 Key: HIVE-5181
 URL: https://issues.apache.org/jira/browse/HIVE-5181
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Prasad Mujumdar
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5181.1.patch, HIVE-5181.3.patch


 RetryingRawStore retries calls. Some methods (e.g. drop_table_core in 
 HiveMetaStore) explicitly call openTransaction and commitTransaction on 
 RawStore.
 When the commit call fails due to some real issue, it is retried, and instead 
 of the real cause of the failure one gets a bogus exception about the 
 transaction open count.
 It doesn't make sense to retry logical errors, especially not from 
 commitTransaction.
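The distinction drawn above — retry transient faults, but surface logical errors immediately — can be sketched generically (the exception names and retry policy here are hypothetical, not Hive's actual classes):

```python
class TransientError(Exception):
    """Recoverable fault, e.g. a dropped connection; safe to retry."""

class LogicalError(Exception):
    """Programming/logic fault, e.g. commit with no open transaction."""

def with_retries(op, max_attempts=3):
    """Retry only transient failures; re-raise logical errors at once."""
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except LogicalError:
            raise                  # retrying cannot fix a logic bug
        except TransientError:
            if attempt == max_attempts:
                raise              # give up after the last attempt
```

Retrying a commit that failed for a logical reason only buries the real cause under a later "unbalanced transaction" error, which is the behavior the ticket complains about.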





[jira] [Commented] (HIVE-6157) Fetching column stats slower than the 101 during rush hour

2014-01-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881630#comment-13881630
 ] 

Gunther Hagleitner commented on HIVE-6157:
--

[~prasanth_j] do you want to also take a look? This would affect the stats 
annotation too.

 Fetching column stats slower than the 101 during rush hour
 --

 Key: HIVE-6157
 URL: https://issues.apache.org/jira/browse/HIVE-6157
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Sergey Shelukhin
 Attachments: HIVE-6157.01.patch, HIVE-6157.01.patch, 
 HIVE-6157.03.patch, HIVE-6157.03.patch, HIVE-6157.nogen.patch, 
 HIVE-6157.nogen.patch, HIVE-6157.prelim.patch


 hive.stats.fetch.column.stats controls whether the column stats for a table 
 are fetched during explain (in Tez: during query planning). On my setup (1 
 table 4000 partitions, 24 columns) the time spent in semantic analyze goes 
 from ~1 second to ~66 seconds when turning the flag on. 65 seconds spent 
 fetching column stats...
 The reason is probably that the APIs force you to make separate metastore 
 calls for each column in each partition. That's probably the first thing that 
 has to change. The question is if in addition to that we need to cache this 
 in the client or store the stats as a single blob in the database to further 
 cut down on the time. However, the way it stands right now column stats seem 
 unusable.
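Until the fetch path is faster, the flag named above can be toggled per session; a minimal sketch (the default value may differ across versions):

```sql
-- Disable per-column stats fetching during planning (session scope).
SET hive.stats.fetch.column.stats=false;
```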





[jira] [Commented] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction

2014-01-24 Thread Jayesh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881631#comment-13881631
 ] 

Jayesh commented on HIVE-4996:
--

Just want to add my experience from testing this bug:
- It looks like BoneCP has a bug.
- Switching to DBCP resolved this issue for me.


 unbalanced calls to openTransaction/commitTransaction
 -

 Key: HIVE-4996
 URL: https://issues.apache.org/jira/browse/HIVE-4996
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.10.0, 0.11.0, 0.12.0
 Environment: hiveserver1  Java HotSpot(TM) 64-Bit Server VM (build 
 20.6-b01, mixed mode)
Reporter: wangfeng
Priority: Critical
  Labels: hive, metastore
 Attachments: hive-4996.path

   Original Estimate: 504h
  Remaining Estimate: 504h

 When we used HiveServer1 based on Hive 0.10.0, we found the following 
 exception thrown:
 FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: 
 commitTransaction was called but openTransactionCalls = 0. This probably 
 indicates that there are unbalanced calls to openTransaction/commitTransaction)
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 Any help would be appreciated.
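The reported condition — commit called more times than open — can be illustrated with a simple nesting counter (a deliberate simplification of the real RawStore bookkeeping, not Hive's implementation):

```python
class TxnTracker:
    """Minimal sketch of nested-transaction bookkeeping."""

    def __init__(self):
        self.open_calls = 0

    def open_transaction(self):
        self.open_calls += 1

    def commit_transaction(self):
        if self.open_calls == 0:
            # the condition behind the reported MetaException
            raise RuntimeError("commitTransaction was called but "
                               "openTransactionCalls = 0")
        self.open_calls -= 1
```

Under this model the error fires whenever a code path (or a connection-pool/retry interaction, as suspected here) commits without a matching open.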





[jira] [Commented] (HIVE-6293) Not all minimr tests are executed or reported in precommit test run

2014-01-24 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881632#comment-13881632
 ] 

Brock Noland commented on HIVE-6293:


Ptest2 has configurable directories so no change would be required. 

 Not all minimr tests are executed or reported in precommit test run
 ---

 Key: HIVE-6293
 URL: https://issues.apache.org/jira/browse/HIVE-6293
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.13.0
Reporter: Xuefu Zhang

 It seems that not all q file tests for minimr are executed or reported in the 
 pre-commit test run. Here is an example:
 http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/987/testReport/org.apache.hadoop.hive.cli/TestMinimrCliDriver/
 This might be due to ptest, because manually running TestMinimrCliDriver 
 seems to execute all tests. My last run shows 38 tests run, with 8 test 
 failures.
 This is identified in HIVE-5446. It needs to be fixed to have broader 
 coverage.





Re: Review Request 17002: alter table partition column throws NPE in authorization

2014-01-24 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17002/#review32774
---

Ship it!


Ship It!

- Thejas Nair


On Jan. 24, 2014, 1:02 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17002/
 ---
 
 (Updated Jan. 24, 2014, 1:02 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6205
 https://issues.apache.org/jira/browse/HIVE-6205
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 alter table alter_coltype partition column (dt int);
 {noformat}
 2014-01-15 15:53:40,364 ERROR ql.Driver (SessionState.java:printError(457)) - 
 FAILED: NullPointerException null
 java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:599)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:479)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:340)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:996)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1039)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:932)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:922)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
 {noformat}
 
 Operation for TOK_ALTERTABLE_ALTERPARTS is not defined.
 
 
 Diffs
 -
 
   
 hcatalog/core/src/main/java/org/apache/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java
  1d4a9a1 
   
 hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java
  97973db 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 5af1ec6 
   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
 0e2d555 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g c15c4b5 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
 835a654 
   ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java fe88a50 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveOperationType.java
  e20b183 
   ql/src/test/results/clientnegative/alter_partition_coltype_2columns.q.out 
 e1f9a27 
   ql/src/test/results/clientpositive/alter_partition_coltype.q.out 685bf88 
 
 Diff: https://reviews.apache.org/r/17002/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Navis Ryu
 




[jira] [Commented] (HIVE-6205) alter table partition column throws NPE in authorization

2014-01-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881644#comment-13881644
 ] 

Thejas M Nair commented on HIVE-6205:
-

+1

 alter table partition column throws NPE in authorization
 --

 Key: HIVE-6205
 URL: https://issues.apache.org/jira/browse/HIVE-6205
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-6205.1.patch.txt, HIVE-6205.2.patch.txt, 
 HIVE-6205.3.patch.txt, HIVE-6205.4.patch.txt, HIVE-6205.5.patch.txt


 alter table alter_coltype partition column (dt int);
 {noformat}
 2014-01-15 15:53:40,364 ERROR ql.Driver (SessionState.java:printError(457)) - 
 FAILED: NullPointerException null
 java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:599)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:479)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:340)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:996)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1039)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:932)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:922)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
 {noformat}
 Operation for TOK_ALTERTABLE_ALTERPARTS is not defined.





[jira] [Commented] (HIVE-6298) Add config flag to turn off fetching partition stats

2014-01-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881651#comment-13881651
 ] 

Hive QA commented on HIVE-6298:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624956/HIVE-6298.1.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 4963 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_import_exported_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_load_hdfs_file_with_space_in_the_name
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_file_with_header_footer_negative
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1008/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1008/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624956

 Add config flag to turn off fetching partition stats
 

 Key: HIVE-6298
 URL: https://issues.apache.org/jira/browse/HIVE-6298
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6298.1.patch








[jira] [Updated] (HIVE-6261) Update metadata.q.out file for tez (after change to .q file)

2014-01-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6261:
-

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

 Update metadata.q.out file for tez (after change to .q file)
 

 Key: HIVE-6261
 URL: https://issues.apache.org/jira/browse/HIVE-6261
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.13.0

 Attachments: HIVE-6261.1.patch








[jira] [Updated] (HIVE-6260) Compress plan when sending via RPC (Tez)

2014-01-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6260:
-

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for the review Vikram!

 Compress plan when sending via RPC (Tez)
 

 Key: HIVE-6260
 URL: https://issues.apache.org/jira/browse/HIVE-6260
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.13.0

 Attachments: HIVE-6260.1.patch


 When sending the plan via RPC, it's helpful to compress the payload. That 
 way more plans fit within the size limit.
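The size win is easy to see with a generic sketch (zlib and base64 here merely stand in for whatever codec and encoding the patch actually uses, which this ticket does not specify):

```python
import base64
import zlib

def encode_plan(plan_bytes: bytes) -> str:
    """Compress, then base64-encode, a serialized plan for transport."""
    return base64.b64encode(zlib.compress(plan_bytes, 9)).decode("ascii")

def decode_plan(payload: str) -> bytes:
    """Reverse of encode_plan: base64-decode, then decompress."""
    return zlib.decompress(base64.b64decode(payload))

# A repetitive plan-like payload compresses well, so the encoded form
# stays under a fixed transport size limit far longer than the raw bytes.
plan = b"MapOperator->FilterOperator->SelectOperator;" * 200
payload = encode_plan(plan)
assert decode_plan(payload) == plan
assert len(payload) < len(plan)
```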





[jira] [Commented] (HIVE-6263) Avoid sending input files multiple times on Tez

2014-01-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881659#comment-13881659
 ] 

Gunther Hagleitner commented on HIVE-6263:
--

Test failures are unrelated.

 Avoid sending input files multiple times on Tez
 ---

 Key: HIVE-6263
 URL: https://issues.apache.org/jira/browse/HIVE-6263
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6263.1.patch, HIVE-6263.2.patch, HIVE-6263.3.patch


 Input paths can be reconstructed from the plan. No need to send them in the 
 job conf as well.





[jira] [Commented] (HIVE-6263) Avoid sending input files multiple times on Tez

2014-01-24 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881664#comment-13881664
 ] 

Gunther Hagleitner commented on HIVE-6263:
--

Committed to trunk. Thanks for the review Vikram!

 Avoid sending input files multiple times on Tez
 ---

 Key: HIVE-6263
 URL: https://issues.apache.org/jira/browse/HIVE-6263
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.13.0

 Attachments: HIVE-6263.1.patch, HIVE-6263.2.patch, HIVE-6263.3.patch


 Input paths can be reconstructed from the plan. No need to send them in the 
 job conf as well.





[jira] [Updated] (HIVE-6263) Avoid sending input files multiple times on Tez

2014-01-24 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6263:
-

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

 Avoid sending input files multiple times on Tez
 ---

 Key: HIVE-6263
 URL: https://issues.apache.org/jira/browse/HIVE-6263
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.13.0

 Attachments: HIVE-6263.1.patch, HIVE-6263.2.patch, HIVE-6263.3.patch


 Input paths can be reconstructed from the plan. No need to send them in the 
 job conf as well.





[jira] [Commented] (HIVE-6183) Implement vectorized type cast from/to decimal(p, s)

2014-01-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881673#comment-13881673
 ] 

Hive QA commented on HIVE-6183:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12624946/HIVE-6183.10.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 4969 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_import_exported_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_load_hdfs_file_with_space_in_the_name
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_file_with_header_footer_negative
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1009/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1009/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12624946

 Implement vectorized type cast from/to decimal(p, s)
 

 Key: HIVE-6183
 URL: https://issues.apache.org/jira/browse/HIVE-6183
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: HIVE-6183.07.patch, HIVE-6183.08.patch, 
 HIVE-6183.09.patch, HIVE-6183.09.patch, HIVE-6183.10.patch


 Add support for all the type supported type casts to/from decimal(p,s) in 
 vectorized mode.


