[jira] [Commented] (HIVE-686) add UDF substring_index

2014-02-25 Thread CHEN GEN (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911364#comment-13911364
 ] 

CHEN GEN commented on HIVE-686:
---

BUG: this function does not support 
substring_index(www.test.com,test,-1)=com
FIX suggestion (last line):
r.set(input.substring(k + delim.length()));
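For context, a minimal sketch of the negative-count behavior being discussed, modeled on MySQL's substring_index; the class and method below are illustrative, not the attached patch's actual code:
{code}
import org.apache.hadoop.io.Text;

public final class SubstringIndexSketch {

  // Returns the substring before (count > 0) or after (count < 0) the
  // count-th occurrence of delim, following MySQL's substring_index.
  public static String substringIndex(String input, String delim, int count) {
    if (count == 0 || delim.isEmpty()) {
      return "";
    }
    if (count > 0) {
      int idx = -1;
      while (count-- > 0) {
        idx = input.indexOf(delim, idx + 1);
        if (idx < 0) {
          return input;            // fewer than count occurrences
        }
      }
      return input.substring(0, idx);
    }
    int idx = input.length();
    while (count++ < 0) {
      idx = input.lastIndexOf(delim, idx - 1);
      if (idx < 0) {
        return input;              // fewer than |count| occurrences
      }
    }
    // The line the comment's fix targets: skip past the delimiter itself.
    return input.substring(idx + delim.length());
  }

  public static void main(String[] args) {
    Text r = new Text();
    r.set(substringIndex("www.test.com", ".", -1));
    System.out.println(r);         // prints "com"
  }
}
{code}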

 add UDF substring_index
 ---

 Key: HIVE-686
 URL: https://issues.apache.org/jira/browse/HIVE-686
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Namit Jain
Assignee: Larry Ogrodnek
 Attachments: HIVE-686.patch, HIVE-686.patch


 add UDF substring_index
 look at
 http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html
 for details



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6500) Stats collection via filesystem

2014-02-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6500:
---

Attachment: HIVE-6500.patch

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6500.patch


 Recently, support for stats gathering via counters was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has 
 the following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although these limits are configurable, setting them to higher values implies 
 increased memory load on the AM and the job history server.
 Whether these limits make sense is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable 
 that Hive not rely on the framework's counter feature, so that we can evolve 
 this feature without depending on framework support. Filesystem-based stats 
 collection is a step in that direction.
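For reference, a minimal sketch of raising those limits on a job configuration; the mapreduce.job.counters.* keys below are my reading of the MRJobConfig constants linked above, and raising them carries exactly the AM/history-server memory cost just described:
{code}
import org.apache.hadoop.conf.Configuration;

public final class CounterLimitsSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Every counter lives in memory in the AM and the job history server,
    // which is why the framework keeps these defaults deliberately small.
    conf.setInt("mapreduce.job.counters.max", 500);
    conf.setInt("mapreduce.job.counters.groups.max", 100);
    conf.setInt("mapreduce.job.counters.counter.name.max", 128);
    conf.setInt("mapreduce.job.counters.group.name.max", 256);
  }
}
{code}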



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6500) Stats collection via filesystem

2014-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911365#comment-13911365
 ] 

Ashutosh Chauhan commented on HIVE-6500:


In FS-based stats collection, the idea is that each task writes the stats it 
has collected to a file on the FS; these files are then aggregated after the 
job has finished.
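As a rough illustration of that flow, here is a minimal sketch; the class, method, and file-layout choices are mine, not the patch's actual FSStatsPublisher/FSStatsAggregator code. Each task writes one key=value-per-line file into a per-query directory, and the aggregator sums them:
{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class FsStatsSketch {

  /** Each task publishes its stats as key=value lines in its own file. */
  public static void publish(Configuration conf, Path statsDir,
                             String taskId, Map<String, Long> stats) throws Exception {
    FileSystem fs = statsDir.getFileSystem(conf);
    try (PrintWriter out = new PrintWriter(fs.create(new Path(statsDir, taskId)))) {
      for (Map.Entry<String, Long> e : stats.entrySet()) {
        out.println(e.getKey() + "=" + e.getValue());
      }
    }
  }

  /** After the job finishes, sum the per-task files into one view. */
  public static Map<String, Long> aggregate(Configuration conf, Path statsDir)
      throws Exception {
    FileSystem fs = statsDir.getFileSystem(conf);
    Map<String, Long> totals = new HashMap<>();
    for (FileStatus f : fs.listStatus(statsDir)) {
      try (BufferedReader in = new BufferedReader(
          new InputStreamReader(fs.open(f.getPath()), StandardCharsets.UTF_8))) {
        String line;
        while ((line = in.readLine()) != null) {
          int eq = line.indexOf('=');
          totals.merge(line.substring(0, eq),
              Long.parseLong(line.substring(eq + 1)), Long::sum);
        }
      }
    }
    return totals;
  }
}
{code}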

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6500.patch


 Recently, support for stats gathering via counters was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has 
 the following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although these limits are configurable, setting them to higher values implies 
 increased memory load on the AM and the job history server.
 Whether these limits make sense is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable 
 that Hive not rely on the framework's counter feature, so that we can evolve 
 this feature without depending on framework support. Filesystem-based stats 
 collection is a step in that direction.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6500) Stats collection via filesystem

2014-02-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6500:
---

Attachment: HIVE-6500.patch

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6500.patch


 Recently, support for stats gathering via counters was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has 
 the following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although these limits are configurable, setting them to higher values implies 
 increased memory load on the AM and the job history server.
 Whether these limits make sense is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable 
 that Hive not rely on the framework's counter feature, so that we can evolve 
 this feature without depending on framework support. Filesystem-based stats 
 collection is a step in that direction.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6500) Stats collection via filesystem

2014-02-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6500:
---

Attachment: (was: HIVE-6500.patch)

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6500.patch


 Recently, support for stats gathering via counters was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has 
 the following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although these limits are configurable, setting them to higher values implies 
 increased memory load on the AM and the job history server.
 Whether these limits make sense is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable 
 that Hive not rely on the framework's counter feature, so that we can evolve 
 this feature without depending on framework support. Filesystem-based stats 
 collection is a step in that direction.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6500) Stats collection via filesystem

2014-02-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6500:
---

Status: Patch Available  (was: Open)

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6500.patch


 Recently, support for stats gathering via counters was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has 
 the following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although these limits are configurable, setting them to higher values implies 
 increased memory load on the AM and the job history server.
 Whether these limits make sense is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable 
 that Hive not rely on the framework's counter feature, so that we can evolve 
 this feature without depending on framework support. Filesystem-based stats 
 collection is a step in that direction.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Review Request 18459: FS based stats.

2014-02-25 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18459/
---

Review request for hive and Navis Ryu.


Bugs: HIVE-6500
https://issues.apache.org/jira/browse/HIVE-6500


Repository: hive


Description
---

FS based stats collection.


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
1571554 
  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1571554 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
1571554 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1571554 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 
1571554 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1571554 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1571554 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsAggregator.java 
1571554 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsAggregatorTez.java
 1571554 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsPublisher.java 
1571554 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsCollectionTaskIndependent.java
 PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsFactory.java 1571554 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java 
PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsPublisher.java 
PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/statsfs.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/statsfs.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/18459/diff/


Testing
---

Added new tests.


Thanks,

Ashutosh Chauhan



[jira] [Commented] (HIVE-6380) Specify jars/files when creating permanent UDFs

2014-02-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911387#comment-13911387
 ] 

Lefty Leverenz commented on HIVE-6380:
--

Well done, Jason.  I tinkered with your wiki fixes by adding a few links.

 Specify jars/files when creating permanent UDFs
 ---

 Key: HIVE-6380
 URL: https://issues.apache.org/jira/browse/HIVE-6380
 Project: Hive
  Issue Type: Sub-task
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 0.13.0

 Attachments: HIVE-6380.1.patch, HIVE-6380.2.patch, HIVE-6380.3.patch, 
 HIVE-6380.4.patch


 Need a way for a permanent UDF to reference jars/files.
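For illustration, the shape of the feature from a JDBC client; the endpoint, jar path, and UDF class below are placeholders, and the CREATE FUNCTION ... USING JAR syntax is the one this line of work adds (see the wiki pages mentioned above):
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public final class CreatePermanentUdf {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      // The USING JAR clause lets the permanent UDF carry its own jar,
      // so sessions no longer need a manual ADD JAR first.
      stmt.execute("CREATE FUNCTION mydb.my_lower AS 'com.example.udf.MyLower' "
          + "USING JAR 'hdfs:///user/hive/udfs/my-udfs.jar'");
    }
  }
}
{code}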



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-02-25 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: (was: HIVE-5687.v2.patch)

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm
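To make the requirements above concrete, a hypothetical sketch of a client API with batched atomic commits; every name here is invented for illustration and is not the attached patch's actual interface:
{code}
/** Invented API shapes, purely to illustrate the requirements above. */
interface StreamingConnection extends AutoCloseable {
  TransactionBatch beginBatch(int numTransactions) throws Exception;
}

interface TransactionBatch extends AutoCloseable {
  void write(byte[] record) throws Exception;  // buffered, not yet visible
  void commit() throws Exception;              // batch becomes visible atomically
  void abort() throws Exception;
}

class FlumeSinkSketch {
  /** e.g. a Flume or Storm sink draining events in committed batches. */
  static void drain(StreamingConnection conn, Iterable<byte[]> events) throws Exception {
    try (TransactionBatch batch = conn.beginBatch(10)) {
      for (byte[] event : events) {
        batch.write(event);
      }
      batch.commit();  // records become queryable here, as one atomic unit
    }
  }
}
{code}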



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5687) Streaming support in Hive

2014-02-25 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-5687:
--

Attachment: HIVE-5687.v2.patch

updating patch v2 with minor tweaks

 Streaming support in Hive
 -

 Key: HIVE-5687
 URL: https://issues.apache.org/jira/browse/HIVE-5687
 Project: Hive
  Issue Type: Bug
Reporter: Roshan Naik
Assignee: Roshan Naik
 Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 
 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, 
 HIVE-5687.v2.patch


 Implement support for Streaming data into HIVE.
 - Provide a client streaming API 
 - Transaction support: Clients should be able to periodically commit a batch 
 of records atomically
 - Immediate visibility: Records should be immediately visible to queries on 
 commit
 - Should not overload HDFS with too many small files
 Use Cases:
  - Streaming logs into HIVE via Flume
  - Streaming results of computations from Storm



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5380) Non-default OI constructors should be supported for backwards compatibility

2014-02-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911395#comment-13911395
 ] 

Lefty Leverenz commented on HIVE-5380:
--

[~lars_francke] identified the source of the exclamation point prefixes in the 
Hive SerDe & Object Inspector sections of the Developer Guide: old MoinMoin 
syntax.  (See his comment on the SerDes doc:  
https://cwiki.apache.org/confluence/display/Hive/SerDe?focusedCommentId=39620650#comment-39620650.)
  So I'm taking them out.

 Non-default OI constructors should be supported for backwards compatibility
 ---

 Key: HIVE-5380
 URL: https://issues.apache.org/jira/browse/HIVE-5380
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-5380.patch, HIVE-5380.patch, HIVE-5380.patch


 In HIVE-5263 we started serializing OIs when cloning the plan. This was a 
 great boost in speed for many queries. In the future we'd like to stop 
 copying the OIs, perhaps in HIVE-4396.
 Until then, custom SerDes will not work on trunk. This is a fix to allow 
 custom SerDes, such as the Hive JSON SerDe, to work until we address the fact 
 that we don't want to have to copy the OIs. Since this is done by modifying 
 the byte code, we should recommend that a no-arg constructor be added.
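A minimal sketch of that recommendation, assuming a custom ObjectInspector that previously had only a parameterized constructor (the class here is invented; real inspectors implement Hive's ObjectInspector interfaces):
{code}
/** Invented custom inspector, standing in for e.g. a JSON SerDe's struct OI. */
class MyJsonStructInspector {
  private String[] fieldNames;

  /** Constructor the SerDe calls during initialization. */
  MyJsonStructInspector(String[] fieldNames) {
    this.fieldNames = fieldNames;
  }

  /**
   * No-arg constructor this issue recommends: since HIVE-5263, cloning the
   * plan serializes and deserializes OIs, and the deserializer must be able
   * to instantiate the class before repopulating its fields.
   */
  MyJsonStructInspector() {
  }
}
{code}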



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-3938) Hive MetaStore should send a single AddPartitionEvent for atomically added partition-set.

2014-02-25 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-3938:
---

Attachment: HIVE-3938.trunk.2.patch

I've rebased this patch for the latest trunk (0.13-ish).

I've had to remove the support for multi-table add-partitions, because the 
metastore now seems to check that all partitions in add_partitions_core() 
actually belong to the same table.

I've modified the TestNotificationListener accordingly.

 Hive MetaStore should send a single AddPartitionEvent for atomically added 
 partition-set.
 -

 Key: HIVE-3938
 URL: https://issues.apache.org/jira/browse/HIVE-3938
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-3938.trunk.2.patch, 
 Hive-3938-Support_for_Multi-table-insert.patch


 HiveMetaStore::add_partitions() currently adds all partitions specified in 
 one call using a single metastore transaction. This behaves correctly; 
 however, one AddPartitionEvent is created per partition specified.
 Ideally, the set of partitions added atomically could be communicated using a 
 single AddPartitionEvent, such that they are consumed together.
 I'll post a patch that does this.
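To make the proposal concrete, a sketch of the event shape; the class and method names are invented for illustration (Hive's real listener events live in org.apache.hadoop.hive.metastore.events):
{code}
import java.util.Collections;
import java.util.List;

/** Illustrative event type carrying the whole atomically-added set. */
class AddPartitionsEventSketch {
  private final String db;
  private final String table;
  private final List<String> partitionNames;

  AddPartitionsEventSketch(String db, String table, List<String> partitionNames) {
    this.db = db;
    this.table = table;
    this.partitionNames = Collections.unmodifiableList(partitionNames);
  }

  String getDb() { return db; }
  String getTable() { return table; }

  /** Consumers see the full set at once, instead of N separate events. */
  List<String> getPartitionNames() { return partitionNames; }
}
{code}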



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-3938) Hive MetaStore should send a single AddPartitionEvent for atomically added partition-set.

2014-02-25 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-3938:
---

Status: Patch Available  (was: Open)

 Hive MetaStore should send a single AddPartitionEvent for atomically added 
 partition-set.
 -

 Key: HIVE-3938
 URL: https://issues.apache.org/jira/browse/HIVE-3938
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.12.0, 0.11.0, 0.10.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-3938.trunk.2.patch, 
 Hive-3938-Support_for_Multi-table-insert.patch


 HiveMetaStore::add_partitions() currently adds all partitions specified in 
 one call using a single metastore transaction. This behaves correctly; 
 however, one AddPartitionEvent is created per partition specified.
 Ideally, the set of partitions added atomically could be communicated using a 
 single AddPartitionEvent, such that they are consumed together.
 I'll post a patch that does this.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns

2014-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911420#comment-13911420
 ] 

Hive QA commented on HIVE-6147:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630595/HIVE-6147.3.patch.txt

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 5186 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
org.apache.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hcatalog.pig.TestHCatStorer.testBagNStruct
org.apache.hive.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt
org.apache.hive.hcatalog.pig.TestHCatLoader.testGetInputBytes
org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadBasic
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex
org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadPrimitiveTypes
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testMapWithComplexData
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag
org.apache.hive.hcatalog.pig.TestHCatStorer.testBagNStruct
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1485/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1485/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630595

 Support avro data stored in HBase columns
 -

 Key: HIVE-6147
 URL: https://issues.apache.org/jira/browse/HIVE-6147
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
 Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt, 
 HIVE-6147.3.patch.txt, HIVE-6147.3.patch.txt


 Presently, the HBase Hive integration supports querying only primitive data 
 types in columns. It would be nice to be able to store and query Avro objects 
 in HBase columns by making them visible as structs to Hive. This will allow 
 Hive to perform ad hoc analysis of HBase data which can be deeply structured.
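To make the idea concrete, a minimal sketch of decoding an Avro payload read from an HBase cell; the schema and cell source are illustrative, and the real patch wires this through the HBase SerDe instead:
{code}
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;

final class AvroCellDecoder {
  private final GenericDatumReader<GenericRecord> reader;

  AvroCellDecoder(Schema schema) {
    this.reader = new GenericDatumReader<>(schema);
  }

  /** cellBytes would come from e.g. result.getValue(family, qualifier). */
  GenericRecord decode(byte[] cellBytes) throws IOException {
    BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(cellBytes, null);
    // Hive would then surface the record's fields as a struct.
    return reader.read(null, decoder);
  }
}
{code}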



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6501) Change hadoop dependency on tez branch

2014-02-25 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6501:
-

Attachment: HIVE-6501.1.patch

 Change hadoop dependency on tez branch
 --

 Key: HIVE-6501
 URL: https://issues.apache.org/jira/browse/HIVE-6501
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: tez-branch

 Attachments: HIVE-6501.1.patch


 Now that 2.3.0 is out, we no longer need to pull the snapshot.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6501) Change hadoop dependency on tez branch

2014-02-25 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-6501:


 Summary: Change hadoop dependency on tez branch
 Key: HIVE-6501
 URL: https://issues.apache.org/jira/browse/HIVE-6501
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: tez-branch
 Attachments: HIVE-6501.1.patch

Now that 2.3.0 is out, we no longer need to pull the snapshot.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6329) Support column level encryption/decryption

2014-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911465#comment-13911465
 ] 

Hive QA commented on HIVE-6329:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630599/HIVE-6329.7.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5181 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1486/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1486/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630599

 Support column level encryption/decryption
 --

 Key: HIVE-6329
 URL: https://issues.apache.org/jira/browse/HIVE-6329
 Project: Hive
  Issue Type: New Feature
  Components: Security, Serializers/Deserializers
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6329.1.patch.txt, HIVE-6329.2.patch.txt, 
 HIVE-6329.3.patch.txt, HIVE-6329.4.patch.txt, HIVE-6329.5.patch.txt, 
 HIVE-6329.6.patch.txt, HIVE-6329.7.patch.txt


 Receiving some requirements on encryption recently, but Hive does not support 
 it. Until the full implementation via HIVE-5207, this might be useful for 
 some cases.
 {noformat}
 hive> create table encode_test(id int, name STRING, phone STRING, address 
 STRING) 
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
  WITH SERDEPROPERTIES ('column.encode.indices'='2,3', 
 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') 
 STORED AS TEXTFILE;
 OK
 Time taken: 0.584 seconds
 hive> insert into table encode_test select 
 100,'navis','010-0000-0000','Seoul, Seocho' from src tablesample (1 rows);
 ..
 OK
 Time taken: 5.121 seconds
 hive> select * from encode_test;
 OK
 100   navis MDEwLTAwMDAtMDAwMA==  U2VvdWwsIFNlb2Nobw==
 Time taken: 0.078 seconds, Fetched: 1 row(s)
 hive>
 {noformat}
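For intuition, a minimal sketch of what a write-only column encoder like the Base64WriteOnly above could do, assuming the SerDe applies it to each column listed in column.encode.indices (this is an illustration, not the patch's actual interface):
{code}
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64WriteOnlySketch {
  /** Applied on write to each configured column; reads return the encoded form. */
  static String encode(String columnValue) {
    return Base64.getEncoder()
        .encodeToString(columnValue.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) {
    // Matches the query output above: "010-0000-0000" -> MDEwLTAwMDAtMDAwMA==
    System.out.println(encode("010-0000-0000"));
    System.out.println(encode("Seoul, Seocho")); // U2VvdWwsIFNlb2Nobw==
  }
}
{code}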



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18185: Support Kerberos HTTP authentication for HiveServer2 running in http mode

2014-02-25 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18185/
---

(Updated Feb. 25, 2014, 12:23 p.m.)


Review request for hive and Thejas Nair.


Changes
---

Review feedback + cleanup + simpler kerberos negotiation + kerberos doAs.


Bugs: HIVE-4764
https://issues.apache.org/jira/browse/HIVE-4764


Repository: hive-git


Description
---

Support Kerberos HTTP authentication for HiveServer2 running in http mode


Diffs (updated)
-

  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 4102d7a 
  jdbc/src/java/org/apache/hive/jdbc/HttpBasicAuthInterceptor.java 66eba1b 
  jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa 
  service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java 
PRE-CREATION 
  
service/src/java/org/apache/hive/service/auth/HttpAuthenticationException.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/auth/HttpCLIServiceProcessor.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/auth/HttpCLIServiceUGIProcessor.java 
PRE-CREATION 
  service/src/java/org/apache/hive/service/cli/CLIService.java 2b1e712 
  service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
bfe0e7b 
  
service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 
6fbc847 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
26bda5a 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java 
a6ff6ce 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java 
e77f043 
  
shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 dc89de1 
  shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
9e9a60d 
  
shims/common/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
 03f4e51 

Diff: https://reviews.apache.org/r/18185/diff/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Updated] (HIVE-4764) Support Kerberos HTTP authentication for HiveServer2 running in http mode

2014-02-25 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-4764:
---

Attachment: HIVE-4764.2.patch

 Support Kerberos HTTP authentication for HiveServer2 running in http mode
 -

 Key: HIVE-4764
 URL: https://issues.apache.org/jira/browse/HIVE-4764
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-4764.1.patch, HIVE-4764.2.patch


 Support Kerberos authentication for HiveServer2 running in http mode.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-4764) Support Kerberos HTTP authentication for HiveServer2 running in http mode

2014-02-25 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-4764:
---

Status: Patch Available  (was: Open)

 Support Kerberos HTTP authentication for HiveServer2 running in http mode
 -

 Key: HIVE-4764
 URL: https://issues.apache.org/jira/browse/HIVE-4764
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-4764.1.patch, HIVE-4764.2.patch


 Support Kerberos authentication for HiveServer2 running in http mode.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.

2014-02-25 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911518#comment-13911518
 ] 

Vaibhav Gumashta commented on HIVE-6486:


[~shivshi]: Thanks a lot for the patch! Can you create a review link on the 
Apache review board as well (https://reviews.apache.org/)? It's very easy to 
browse through the code changes there. Let me know if you need any help in 
doing that. Thanks!

 Support secure Subject.doAs() in HiveServer2 JDBC client.
 -

 Key: HIVE-6486
 URL: https://issues.apache.org/jira/browse/HIVE-6486
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Shivaraju Gowda
 Attachments: Hive_011_Support-Subject_doAS.patch, 
 TestHive_SujectDoAs.java


 HIVE-5155 addresses the problem of Kerberos authentication in a multi-user 
 middleware server using a proxy user. In this mode the principal used by the 
 middleware server has privileges to impersonate selected users in 
 Hive/Hadoop. 
 This enhancement is to support Subject.doAs() authentication in the Hive JDBC 
 layer, so that the end user's Kerberos Subject is passed through the 
 middleware server. With this improvement there won't be any additional setup 
 in the server to grant proxy privileges to some users, and there won't be a 
 need to specify a proxy user in the JDBC client. This version should also be 
 more secure, since it won't require principals with the privileges to 
 impersonate other users in the Hive/Hadoop setup.
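A minimal sketch of the client-side usage pattern, assuming the caller already holds a Kerberos-authenticated Subject; the JAAS entry name and JDBC URL are placeholders, and the attached TestHive_SujectDoAs.java remains the authoritative example:
{code}
import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

import javax.security.auth.Subject;
import javax.security.auth.login.LoginContext;

public final class DoAsJdbcSketch {
  public static void main(String[] args) throws Exception {
    // "ClientKrb" is a placeholder JAAS configuration entry; the end user
    // authenticates to Kerberos here, inside the middleware process.
    LoginContext login = new LoginContext("ClientKrb");
    login.login();
    Subject endUser = login.getSubject();

    // Run the JDBC work as that Subject: no proxy-user setup and no
    // impersonation privileges needed on the server side.
    Subject.doAs(endUser, (PrivilegedExceptionAction<Void>) () -> {
      try (Connection conn = DriverManager.getConnection(
               "jdbc:hive2://host:10000/default;principal=hive/host@EXAMPLE.COM");
           Statement stmt = conn.createStatement()) {
        stmt.execute("show tables");
      }
      return null;
    });
  }
}
{code}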
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6484) HiveServer2 doAs should be session aware both for secured and unsecured session implementation.

2014-02-25 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911522#comment-13911522
 ] 

Vaibhav Gumashta commented on HIVE-6484:


[~navis]: Thanks for linking! That jira had slipped off my radar. 

 HiveServer2 doAs should be session aware both for secured and unsecured 
 session implementation.
 ---

 Key: HIVE-6484
 URL: https://issues.apache.org/jira/browse/HIVE-6484
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


 Currently, in the unsecured case, doAs is performed by decorating the 
 TProcessor.process method. This has been causing cleanup issues, as we end up 
 creating a new clientUgi for each request rather than for each session. This 
 also cleans up the code.
 [~thejas] Probably you can add more if you've seen other issues related to 
 this.
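Roughly, the fix amounts to caching the UGI at the session level instead of building one per call; a sketch under the assumption of Thrift 0.9's boolean-returning TProcessor and Hadoop's UserGroupInformation (the class and field names are invented):
{code}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.security.UserGroupInformation;
import org.apache.thrift.TException;
import org.apache.thrift.TProcessor;
import org.apache.thrift.protocol.TProtocol;

/** Runs every Thrift call as the session user, with one UGI per session. */
class SessionDoAsProcessor implements TProcessor {
  private final TProcessor underlying;
  // Built once per session rather than once per request -- the crux here.
  private final UserGroupInformation sessionUgi;

  SessionDoAsProcessor(TProcessor underlying, String sessionUser) {
    this.underlying = underlying;
    this.sessionUgi = UserGroupInformation.createRemoteUser(sessionUser);
  }

  @Override
  public boolean process(TProtocol in, TProtocol out) throws TException {
    try {
      return sessionUgi.doAs(
          (PrivilegedExceptionAction<Boolean>) () -> underlying.process(in, out));
    } catch (Exception e) {
      throw new TException(e);
    }
  }
}
{code}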



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6418) MapJoinRowContainer has large memory overhead in typical cases

2014-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911542#comment-13911542
 ] 

Hive QA commented on HIVE-6418:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630604/HIVE-6418.05.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5170 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hive.service.cli.thrift.TestThriftHttpCLIService.org.apache.hive.service.cli.thrift.TestThriftHttpCLIService
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1488/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1488/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630604

 MapJoinRowContainer has large memory overhead in typical cases
 --

 Key: HIVE-6418
 URL: https://issues.apache.org/jira/browse/HIVE-6418
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6418.01.patch, HIVE-6418.02.patch, 
 HIVE-6418.03.patch, HIVE-6418.04.patch, HIVE-6418.04.patch, 
 HIVE-6418.05.patch, HIVE-6418.WIP.patch, HIVE-6418.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6412) SMB join on Decimal columns causes cast exception in JoinUtil.computeKeys

2014-02-25 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911568#comment-13911568
 ] 

Remus Rusanu commented on HIVE-6412:


I concur; this seems to no longer repro on current trunk.

 SMB join on Decimal columns causes cast exception in JoinUtil.computeKeys
 -

 Key: HIVE-6412
 URL: https://issues.apache.org/jira/browse/HIVE-6412
 Project: Hive
  Issue Type: Bug
Reporter: Remus Rusanu
Assignee: Xuefu Zhang
Priority: Critical

 {code}
 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.io.HiveDecimalWritable cannot be cast to 
 org.apache.hadoop.hive.common.type.HiveDecimal
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveDecimalObjectInspector.getPrimitiveWritableObject(JavaHiveDecimalObjectInspector.java:49)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveDecimalObjectInspector.getPrimitiveWritableObject(JavaHiveDecimalObjectInspector.java:27)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:281)
 at 
 org.apache.hadoop.hive.ql.exec.JoinUtil.computeKeys(JoinUtil.java:143)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.next(SMBMapJoinOperator.java:809)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.nextHive(SMBMapJoinOperator.java:771)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.setupContext(SMBMapJoinOperator.java:710)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.setUpFetchContexts(SMBMapJoinOperator.java:538)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.processOp(SMBMapJoinOperator.java:248)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 {code}
 Repro:
 {code}
 create table vsmb_bucket_1(key decimal(9,0), value decimal(38,10)) 
   CLUSTERED BY (key) 
   SORTED BY (key) INTO 1 BUCKETS 
   STORED AS ORC;
 create table vsmb_bucket_2(key decimal(19,3), value decimal(28,0)) 
   CLUSTERED BY (key) 
   SORTED BY (key) INTO 1 BUCKETS 
   STORED AS ORC;
   
 insert into table vsmb_bucket_1 
   select cast(cint as decimal(9,0)) as key, 
 cast(cfloat as decimal(38,10)) as value 
   from alltypesorc limit 2;
 insert into table vsmb_bucket_2 
   select cast(cint as decimal(19,3)) as key, 
 cast(cfloat as decimal(28,0)) as value 
   from alltypesorc limit 2;
 set hive.optimize.bucketmapjoin = true;
 set hive.optimize.bucketmapjoin.sortedmerge = true;
 set hive.auto.convert.sortmerge.join.noconditionaltask = true;
 set hive.input.format = 
 org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
 explain
 select /*+MAPJOIN(a)*/ * from vsmb_bucket_1 a join vsmb_bucket_2 b on a.key = 
 b.key;
 select /*+MAPJOIN(a)*/ * from vsmb_bucket_1 a join vsmb_bucket_2 b on a.key = 
 b.key;
 {code}
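For readers of the stack trace above: the cast fails because the join-key path hands the inspector a HiveDecimalWritable where it expects the Java-level HiveDecimal. A defensive sketch of the conversion, purely illustrative and not necessarily how trunk ended up behaving, given the issue no longer reproduces:
{code}
import org.apache.hadoop.hive.common.type.HiveDecimal;
import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;

final class DecimalKeySketch {
  /** Accept either representation instead of blindly casting. */
  static HiveDecimalWritable toWritable(Object o) {
    if (o instanceof HiveDecimalWritable) {
      return (HiveDecimalWritable) o;                 // already the writable form
    }
    return new HiveDecimalWritable((HiveDecimal) o);  // java form -> writable
  }
}
{code}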



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6502) Add query for vectorized_decimal_smbjoin

2014-02-25 Thread Remus Rusanu (JIRA)
Remus Rusanu created HIVE-6502:
--

 Summary: Add query for vectorized_decimal_smbjoin
 Key: HIVE-6502
 URL: https://issues.apache.org/jira/browse/HIVE-6502
 Project: Hive
  Issue Type: Test
Reporter: Remus Rusanu
Priority: Minor


The patch for HIVE-6345 did not contain a query for SMB join because decimal 
SMB join failed (HIVE-6412). I've tested vectorized decimal SMB join and it 
works fine now. This issue is the check-in vehicle for the regression-test .q 
and .q.out files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HIVE-6502) Add query for vectorized_decimal_smbjoin

2014-02-25 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu reassigned HIVE-6502:
--

Assignee: Remus Rusanu

 Add query for vectorized_decimal_smbjoin
 

 Key: HIVE-6502
 URL: https://issues.apache.org/jira/browse/HIVE-6502
 Project: Hive
  Issue Type: Test
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor

 The patch for HIVE-6345 did not contain a query for SMB join because decimal 
 SMB join failed (HIVE-6412). I've tested vectorized decimal SMB join and it 
 works fine now. This issue is the check-in vehicle for the regression-test .q 
 and .q.out files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors

2014-02-25 Thread Justin Coffey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911585#comment-13911585
 ] 

Justin Coffey commented on HIVE-6414:
-

Hi Szehon, I worked off of trunk on this.  The patch applies cleanly to the 
latest commit and unit tests pass, but our qtest fails after the commit for 
#HIVE-5958.  The qtests for parquet_create.q work just fine, though.

We're digging into it.

 ParquetInputFormat provides data values that do not match the object 
 inspectors
 ---

 Key: HIVE-6414
 URL: https://issues.apache.org/jira/browse/HIVE-6414
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Justin Coffey
  Labels: Parquet
 Fix For: 0.13.0

 Attachments: HIVE-6414.patch


 While working on HIVE-5998 I noticed that the ParquetRecordReader returns 
 IntWritable for all 'int like' types, in disagreement with the row object 
 inspectors. I thought that was fine and worked my way around it, but I see 
 now that the issue triggers failures in other places, e.g. in aggregates:
 {noformat}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row 
 {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast 
 to java.lang.Short
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 ... 9 more
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
 cannot be cast to java.lang.Short
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
 ... 15 more
 {noformat}
 My test is (I'm writing a test .q from HIVE-5998, but the repro does not 
 involve vectorization):
 {noformat}
 create table if not exists alltypes_parquet (
   cint int,
   ctinyint tinyint,
   csmallint smallint,
   cfloat float,
   cdouble double,
   cstring1 string) stored as parquet;
 insert overwrite table alltypes_parquet
   select cint,
 ctinyint,
 csmallint,
 cfloat,
 cdouble,
 cstring1
   from alltypesorc;
 explain select * from alltypes_parquet limit 10; select * from 
 alltypes_parquet limit 10;
 explain select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 {noformat}
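One illustrative way to see the mismatch: the reader hands back IntWritable while JavaShortObjectInspector expects a java.lang.Short, so the group-by comparison casts and dies. A hedged sketch of the kind of normalization needed; the names are invented, and the actual patch fixes this inside the Parquet converters:
{code}
import org.apache.hadoop.io.IntWritable;

final class ParquetIntNormalizer {
  /**
   * Parquet stores tinyint/smallint/int all as INT32, so the reader produced
   * IntWritable everywhere. Narrow to the type the row object inspector
   * actually declares.
   */
  static Object narrow(IntWritable v, String hiveType) {
    switch (hiveType) {
      case "tinyint":  return (byte) v.get();
      case "smallint": return (short) v.get();  // what JavaShortObjectInspector wants
      default:         return v.get();          // plain int
    }
  }
}
{code}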



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors

2014-02-25 Thread Justin Coffey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911587#comment-13911587
 ] 

Justin Coffey commented on HIVE-6414:
-

Oh, and we don't appear to need the order by for deterministic tests, but I 
have added it and will submit an updated patch with it (once we have gotten to 
the bottom of these failures).

btw are your qtests passing in #HIVE-6477?

 ParquetInputFormat provides data values that do not match the object 
 inspectors
 ---

 Key: HIVE-6414
 URL: https://issues.apache.org/jira/browse/HIVE-6414
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Justin Coffey
  Labels: Parquet
 Fix For: 0.13.0

 Attachments: HIVE-6414.patch


 While working on HIVE-5998 I noticed that the ParquetRecordReader returns 
 IntWritable for all 'int like' types, in disagreement with the row object 
 inspectors. I thought that was fine and worked my way around it, but I see 
 now that the issue triggers failures in other places, e.g. in aggregates:
 {noformat}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row 
 {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast 
 to java.lang.Short
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 ... 9 more
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
 cannot be cast to java.lang.Short
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
 ... 15 more
 {noformat}
 My test is (I'm writing a test .q from HIVE-5998, but the repro does not 
 involve vectorization):
 {noformat}
 create table if not exists alltypes_parquet (
   cint int,
   ctinyint tinyint,
   csmallint smallint,
   cfloat float,
   cdouble double,
   cstring1 string) stored as parquet;
 insert overwrite table alltypes_parquet
   select cint,
 ctinyint,
 csmallint,
 cfloat,
 cdouble,
 cstring1
   from alltypesorc;
 explain select * from alltypes_parquet limit 10; select * from 
 alltypes_parquet limit 10;
 explain select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors

2014-02-25 Thread Justin Coffey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Coffey updated HIVE-6414:


Attachment: HIVE-6414.2.patch

Updated patch with working unit tests and qtests, applicable to trunk commit 
6010e22bd24d5004990c63f0aeb232d75693dd94 (#HIVE-5954).

 ParquetInputFormat provides data values that do not match the object 
 inspectors
 ---

 Key: HIVE-6414
 URL: https://issues.apache.org/jira/browse/HIVE-6414
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Justin Coffey
  Labels: Parquet
 Fix For: 0.13.0

 Attachments: HIVE-6414.2.patch, HIVE-6414.patch


 While working on HIVE-5998 I noticed that the ParquetRecordReader returns 
 IntWritable for all 'int like' types, in disagreement with the row object 
 inspectors. I thought that was fine and worked my way around it, but I see 
 now that the issue triggers failures in other places, e.g. in aggregates:
 {noformat}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row 
 {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast 
 to java.lang.Short
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 ... 9 more
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
 cannot be cast to java.lang.Short
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
 ... 15 more
 {noformat}
 My test is (I'm writing a test .q from HIVE-5998, but the repro does not 
 involve vectorization):
 {noformat}
 create table if not exists alltypes_parquet (
   cint int,
   ctinyint tinyint,
   csmallint smallint,
   cfloat float,
   cdouble double,
   cstring1 string) stored as parquet;
 insert overwrite table alltypes_parquet
   select cint,
 ctinyint,
 csmallint,
 cfloat,
 cdouble,
 cstring1
   from alltypesorc;
 explain select * from alltypes_parquet limit 10; select * from 
 alltypes_parquet limit 10;
 explain select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18464: Support secure Subject.doAs() in HiveServer2 JDBC client

2014-02-25 Thread Kevin Minder

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18464/
---

(Updated Feb. 25, 2014, 2:50 p.m.)


Review request for hive, Kevin Minder and Vaibhav Gumashta.


Changes
---

Added hive group


Bugs: HIVE-6486
https://issues.apache.org/jira/browse/HIVE-6486


Repository: hive-git


Description
---

Support secure Subject.doAs() in HiveServer2 JDBC client


Diffs
-

  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 17b4d39 
  service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 379dafb 
  service/src/java/org/apache/hive/service/auth/TSubjectAssumingTransport.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/18464/diff/


Testing
---

Manual testing


Thanks,

Kevin Minder



[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.

2014-02-25 Thread Kevin Minder (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911632#comment-13911632
 ] 

Kevin Minder commented on HIVE-6486:


Added review https://reviews.apache.org/r/18464/

 Support secure Subject.doAs() in HiveServer2 JDBC client.
 -

 Key: HIVE-6486
 URL: https://issues.apache.org/jira/browse/HIVE-6486
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Shivaraju Gowda
 Attachments: Hive_011_Support-Subject_doAS.patch, 
 TestHive_SujectDoAs.java


 HIVE-5155 addresses the problem of kerberos authentication in multi-user 
 middleware server using proxy user.  In this mode the principal used by the 
 middle ware server has privileges to impersonate selected users in 
 Hive/Hadoop. 
 This enhancement is to support Subject.doAs() authentication in  Hive JDBC 
 layer so that the end users Kerberos Subject is passed through in the middle 
 ware server. With this improvement there won't be any additional setup in the 
 server to grant proxy privileges to some users and there won't be need to 
 specify a proxy user in the JDBC client. This version should also be more 
 secure since it won't require principals with the privileges to impersonate 
 other users in Hive/Hadoop setup.
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HIVE-6412) SMB join on Decimal columns causes cast exception in JoinUtil.computeKeys

2014-02-25 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved HIVE-6412.
---

Resolution: Cannot Reproduce

Close the issue as non-reproducible.

 SMB join on Decimal columns causes cast exception in JoinUtil.computeKeys
 -

 Key: HIVE-6412
 URL: https://issues.apache.org/jira/browse/HIVE-6412
 Project: Hive
  Issue Type: Bug
Reporter: Remus Rusanu
Assignee: Xuefu Zhang
Priority: Critical

 {code}
 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.io.HiveDecimalWritable cannot be cast to 
 org.apache.hadoop.hive.common.type.HiveDecimal
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveDecimalObjectInspector.getPrimitiveWritableObject(JavaHiveDecimalObjectInspector.java:49)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveDecimalObjectInspector.getPrimitiveWritableObject(JavaHiveDecimalObjectInspector.java:27)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:281)
 at 
 org.apache.hadoop.hive.ql.exec.JoinUtil.computeKeys(JoinUtil.java:143)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.next(SMBMapJoinOperator.java:809)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.nextHive(SMBMapJoinOperator.java:771)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator$MergeQueue.setupContext(SMBMapJoinOperator.java:710)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.setUpFetchContexts(SMBMapJoinOperator.java:538)
 at 
 org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.processOp(SMBMapJoinOperator.java:248)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 {code}
 Repro:
 {code}
 create table vsmb_bucket_1(key decimal(9,0), value decimal(38,10)) 
   CLUSTERED BY (key) 
   SORTED BY (key) INTO 1 BUCKETS 
   STORED AS ORC;
 create table vsmb_bucket_2(key decimal(19,3), value decimal(28,0)) 
   CLUSTERED BY (key) 
   SORTED BY (key) INTO 1 BUCKETS 
   STORED AS ORC;
   
 insert into table vsmb_bucket_1 
   select cast(cint as decimal(9,0)) as key, 
 cast(cfloat as decimal(38,10)) as value 
   from alltypesorc limit 2;
 insert into table vsmb_bucket_2 
   select cast(cint as decimal(19,3)) as key, 
 cast(cfloat as decimal(28,0)) as value 
   from alltypesorc limit 2;
 set hive.optimize.bucketmapjoin = true;
 set hive.optimize.bucketmapjoin.sortedmerge = true;
 set hive.auto.convert.sortmerge.join.noconditionaltask = true;
 set hive.input.format = 
 org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
 explain
 select /*+MAPJOIN(a)*/ * from vsmb_bucket_1 a join vsmb_bucket_2 b on a.key = 
 b.key;
 select /*+MAPJOIN(a)*/ * from vsmb_bucket_1 a join vsmb_bucket_2 b on a.key = 
 b.key;
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6429) MapJoinKey has large memory overhead in typical cases

2014-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911669#comment-13911669
 ] 

Hive QA commented on HIVE-6429:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630885/HIVE-6429.06.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5178 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_left_outer_join
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1489/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1489/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630885

 MapJoinKey has large memory overhead in typical cases
 -

 Key: HIVE-6429
 URL: https://issues.apache.org/jira/browse/HIVE-6429
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, 
 HIVE-6429.03.patch, HIVE-6429.04.patch, HIVE-6429.05.patch, 
 HIVE-6429.06.patch, HIVE-6429.WIP.patch, HIVE-6429.patch


 The only thing that MJK really needs it hashCode and equals (well, and 
 construction), so there's no need to have array of writables in there. 
 Assuming all the keys for a table have the same structure, for the common 
 case where keys are primitive types, we can store something like a byte array 
 combination of keys to reduce the memory usage. Will probably speed up 
 compares too.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HIVE-4198) Move HCatalog code into Hive

2014-02-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4198.


   Resolution: Fixed
Fix Version/s: 0.11.0

 Move HCatalog code into Hive
 

 Key: HIVE-4198
 URL: https://issues.apache.org/jira/browse/HIVE-4198
 Project: Hive
  Issue Type: Task
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.11.0


 The HCatalog code needs to be moved into Hive.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.

2014-02-25 Thread Shivaraju Gowda (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911810#comment-13911810
 ] 

Shivaraju Gowda commented on HIVE-6486:
---

Thanks Kevin for the review.
Vaibhav, Let me know if you need any information or clarification on it.

 Support secure Subject.doAs() in HiveServer2 JDBC client.
 -

 Key: HIVE-6486
 URL: https://issues.apache.org/jira/browse/HIVE-6486
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Shivaraju Gowda
 Attachments: Hive_011_Support-Subject_doAS.patch, 
 TestHive_SujectDoAs.java


 HIVE-5155 addresses the problem of kerberos authentication in multi-user 
 middleware server using proxy user.  In this mode the principal used by the 
 middle ware server has privileges to impersonate selected users in 
 Hive/Hadoop. 
 This enhancement is to support Subject.doAs() authentication in  Hive JDBC 
 layer so that the end users Kerberos Subject is passed through in the middle 
 ware server. With this improvement there won't be any additional setup in the 
 server to grant proxy privileges to some users and there won't be need to 
 specify a proxy user in the JDBC client. This version should also be more 
 secure since it won't require principals with the privileges to impersonate 
 other users in Hive/Hadoop setup.
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6495) TableDesc.getDeserializer() should use correct classloader when calling Class.forName()

2014-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911816#comment-13911816
 ] 

Ashutosh Chauhan commented on HIVE-6495:


(+)1

 TableDesc.getDeserializer() should use correct classloader when calling 
 Class.forName()
 ---

 Key: HIVE-6495
 URL: https://issues.apache.org/jira/browse/HIVE-6495
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6495.1.patch


 User is getting an error with the following stack trace below.  It looks like 
 when Class.forName() is called, it may not be using the correct class loader 
 (JavaUtils.getClassLoader() is used in other contexts when the loaded jar may 
 be required).
 {noformat}
 FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
 Failed with exception java.lang.ClassNotFoundException: 
 my.serde.ColonSerdejava.lang.RuntimeException: 
 java.lang.ClassNotFoundException: my.serde.ColonSerde
 at 
 org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:68)
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromTable(FetchOperator.java:231)
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:608)
 at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:80)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:497)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:352)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:995)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1038)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:921)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: java.lang.ClassNotFoundException: my.serde.ColonSerde
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:190)
 at 
 org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:66)
 ... 20 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (HIVE-6495) TableDesc.getDeserializer() should use correct classloader when calling Class.forName()

2014-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911816#comment-13911816
 ] 

Ashutosh Chauhan edited comment on HIVE-6495 at 2/25/14 6:13 PM:
-

+1


was (Author: ashutoshc):
(+)1

 TableDesc.getDeserializer() should use correct classloader when calling 
 Class.forName()
 ---

 Key: HIVE-6495
 URL: https://issues.apache.org/jira/browse/HIVE-6495
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6495.1.patch


 User is getting an error with the following stack trace below.  It looks like 
 when Class.forName() is called, it may not be using the correct class loader 
 (JavaUtils.getClassLoader() is used in other contexts when the loaded jar may 
 be required).
 {noformat}
 FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
 Failed with exception java.lang.ClassNotFoundException: 
 my.serde.ColonSerdejava.lang.RuntimeException: 
 java.lang.ClassNotFoundException: my.serde.ColonSerde
 at 
 org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:68)
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromTable(FetchOperator.java:231)
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:608)
 at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:80)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:497)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:352)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:995)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1038)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:921)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: java.lang.ClassNotFoundException: my.serde.ColonSerde
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:190)
 at 
 org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:66)
 ... 20 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-5843:
-

Attachment: HIVE-5843-src-only.6.patch

Latest version of the code minus the generated files.

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.patch, HiveTransactionManagerDetailedDesign 
 (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-5843:
-

Attachment: HIVE-5843.6.patch

Latest version of the code.  This has been merged with trunk and should be 
ready for review and hopefully commit.

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.patch, HiveTransactionManagerDetailedDesign 
 (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-5843:
-

Status: Patch Available  (was: Open)

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.patch, HiveTransactionManagerDetailedDesign 
 (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6375) Fix CTAS for parquet

2014-02-25 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911922#comment-13911922
 ] 

Szehon Ho commented on HIVE-6375:
-

[~xuefuz] Can you please take a look?  Thanks.

 Fix CTAS for parquet
 

 Key: HIVE-6375
 URL: https://issues.apache.org/jira/browse/HIVE-6375
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
Priority: Critical
  Labels: Parquet
 Attachments: HIVE-6375.2.patch, HIVE-6375.3.patch, HIVE-6375.4.patch, 
 HIVE-6375.patch


 More details here:
 https://github.com/Parquet/parquet-mr/issues/272



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Review Request 18478: HIVE-6459: Change the precison/scale for intermediate sum result in the avg() udf

2014-02-25 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18478/
---

Review request for hive.


Bugs: HIVE-6459
https://issues.apache.org/jira/browse/HIVE-6459


Repository: hive-git


Description
---

Patch addressed the issue by keeping the type of the sum field consistent with 
that of sum UDF. The type of the final avg result is unchanged.


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFAvgDecimal.java
 6f593f9 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java 
abd54be 
  ql/src/test/queries/clientpositive/vector_decimal_aggregate.q eb9146e 
  ql/src/test/results/clientpositive/create_genericudaf.q.out 96fe2fa 
  ql/src/test/results/clientpositive/decimal_precision.q.out a80695c 
  ql/src/test/results/clientpositive/decimal_udf.q.out 74ae554 
  ql/src/test/results/clientpositive/groupby10.q.out 341427f 
  ql/src/test/results/clientpositive/groupby3.q.out a74f2b5 
  ql/src/test/results/clientpositive/groupby3_map.q.out 9424071 
  ql/src/test/results/clientpositive/groupby3_map_multi_distinct.q.out 9bcd7c9 
  ql/src/test/results/clientpositive/groupby3_map_skew.q.out f438f89 
  ql/src/test/results/clientpositive/groupby_grouping_sets3.q.out 310a202 
  ql/src/test/results/clientpositive/limit_pushdown.q.out a8add4c 
  ql/src/test/results/clientpositive/subquery_in.q.out 48be22b 
  ql/src/test/results/clientpositive/subquery_in_having.q.out ef3dc18 
  ql/src/test/results/clientpositive/subquery_notin.q.out b2d687b 
  ql/src/test/results/clientpositive/subquery_notin_having.q.out 5f4d96e 
  ql/src/test/results/clientpositive/udaf_number_format.q.out 339ef94 
  ql/src/test/results/clientpositive/udf3.q.out 546f949 
  ql/src/test/results/clientpositive/udf8.q.out 79c3bff 
  ql/src/test/results/clientpositive/vector_decimal_aggregate.q.out 8b73971 
  ql/src/test/results/clientpositive/vectorization_limit.q.out 51a4e81 
  ql/src/test/results/clientpositive/vectorization_pushdown.q.out df474d6 
  ql/src/test/results/clientpositive/vectorization_short_regress.q.out 07accb6 
  ql/src/test/results/clientpositive/vectorized_mapjoin.q.out 9590642 
  ql/src/test/results/clientpositive/vectorized_shufflejoin.q.out 928bc82 
  ql/src/test/results/compiler/plan/groupby3.q.xml cc88d5c 

Diff: https://reviews.apache.org/r/18478/diff/


Testing
---

Existing tests cover this. Some test output is regenerated due to the output 
diff.


Thanks,

Xuefu Zhang



[jira] [Updated] (HIVE-5176) Wincompat : Changes for allowing various path compatibilities with Windows

2014-02-25 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-5176:
---

Assignee: Jason Dere  (was: Sushanth Sowmyan)

 Wincompat : Changes for allowing various path compatibilities with Windows
 --

 Key: HIVE-5176
 URL: https://issues.apache.org/jira/browse/HIVE-5176
 Project: Hive
  Issue Type: Sub-task
  Components: Windows
Reporter: Sushanth Sowmyan
Assignee: Jason Dere
 Attachments: HIVE-5176.2.patch, HIVE-5176.3.patch, HIVE-5176.patch


 We need to make certain changes across the board to allow us to read/parse 
 windows paths. Some are escaping changes, some are being strict about how we 
 read paths (through URL.encode/decode, etc)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors

2014-02-25 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911974#comment-13911974
 ] 

Szehon Ho commented on HIVE-6414:
-

Hmm, I applied your patch on trunk, and new test (parquet_types) still fails 
for me with missing output due to HIVE-5958.  Let's see how pre-commit tests 
go.  Yea my tests pass pre-commit test in HIVE-6477, I had added regeneration 
of output.

Other than that, +1 (non-binding).  Thanks for doing order-by, from my 
experience its useful for group by, as each group goes to one reducer, and no 
guarantee from MR framework that they wont run in parallel.



 ParquetInputFormat provides data values that do not match the object 
 inspectors
 ---

 Key: HIVE-6414
 URL: https://issues.apache.org/jira/browse/HIVE-6414
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Justin Coffey
  Labels: Parquet
 Fix For: 0.13.0

 Attachments: HIVE-6414.2.patch, HIVE-6414.patch


 While working on HIVE-5998 I noticed that the ParquetRecordReader returns 
 IntWritable for all 'int like' types, in disaccord with the row object 
 inspectors. I though fine, and I worked my way around it. But I see now that 
 the issue trigger failuers in other places, eg. in aggregates:
 {noformat}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row 
 {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast 
 to java.lang.Short
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 ... 9 more
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
 cannot be cast to java.lang.Short
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
 ... 15 more
 {noformat}
 My test is (I'm writing a test .q from HIVE-5998, but the repro does not 
 involve vectorization):
 {noformat}
 create table if not exists alltypes_parquet (
   cint int,
   ctinyint tinyint,
   csmallint smallint,
   cfloat float,
   cdouble double,
   cstring1 string) stored as parquet;
 insert overwrite table alltypes_parquet
   select cint,
 ctinyint,
 csmallint,
 cfloat,
 cdouble,
 cstring1
   from alltypesorc;
 explain select * from alltypes_parquet limit 10; select * from 
 alltypes_parquet limit 10;
 explain select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6375) Fix CTAS for parquet

2014-02-25 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911979#comment-13911979
 ] 

Xuefu Zhang commented on HIVE-6375:
---

+1

 Fix CTAS for parquet
 

 Key: HIVE-6375
 URL: https://issues.apache.org/jira/browse/HIVE-6375
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
Priority: Critical
  Labels: Parquet
 Attachments: HIVE-6375.2.patch, HIVE-6375.3.patch, HIVE-6375.4.patch, 
 HIVE-6375.patch


 More details here:
 https://github.com/Parquet/parquet-mr/issues/272



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6356) Dependency injection in hbase storage handler is broken

2014-02-25 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912002#comment-13912002
 ] 

Xuefu Zhang commented on HIVE-6356:
---

[~ashutoshc] Could we put a close on this? I understood that patch v3 is 
committed to trunk. Do we still need addendum patch to be committed, in order 
to close this JIRA?

 Dependency injection in hbase storage handler is broken
 ---

 Key: HIVE-6356
 URL: https://issues.apache.org/jira/browse/HIVE-6356
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Navis
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6356.1.patch.txt, HIVE-6356.2.patch.txt, 
 HIVE-6356.3.patch.txt, HIVE-6356.addendum.00.patch


 Dependent jars for hbase is not added to tmpjars, which is caused by the 
 change of method signature(TableMapReduceUtil.addDependencyJars).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-1459) wildcards in UDF/UDAF should expand to all columns (rather than no columns)

2014-02-25 Thread Arvind Prabhakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvind Prabhakar updated HIVE-1459:
---

Assignee: (was: Arvind Prabhakar)

 wildcards in UDF/UDAF should expand to all columns (rather than no columns)
 ---

 Key: HIVE-1459
 URL: https://issues.apache.org/jira/browse/HIVE-1459
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.6.0
Reporter: John Sichi

 When a function is invoked with a wildcard * for its parameter, it should be 
 passed all of the columns in the expansion, exception in the special case of 
 COUNT, where none of the columns should be passed.
 As part of this issue, we also need to test qualified wildcards (e.g. t.*) 
 and Hive's extension for regular-expression selection of column subsets.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-5843:
-

Status: Open  (was: Patch Available)

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.patch, HiveTransactionManagerDetailedDesign 
 (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-5843:
-

Attachment: HIVE-5843.7.patch

New latest version of the patch.  I had forgotten to add the new thrift 
generated files to the previous version.

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.7.patch, HIVE-5843.patch, 
 HiveTransactionManagerDetailedDesign (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-5843:
-

Status: Patch Available  (was: Open)

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.7.patch, HIVE-5843.patch, 
 HiveTransactionManagerDetailedDesign (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912036#comment-13912036
 ] 

Lefty Leverenz commented on HIVE-5843:
--

HiveConf comment nits:

* hive.compactor.check.interval:  // Time in seconds between checks to see if 
any partitions need compacted.  --  need to be compacted.

* hive.txn.timeout:  // time after which ...  --  init cap Time

Also a question:  If this goes into Hive 0.13.0, will it be useful immediately 
or just a piece of an incomplete feature?  Thirteen new config parameters are 
added, and I'm wondering about documentation (as always).  When HIVE-6037 gets 
committed we won't need to update hive-default.xml.template anymore but the 
parameter comments will have to be moved into the definitions.

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.7.patch, HIVE-5843.patch, 
 HiveTransactionManagerDetailedDesign (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf

2014-02-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912041#comment-13912041
 ] 

Lefty Leverenz commented on HIVE-6037:
--

Parameter alert!  HIVE-5843 (transaction manager) introduces 13 new config 
params.  This has been merged with trunk and should be ready for review and 
hopefully commit.

All have definitions in the comments except the first three, which speak for 
themselves.

* HIVE-5843:  hive.txn.manager, hive.txn.driver, hive.txn.connection.string, 
hive.txn.timeout, hive.txn.max.open.batch, hive.txn.testing, 
hive.compactor.initiator.on, hive.compactor.worker.threads, 
hive.compactor.worker.timeout, hive.compactor.check.interval, 
hive.compactor.delta.num.threshold, hive.compactor.delta.pct.threshold, 
hive.compactor.abortedtxn.threshold 

 Synchronize HiveConf with hive-default.xml.template and support show conf
 -

 Key: HIVE-6037
 URL: https://issues.apache.org/jira/browse/HIVE-6037
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.13.0

 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037.1.patch.txt, 
 HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, HIVE-6037.12.patch.txt, 
 HIVE-6037.14.patch.txt, HIVE-6037.15.patch.txt, HIVE-6037.2.patch.txt, 
 HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, HIVE-6037.6.patch.txt, 
 HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, HIVE-6037.9.patch.txt, 
 HIVE-6037.patch


 see HIVE-5879



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912059#comment-13912059
 ] 

Alan Gates commented on HIVE-5843:
--

I'm definitely hoping this makes it into 0.13.  And no, it isn't only 
incomplete feature.  If it was, I'd wait until after 0.13 branched.  HIVE-5687 
depends on this, and the hope is to get it into 0.13.

As for the comments in HiveConf, I didn't realize I was writing documentation 
there or I would have paid closer attention to my grammar.  However, a nit on 
your nit.  need compacted. -- need to be compacted.  Are you sure?  What 
is the grammar rule there?

Overall on documentation though, there will be a fair amount to write, 
especially once we have HIVE-6319 and HIVE-6060 there.  Should I file a 
separate JIRA outline the documentation needs?

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.7.patch, HIVE-5843.patch, 
 HiveTransactionManagerDetailedDesign (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization

2014-02-25 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-6455:
-

Attachment: HIVE-6455.8.patch

Added fix when dynamic partition context is null for bucketed tables.

 Scalable dynamic partitioning and bucketing optimization
 

 Key: HIVE-6455
 URL: https://issues.apache.org/jira/browse/HIVE-6455
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: optimization
 Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.2.patch, 
 HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, 
 HIVE-6455.6.patch, HIVE-6455.7.patch, HIVE-6455.8.patch


 The current implementation of dynamic partition works by keeping at least one 
 record writer open per dynamic partition directory. In case of bucketing 
 there can be multispray file writers which further adds up to the number of 
 open record writers. The record writers of column oriented file format (like 
 ORC, RCFile etc.) keeps some sort of in-memory buffers (value buffer or 
 compression buffers) open all the time to buffer up the rows and compress 
 them before flushing it to disk. Since these buffers are maintained per 
 column basis the amount of constant memory that will required at runtime 
 increases as the number of partitions and number of columns per partition 
 increases. This often leads to OutOfMemory (OOM) exception in mappers or 
 reducers depending on the number of open record writers. Users often tune the 
 JVM heapsize (runtime memory) to get over such OOM issues. 
 With this optimization, the dynamic partition columns and bucketing columns 
 (in case of bucketed tables) are sorted before being fed to the reducers. 
 Since the partitioning and bucketing columns are sorted, each reducers can 
 keep only one record writer open at any time thereby reducing the memory 
 pressure on the reducers. This optimization is highly scalable as the number 
 of partition and number of columns per partition increases at the cost of 
 sorting the columns.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18459: FS based stats.

2014-02-25 Thread Gunther Hagleitner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18459/#review35452
---



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/18459/#comment66000

Do you need to update hive-site template + test hive-site too?



trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java
https://reviews.apache.org/r/18459/#comment65998

how does this work with task attempts? is there a chance of counting failed 
stuff?



trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsPublisher.java
https://reviews.apache.org/r/18459/#comment65977

this would be easier to debug if the exception gets logged at a higher 
level (error/warn/exception) - multiple instances in both new files.


- Gunther Hagleitner


On Feb. 25, 2014, 8:09 a.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18459/
 ---
 
 (Updated Feb. 25, 2014, 8:09 a.m.)
 
 
 Review request for hive and Navis Ryu.
 
 
 Bugs: HIVE-6500
 https://issues.apache.org/jira/browse/HIVE-6500
 
 
 Repository: hive
 
 
 Description
 ---
 
 FS based stats collection.
 
 
 Diffs
 -
 
   trunk/common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
 1571554 
   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 
 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
 1571554 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsAggregator.java 
 1571554 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsAggregatorTez.java
  1571554 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsPublisher.java 
 1571554 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsCollectionTaskIndependent.java
  PRE-CREATION 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsFactory.java 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java 
 PRE-CREATION 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsPublisher.java 
 PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/statsfs.q PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/statsfs.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18459/diff/
 
 
 Testing
 ---
 
 Added new tests.
 
 
 Thanks,
 
 Ashutosh Chauhan
 




[jira] [Commented] (HIVE-6500) Stats collection via filesystem

2014-02-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912103#comment-13912103
 ] 

Gunther Hagleitner commented on HIVE-6500:
--

Small comments/question on rb.

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6500.patch


 Recently, support for stats gathering via counter was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632] Although, its useful it has 
 following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups are limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters are limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although, these limits are configurable, but setting them to higher value 
 implies increased memory load on AM and job history server.
 Now, whether these limits makes sense or not is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680] it is desirable that 
 Hive doesn't make use of counters features of framework so that it we can 
 evolve this feature without relying on support from framework. Filesystem 
 based counter collection is a step in that direction.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18185: Support Kerberos HTTP authentication for HiveServer2 running in http mode

2014-02-25 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18185/#review35436
---



jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
https://reviews.apache.org/r/18185/#comment65958

lets track this TODO in a jira. It is not very useful comment here (ie not 
something like warning against an unimplemented part or so)




jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
https://reviews.apache.org/r/18185/#comment65957

It will be better to keep the position of createBinaryTransport and 
createHttpTransport same as before. That way the diff will be smaller and 
easier to read. Also, git blame will remain an useful tool for analyzing 
changes (it would be easier to find which line in createBinaryTransport changed 
when with it).




jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
https://reviews.apache.org/r/18185/#comment65959

this variable is not being used anywhere




jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
https://reviews.apache.org/r/18185/#comment65960

I am probably being too opinionated here! Feel free to disagree (if you 
do). I don't think we need this no-argument method, we can just use the method 
with single boolean argument. I think that will be more readable.





jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java
https://reviews.apache.org/r/18185/#comment65976

Can you add a class level comment ?




service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java
https://reviews.apache.org/r/18185/#comment65989

can you add a class comment ?Something like utility functions for http 
mode authentication.
 Maybe call this class HttpAuthUtils, so that its more clear what it 
contains ?






service/src/java/org/apache/hive/service/auth/HttpCLIServiceProcessor.java
https://reviews.apache.org/r/18185/#comment65993

can you add a class comment ?



service/src/java/org/apache/hive/service/auth/HttpCLIServiceUGIProcessor.java
https://reviews.apache.org/r/18185/#comment65994

can you add a class comment ?



service/src/java/org/apache/hive/service/auth/HttpCLIServiceUGIProcessor.java
https://reviews.apache.org/r/18185/#comment66002

I think the better place to clear this is in ThriftHttpServlet, after the 
call to super.doPost(request, response), as it is set in the same place.





shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
https://reviews.apache.org/r/18185/#comment65988

This is duplicating code in createClientTransport . Should we move this 
code to a static util class and re-use in both places ?




shims/common/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
https://reviews.apache.org/r/18185/#comment65987

wouldn't it be sufficient to use HadoopShims.getUGIForConf() instead of new 
method in thrift shims ?



- Thejas Nair


On Feb. 25, 2014, 12:23 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18185/
 ---
 
 (Updated Feb. 25, 2014, 12:23 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-4764
 https://issues.apache.org/jira/browse/HIVE-4764
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Support Kerberos HTTP authentication for HiveServer2 running in http mode
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 4102d7a 
   jdbc/src/java/org/apache/hive/jdbc/HttpBasicAuthInterceptor.java 66eba1b 
   jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java 
 PRE-CREATION 
   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa 
   service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java 
 PRE-CREATION 
   
 service/src/java/org/apache/hive/service/auth/HttpAuthenticationException.java
  PRE-CREATION 
   service/src/java/org/apache/hive/service/auth/HttpCLIServiceProcessor.java 
 PRE-CREATION 
   
 service/src/java/org/apache/hive/service/auth/HttpCLIServiceUGIProcessor.java 
 PRE-CREATION 
   service/src/java/org/apache/hive/service/cli/CLIService.java 2b1e712 
   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
 bfe0e7b 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java
  6fbc847 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
 26bda5a 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java 
 a6ff6ce 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java 
 e77f043 
   
 shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
  dc89de1 
   

[jira] [Updated] (HIVE-6389) LazyBinaryColumnarSerDe-based RCFile tables break when looking up elements in null-maps.

2014-02-25 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-6389:
---

Status: Patch Available  (was: Open)

 LazyBinaryColumnarSerDe-based RCFile tables break when looking up elements in 
 null-maps.
 

 Key: HIVE-6389
 URL: https://issues.apache.org/jira/browse/HIVE-6389
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.13.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: Hive-6389.patch


 RCFile tables that use the LazyBinaryColumnarSerDe don't seem to handle 
 look-ups into map-columns when the value of the column is null.
 When an RCFile table is created with LazyBinaryColumnarSerDe (as is default 
 in 0.12), and queried as follows:
 {code}
 select mymap['1024'] from mytable;
 {code}
 and if the mymap column has nulls, then one is treated to the following 
 guttural utterance:
 {code}
 2014-02-05 21:50:25,050 FATAL mr.ExecMapper (ExecMapper.java:map(194)) - 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {id:null,mymap:null,isnull:null}
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to 
 org.apache.hadoop.io.Text
   at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(WritableStringObjectInspector.java:41)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:226)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:486)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:439)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:423)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:560)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
   ... 10 more
 {code}
 A patch is on the way, but the short of it is that the LazyBinaryMapOI needs 
 to return nulls if either the map or the lookup-key is null.
 This is handled correctly for Text data, and for RCFiles using ColumnarSerDe.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6389) LazyBinaryColumnarSerDe-based RCFile tables break when looking up elements in null-maps.

2014-02-25 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-6389:
---

Status: Open  (was: Patch Available)

 LazyBinaryColumnarSerDe-based RCFile tables break when looking up elements in 
 null-maps.
 

 Key: HIVE-6389
 URL: https://issues.apache.org/jira/browse/HIVE-6389
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.13.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: Hive-6389.patch


 RCFile tables that use the LazyBinaryColumnarSerDe don't seem to handle 
 look-ups into map-columns when the value of the column is null.
 When an RCFile table is created with LazyBinaryColumnarSerDe (as is default 
 in 0.12), and queried as follows:
 {code}
 select mymap['1024'] from mytable;
 {code}
 and if the mymap column has nulls, then one is treated to the following 
 guttural utterance:
 {code}
 2014-02-05 21:50:25,050 FATAL mr.ExecMapper (ExecMapper.java:map(194)) - 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {id:null,mymap:null,isnull:null}
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to 
 org.apache.hadoop.io.Text
   at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(WritableStringObjectInspector.java:41)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:226)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:486)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:439)
   at 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:423)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:560)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
   ... 10 more
 {code}
 A patch is on the way, but the short of it is that the LazyBinaryMapOI needs 
 to return nulls if either the map or the lookup-key is null.
 This is handled correctly for Text data, and for RCFiles using ColumnarSerDe.
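 A hedged sketch of that null-guard (illustrative, not the attached patch; the 
 method signature is the MapObjectInspector contract, and the cast assumes 
 LazyBinary-backed data):
 {code}
 // Return null instead of throwing when either the map or the key is null.
 @Override
 public Object getMapValueElement(Object data, Object key) {
   if (data == null || key == null) {
     return null;  // null map or null lookup-key: nothing to look up
   }
   return ((LazyBinaryMap) data).getMapValueElement(key);
 }
 {code}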



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6499) Using Metastore-side Auth errors on non-resolvable IF/OF/SerDe

2014-02-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912174#comment-13912174
 ] 

Thejas M Nair commented on HIVE-6499:
-

+1

 Using Metastore-side Auth errors on non-resolvable IF/OF/SerDe
 --

 Key: HIVE-6499
 URL: https://issues.apache.org/jira/browse/HIVE-6499
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6499.patch


 In cases where a user needs to use a custom IF/OF/SerDe that is not 
 accessible from the metastore, calls like msc.createTable and msc.dropTable 
 should still work without being able to load the class. This is possible as 
 long as one does not enable MetaStore-side authorization, at which point this 
 becomes impossible, erroring out with a ClassNotFoundException.
 The reason this happens is that, since the AuthorizationProvider interface is 
 defined against a ql.metadata.Table, we wind up needing to instantiate a 
 ql.metadata.Table object, which, in its constructor, tries to instantiate the 
 IF/OF/SerDe elements in an attempt to pre-load those fields. And if we do not 
 have access to those classes in the metastore, this is when that fails. The 
 constructor/initialize methods of Table and Partition do not really need to 
 pre-initialize these fields, since the fields are accessed only through their 
 accessors, and will be instantiated on first use.
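 A minimal sketch of that lazy-accessor pattern, assuming hypothetical field 
 names (the real ql.metadata.Table code differs in detail):
 {code}
 // Left null by the constructor; resolved only on first use, so
 // metastore-side calls that never touch the SerDe don't need its class.
 private Deserializer deserializer;

 public Deserializer getDeserializer() throws MetaException {
   if (deserializer == null) {
     // resolve only now, e.g. via MetaStoreUtils
     deserializer = MetaStoreUtils.getDeserializer(conf, tTable);
   }
   return deserializer;
 }
 {code}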



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6503) document pluggable authentication modules (PAM) in template config, wiki

2014-02-25 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-6503:
---

 Summary: document pluggable authentication modules (PAM) in 
template config, wiki
 Key: HIVE-6503
 URL: https://issues.apache.org/jira/browse/HIVE-6503
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
Priority: Blocker
 Fix For: 0.13.0


HIVE-6466 adds PAM as a supported value for hive.server2.authentication. 
It also adds a config parameter, hive.server2.authentication.pam.services.

The default template file needs to be updated to document these. The wiki docs 
should also document the support for pluggable authentication modules.
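For reference, a hedged example of the corresponding hive-site.xml entries (the 
property names come from HIVE-6466; the service-list value is illustrative 
only):
{code}
<property>
  <name>hive.server2.authentication</name>
  <value>PAM</value>
</property>
<property>
  <!-- comma-separated PAM services to authenticate against; example value -->
  <name>hive.server2.authentication.pam.services</name>
  <value>sshd,sudo</value>
</property>
{code}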





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive

2014-02-25 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6466:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

I have committed this to trunk.
Thanks for the contribution, [~vgumashta]! Thanks for reviewing, [~kamrul].

I realized post-commit that this does not update the default.xml.template file. 
I have filed a blocker JIRA to track that: HIVE-6503. Vaibhav, can you please 
address it when you get a chance? We should fix that in 0.13.


 Add support for pluggable authentication modules (PAM) in Hive
 --

 Key: HIVE-6466
 URL: https://issues.apache.org/jira/browse/HIVE-6466
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6466.1.patch, HIVE-6466.2.patch


 More on PAM in these articles:
 http://www.tuxradar.com/content/how-pam-works
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html
 Usage from JPAM api: http://jpam.sourceforge.net/JPamUserGuide.html#id.s7.1
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5970) ArrayIndexOutOfBoundsException in RunLengthIntegerReaderV2.java

2014-02-25 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912231#comment-13912231
 ] 

Prasanth J commented on HIVE-5970:
--

I just verified with the attached test data. This bug is solved by HIVE-6382.

 ArrayIndexOutOfBoundsException in RunLengthIntegerReaderV2.java
 ---

 Key: HIVE-5970
 URL: https://issues.apache.org/jira/browse/HIVE-5970
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.12.0
Reporter: Eric Chu
Priority: Critical
  Labels: orcfile
 Attachments: test_data


 A workload involving ORC tables starts getting the following 
 ArrayIndexOutOfBoundsException AFTER the upgrade to Hive 0.12. The affected 
 file was added as part of HIVE-4123. 
 2013-12-04 14:42:08,537 ERROR 
 cause:java.io.IOException: java.io.IOException: 
 java.lang.ArrayIndexOutOfBoundsException: 0
 2013-12-04 14:42:08,537 WARN org.apache.hadoop.mapred.Child: Error running 
 child
 java.io.IOException: java.io.IOException: 
 java.lang.ArrayIndexOutOfBoundsException: 0
 at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
 at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
 at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:304)
 at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:220)
 at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:215)
 at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:200)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
 at org.apache.hadoop.mapred.Child.main(Child.java:262)
 Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0
 at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
 at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
 at 
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
 at 
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
 at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
 ... 11 more
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
 at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readPatchedBaseValues(RunLengthIntegerReaderV2.java:171)
 at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:54)
 at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:287)
 at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:473)
 at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1157)
 at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2196)
 at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:129)
 at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:80)
 at 
 org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
 ... 15 more



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive

2014-02-25 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912235#comment-13912235
 ] 

Vaibhav Gumashta commented on HIVE-6466:


[~thejas] Thanks for the review! Sure, I'll resolve HIVE-6503 by the end of 
week.

 Add support for pluggable authentication modules (PAM) in Hive
 --

 Key: HIVE-6466
 URL: https://issues.apache.org/jira/browse/HIVE-6466
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6466.1.patch, HIVE-6466.2.patch


 More on PAM in these articles:
 http://www.tuxradar.com/content/how-pam-works
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html
 Usage from JPAM api: http://jpam.sourceforge.net/JPamUserGuide.html#id.s7.1
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization

2014-02-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912237#comment-13912237
 ] 

Gunther Hagleitner commented on HIVE-6455:
--

This is cool. Still reviewing, but some ideas:

- Instead of adding a column to the record to be used in the file sink, it'd be 
cleaner (and faster) to use the key to determine new files. I believe that 
could be achieved through startGroup/endGroup (a sketch follows below).
- Looks like we'd end up duplicating the partition, bucket, and sort columns in 
both key and value on the reduce sink. It might be possible to avoid that, 
making the intermediate output smaller, although I'm not sure; this might 
require additional changes to rebuild the row in the reduce task.
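A hedged sketch of the startGroup idea (Operator does expose startGroup/endGroup 
hooks; the writer-rolling helpers below are hypothetical):
{code}
// Roll to a new output file whenever a new reduce key -- i.e. a new dynamic
// partition/bucket -- starts, instead of reading the partition out of the row.
@Override
public void startGroup() throws HiveException {
  closeCurrentWriter();        // hypothetical helper: flush and close old file
  openWriterForCurrentKey();   // hypothetical helper: open file for the new key
  super.startGroup();
}
{code}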

 Scalable dynamic partitioning and bucketing optimization
 

 Key: HIVE-6455
 URL: https://issues.apache.org/jira/browse/HIVE-6455
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: optimization
 Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.2.patch, 
 HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, 
 HIVE-6455.6.patch, HIVE-6455.7.patch, HIVE-6455.8.patch


 The current implementation of dynamic partitioning works by keeping at least 
 one record writer open per dynamic partition directory. In the case of 
 bucketing there can be multispray file writers, which further add to the 
 number of open record writers. The record writers of column-oriented file 
 formats (like ORC, RCFile, etc.) keep in-memory buffers (value buffers or 
 compression buffers) open all the time to buffer up the rows and compress 
 them before flushing to disk. Since these buffers are maintained on a 
 per-column basis, the amount of constant memory required at runtime increases 
 as the number of partitions and the number of columns per partition grows. 
 This often leads to OutOfMemory (OOM) exceptions in mappers or reducers, 
 depending on the number of open record writers. Users often tune the JVM 
 heap size (runtime memory) to get over such OOM issues. 
 With this optimization, the dynamic partition columns and bucketing columns 
 (in the case of bucketed tables) are sorted before being fed to the reducers. 
 Since the partitioning and bucketing columns are sorted, each reducer can 
 keep only one record writer open at any time, thereby reducing the memory 
 pressure on the reducers. This optimization scales as the number of 
 partitions and the number of columns per partition increases, at the cost of 
 sorting the columns.
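 A hedged usage illustration (the config name is taken from the patch under 
 review and may change; table and column names are made up):
 {code}
 -- With the optimization on, rows are sorted by country/state before the
 -- reducers, so each reducer holds only one open record writer at a time.
 set hive.optimize.sort.dynamic.partition=true;
 INSERT OVERWRITE TABLE sales PARTITION (country, state)
 SELECT id, amount, country, state FROM staging_sales;
 {code}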



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6504) Refactor JDBC HiveConnection to use a factory to create client transport.

2014-02-25 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-6504:
--

 Summary: Refactor JDBC HiveConnection to use a factory to create 
client transport.
 Key: HIVE-6504
 URL: https://issues.apache.org/jira/browse/HIVE-6504
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


The client transport creation is quite messy. Need to clean it up.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5504) OrcOutputFormat honors compression properties only from within hive

2014-02-25 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-5504:
---

Attachment: HIVE-5504.2.patch

Updated patch per reviewboard comments.

 OrcOutputFormat honors  compression  properties only from within hive
 -

 Key: HIVE-5504
 URL: https://issues.apache.org/jira/browse/HIVE-5504
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0
Reporter: Venkat Ranganathan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-5504.2.patch, HIVE-5504.patch


 When we import data into a HCatalog table created with the following storage  
 description
 .. stored as orc tblproperties ("orc.compress"="SNAPPY") 
 the resultant orc file still uses the default zlib compression
 It looks like HCatOutputFormat is ignoring the tblproperties specified.   
 show tblproperties shows that the table indeed has the properties properly 
 saved.
 An insert/select into the table has the resulting orc file honor the tbl 
 property.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6393) Support unqualified column references in Joining conditions

2014-02-25 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912261#comment-13912261
 ] 

Gunther Hagleitner commented on HIVE-6393:
--

[~rhbutani] could you open a review board request for this one?

 Support unqualified column references in Joining conditions
 ---

 Key: HIVE-6393
 URL: https://issues.apache.org/jira/browse/HIVE-6393
 Project: Hive
  Issue Type: Improvement
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6393.1.patch, HIVE-6393.2.patch


 Support queries of the form:
 {noformat}
 create table r1(a int);
 create table r2(b int);
 select a, b
 from r1 join r2 on a = b
 {noformat}
 This becomes more useful in old style syntax:
 {noformat}
 select a, b
 from r1, r2
 where a = b
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6393) Support unqualified column references in Joining conditions

2014-02-25 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912264#comment-13912264
 ] 

Harish Butani commented on HIVE-6393:
-

https://reviews.apache.org/r/18293/

 Support unqualified column references in Joining conditions
 ---

 Key: HIVE-6393
 URL: https://issues.apache.org/jira/browse/HIVE-6393
 Project: Hive
  Issue Type: Improvement
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6393.1.patch, HIVE-6393.2.patch


 Support queries of the form:
 {noformat}
 create table r1(a int);
 create table r2(b int);
 select a, b
 from r1 join r2 on a = b
 {noformat}
 This becomes more useful in old style syntax:
 {noformat}
 select a, b
 from r1, r2
 where a = b
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Review Request 18492: HIVE-6473: Allow writing HFiles via HBaseStorageHandler table

2014-02-25 Thread nick dimiduk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18492/
---

Review request for hive.


Bugs: HIVE-6473
https://issues.apache.org/jira/browse/HIVE-6473


Repository: hive-git


Description
---

From the JIRA:

Generating HFiles for bulkload into HBase could be more convenient. Right now 
we require the user to register a new table with the appropriate output format. 
This patch allows the exact same functionality, but through an existing table 
managed by the HBaseStorageHandler.
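A hedged sketch of the intended usage (property names are inferred from the test 
names in the diff below, e.g. generatehfiles_require_family_path.q, and may not 
match the final patch; table names are made up):

{code}
-- Write HFiles for bulk load through an existing HBaseStorageHandler table.
set hive.hbase.generatehfiles=true;
set hfile.family.path=/tmp/hbase_bulk/cf;  -- target column-family directory
INSERT OVERWRITE TABLE hbase_managed_table SELECT key, value FROM source_table;
{code}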


Diffs
-

  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java 
8cd594b 
  
hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java 
6d383b5 
  hbase-handler/src/test/queries/negative/generatehfiles_require_family_path.q 
PRE-CREATION 
  hbase-handler/src/test/queries/positive/hbase_bulk.m f8bb47d 
  hbase-handler/src/test/queries/positive/hbase_bulk.q PRE-CREATION 
  hbase-handler/src/test/queries/positive/hbase_handler_bulk.q PRE-CREATION 
  
hbase-handler/src/test/results/negative/generatehfiles_require_family_path.q.out
 PRE-CREATION 
  hbase-handler/src/test/results/positive/hbase_handler_bulk.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/18492/diff/


Testing
---


Thanks,

nick dimiduk



[jira] [Commented] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table

2014-02-25 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912291#comment-13912291
 ] 

Nick Dimiduk commented on HIVE-6473:


Sure thing. I've opened https://reviews.apache.org/r/18492/ .

 Allow writing HFiles via HBaseStorageHandler table
 --

 Key: HIVE-6473
 URL: https://issues.apache.org/jira/browse/HIVE-6473
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
 Attachments: HIVE-6473.0.patch.txt


 Generating HFiles for bulkload into HBase could be more convenient. Right now 
 we require the user to register a new table with the appropriate output 
 format. This patch allows the exact same functionality, but through an 
 existing table managed by the HBaseStorageHandler.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5504) OrcOutputFormat honors compression properties only from within hive

2014-02-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912302#comment-13912302
 ] 

Thejas M Nair commented on HIVE-5504:
-

+1

 OrcOutputFormat honors  compression  properties only from within hive
 -

 Key: HIVE-5504
 URL: https://issues.apache.org/jira/browse/HIVE-5504
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0
Reporter: Venkat Ranganathan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-5504.2.patch, HIVE-5504.patch


 When we import data into a HCatalog table created with the following storage  
 description
 .. stored as orc tblproperties ("orc.compress"="SNAPPY") 
 the resultant orc file still uses the default zlib compression
 It looks like HCatOutputFormat is ignoring the tblproperties specified.   
 show tblproperties shows that the table indeed has the properties properly 
 saved.
 An insert/select into the table has the resulting orc file honor the tbl 
 property.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6456) Implement Parquet schema evolution

2014-02-25 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-6456:
--

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks to Brock for the patch.

 Implement Parquet schema evolution
 --

 Key: HIVE-6456
 URL: https://issues.apache.org/jira/browse/HIVE-6456
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Trivial
 Fix For: 0.13.0

 Attachments: HIVE-6456.patch


 In HIVE-5783 we removed schema evolution:
 https://github.com/Parquet/parquet-mr/pull/297/files#r9824155



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6505) Make stats optimizer more robust in presence of distinct clause

2014-02-25 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-6505:
--

 Summary: Make stats optimizer more robust in presence of distinct 
clause
 Key: HIVE-6505
 URL: https://issues.apache.org/jira/browse/HIVE-6505
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


Currently it throws exceptions in a few cases.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6505) Make stats optimizer more robust in presence of distinct clause

2014-02-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6505:
---

Attachment: HIVE-6505.patch

More checks to make sure the stats optimizer fires correctly.
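A hedged example of the query shape involved (table and column names are made 
up; the new distinct_stats.q test presumably covers similar cases):
{code}
-- The stats optimizer may try to answer this from metastore statistics alone;
-- the added checks make sure it fires only when it can do so correctly.
select count(distinct a) from t1;
{code}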

 Make stats optimizer more robust in presence of distinct clause
 ---

 Key: HIVE-6505
 URL: https://issues.apache.org/jira/browse/HIVE-6505
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6505.patch


 Currently it throws exceptions in a few cases.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6505) Make stats optimizer more robust in presence of distinct clause

2014-02-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6505:
---

Status: Patch Available  (was: In Progress)

 Make stats optimizer more robust in presence of distinct clause
 ---

 Key: HIVE-6505
 URL: https://issues.apache.org/jira/browse/HIVE-6505
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6505.patch


 Currently it throws exceptions in a few cases.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Work started] (HIVE-6505) Make stats optimizer more robust in presence of distinct clause

2014-02-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-6505 started by Ashutosh Chauhan.

 Make stats optimizer more robust in presence of distinct clause
 ---

 Key: HIVE-6505
 URL: https://issues.apache.org/jira/browse/HIVE-6505
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6505.patch


 Currently it throws exceptions in a few cases.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Review Request 18494: Fix stats optimizer for distinct clause.

2014-02-25 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18494/
---

Review request for hive.


Bugs: HIVE-6505
https://issues.apache.org/jira/browse/HIVE-6505


Repository: hive-git


Description
---

Fix stats optimizer for distinct clause.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 1d23449 
  ql/src/test/queries/clientpositive/distinct_stats.q PRE-CREATION 
  ql/src/test/results/clientpositive/distinct_stats.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/18494/diff/


Testing
---

Added new .q test


Thanks,

Ashutosh Chauhan



[jira] [Commented] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors

2014-02-25 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912318#comment-13912318
 ] 

Xuefu Zhang commented on HIVE-6414:
---

Quick comment on the code change:

Hive doesn't throw a runtime exception when the data isn't right. In terms of 
error handling, Hive returns null for data errors, including data out of 
bounds, as in this case.
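A hedged illustration of that convention (names are made up, not the patch 
itself; uses org.apache.hadoop.io.IntWritable and 
org.apache.hadoop.hive.serde2.io.ShortWritable):
{code}
// Convert a Parquet-produced IntWritable to the declared smallint type,
// returning null for out-of-bounds values instead of throwing.
static ShortWritable asShort(IntWritable w) {
  if (w == null) {
    return null;
  }
  int v = w.get();
  if (v < Short.MIN_VALUE || v > Short.MAX_VALUE) {
    return null;  // data error: value out of bounds for the declared type
  }
  return new ShortWritable((short) v);
}
{code}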

 ParquetInputFormat provides data values that do not match the object 
 inspectors
 ---

 Key: HIVE-6414
 URL: https://issues.apache.org/jira/browse/HIVE-6414
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Justin Coffey
  Labels: Parquet
 Fix For: 0.13.0

 Attachments: HIVE-6414.2.patch, HIVE-6414.patch


 While working on HIVE-5998 I noticed that the ParquetRecordReader returns 
 IntWritable for all 'int like' types, in disaccord with the row object 
 inspectors. I thought that was fine and worked my way around it. But I see now 
 that the issue triggers failures in other places, e.g. in aggregates:
 {noformat}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row 
 {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast 
 to java.lang.Short
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 ... 9 more
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
 cannot be cast to java.lang.Short
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
 ... 15 more
 {noformat}
 My test is (I'm writing a test .q from HIVE-5998, but the repro does not 
 involve vectorization):
 {noformat}
 create table if not exists alltypes_parquet (
   cint int,
   ctinyint tinyint,
   csmallint smallint,
   cfloat float,
   cdouble double,
   cstring1 string) stored as parquet;
 insert overwrite table alltypes_parquet
   select cint,
 ctinyint,
 csmallint,
 cfloat,
 cdouble,
 cstring1
   from alltypesorc;
 explain select * from alltypes_parquet limit 10; select * from 
 alltypes_parquet limit 10;
 explain select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors

2014-02-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912320#comment-13912320
 ] 

Hive QA commented on HIVE-6414:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630955/HIVE-6414.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5196 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_types
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1491/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1491/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630955

 ParquetInputFormat provides data values that do not match the object 
 inspectors
 ---

 Key: HIVE-6414
 URL: https://issues.apache.org/jira/browse/HIVE-6414
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Justin Coffey
  Labels: Parquet
 Fix For: 0.13.0

 Attachments: HIVE-6414.2.patch, HIVE-6414.patch


 While working on HIVE-5998 I noticed that the ParquetRecordReader returns 
 IntWritable for all 'int like' types, in disaccord with the row object 
 inspectors. I thought that was fine and worked my way around it. But I see now 
 that the issue triggers failures in other places, e.g. in aggregates:
 {noformat}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row 
 {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast 
 to java.lang.Short
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 ... 9 more
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
 cannot be cast to java.lang.Short
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
 ... 15 more
 {noformat}
 My test is (I'm writing a test .q from HIVE-5998, but the repro does not 
 involve vectorization):
 {noformat}
 create table if not exists alltypes_parquet (
   cint int,
   ctinyint tinyint,
   csmallint smallint,
   cfloat float,
   cdouble double,
   cstring1 string) stored as parquet;
 insert overwrite table alltypes_parquet
   select cint,
 ctinyint,
 csmallint,
 cfloat,
 cdouble,
 cstring1
   from 

[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf

2014-02-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912323#comment-13912323
 ] 

Thejas M Nair commented on HIVE-6037:
-

I think it is better to commit this now, rather than later.
When will you be able to rebase? Once it's ready and reviewed, we should commit 
this one without waiting for another 24 hours (so that it does not go stale). 
Maybe have more than one committer review it instead. What do people think?




 Synchronize HiveConf with hive-default.xml.template and support show conf
 -

 Key: HIVE-6037
 URL: https://issues.apache.org/jira/browse/HIVE-6037
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.13.0

 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037.1.patch.txt, 
 HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, HIVE-6037.12.patch.txt, 
 HIVE-6037.14.patch.txt, HIVE-6037.15.patch.txt, HIVE-6037.2.patch.txt, 
 HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, HIVE-6037.6.patch.txt, 
 HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, HIVE-6037.9.patch.txt, 
 HIVE-6037.patch


 see HIVE-5879



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6506) hcatalog should automatically work with new tableproperties in ORC

2014-02-25 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-6506:
---

 Summary: hcatalog should automatically work with new 
tableproperties in ORC
 Key: HIVE-6506
 URL: https://issues.apache.org/jira/browse/HIVE-6506
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Thejas M Nair


HIVE-5504 has changes to handle existing table properties for the ORC file 
format, but it does not automatically pick up newly added table properties. We 
should refactor ORC so that its table property list can be determined 
automatically.




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6506) hcatalog should automatically work with new tableproperties in ORC

2014-02-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912339#comment-13912339
 ] 

Thejas M Nair commented on HIVE-6506:
-

ORC should return a list of table property names (maybe use an enum instead of 
final strings), and HCatalog should add that list to the jobconf.


 hcatalog should automatically work with new tableproperties in ORC
 --

 Key: HIVE-6506
 URL: https://issues.apache.org/jira/browse/HIVE-6506
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Thejas M Nair

 HIVE-5504 has changes to handle existing table properties for the ORC file 
 format, but it does not automatically pick up newly added table properties. We 
 should refactor ORC so that its table property list can be determined 
 automatically.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6434) Restrict function create/drop to admin roles

2014-02-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912352#comment-13912352
 ] 

Thejas M Nair commented on HIVE-6434:
-

I think we should require admin privileges for temporary functions as well. 
This is not a backward compatibility issue as the requirement would apply only 
if the new sql standard auth is enabled.


 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6507) OrcFile table property names are specified as strings

2014-02-25 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-6507:
--

 Summary: OrcFile table property names are specified as strings
 Key: HIVE-6507
 URL: https://issues.apache.org/jira/browse/HIVE-6507
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan


In HIVE-5504, we had to do some special casing in HCatalog to add a particular 
set of orc table properties from table properties to job properties.

In doing so, it's obvious that that is a bit cumbersome, and ideally, the list 
of all orc file table properties should really be an enum, rather than 
individual loosely tied constant strings. If we were to clean this up, we can 
clean up other code that references this to reference the entire enum, and 
avoid future errors when new table properties are introduced, but other 
referencing code is not updated.
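A hedged sketch of the enum approach (constant names and the second entry are 
illustrative; "orc.compress" is the property discussed in HIVE-5504):
{code}
public enum OrcTableProperties {
  COMPRESSION("orc.compress"),
  STRIPE_SIZE("orc.stripe.size");  // illustrative second entry

  private final String propName;
  OrcTableProperties(String propName) { this.propName = propName; }
  public String getPropName() { return propName; }
}

// Referencing code (e.g. HCatalog) could then copy ALL ORC properties in one
// loop, and never goes stale when a new property is added to the enum:
for (OrcTableProperties p : OrcTableProperties.values()) {
  String v = tableProps.getProperty(p.getPropName());
  if (v != null) {
    jobProps.put(p.getPropName(), v);
  }
}
{code}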



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6507) OrcFile table property names are specified as strings

2014-02-25 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6507:
---

Attachment: HIVE-6507.patch

Attaching patch. This applies on top of HIVE-5504, and depends on that being 
committed first.

 OrcFile table property names are specified as strings
 -

 Key: HIVE-6507
 URL: https://issues.apache.org/jira/browse/HIVE-6507
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6507.patch


 In HIVE-5504, we had to do some special casing in HCatalog to add a 
 particular set of orc table properties from table properties to job 
 properties.
 In doing so, it's obvious that that is a bit cumbersome, and ideally, the 
 list of all orc file table properties should really be an enum, rather than 
 individual loosely tied constant strings. If we were to clean this up, we can 
 clean up other code that references this to reference the entire enum, and 
 avoid future errors when new table properties are introduced, but other 
 referencing code is not updated.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6137) Hive should report that the file/path doesn’t exist when it doesn’t (it now reports SocketTimeoutException)

2014-02-25 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6137:


Attachment: HIVE-6137.2.patch

cc-ing [~thejas] for review. I have made changes in HiveMetaStore to throw a 
MetaException, which gets caught at the client side.
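A hedged sketch of that server-side change (the exact check site in 
HiveMetaStore is not shown; names are illustrative):
{code}
// Fail fast with a descriptive MetaException -- which Thrift carries back to
// the client -- instead of letting the call time out on a missing path.
Path tablePath = new Path(location);
if (!tablePath.getFileSystem(conf).exists(tablePath)) {
  throw new MetaException("Path does not exist: " + tablePath);
}
{code}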

 Hive should report that the file/path doesn’t exist when it doesn’t (it now 
 reports SocketTimeoutException)
 ---

 Key: HIVE-6137
 URL: https://issues.apache.org/jira/browse/HIVE-6137
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6137.1.patch, HIVE-6137.2.patch


 Hive should report that the file/path doesn’t exist when it doesn’t (it now 
 reports SocketTimeoutException):
 Execute a Hive DDL query with a reference to a non-existent blob (such as 
 CREATE EXTERNAL TABLE...) and check Hive logs (stderr):
 FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
 java.net.SocketTimeoutException: Read timed out
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 This error message is not intuitive. If a file doesn't exist, Hive should 
 report a FileNotFoundException.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HIVE-6506) hcatalog should automatically work with new tableproperties in ORC

2014-02-25 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair resolved HIVE-6506.
-

Resolution: Duplicate

Resolving as duplicate


 hcatalog should automatically work with new tableproperties in ORC
 --

 Key: HIVE-6506
 URL: https://issues.apache.org/jira/browse/HIVE-6506
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Thejas M Nair

 HIVE-5504 has changes to handle existing table properties for the ORC file 
 format, but it does not automatically pick up newly added table properties. We 
 should refactor ORC so that its table property list can be determined 
 automatically.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912361#comment-13912361
 ] 

Lefty Leverenz commented on HIVE-5843:
--

Thanks Alan, a doc JIRA seems like a good idea for this.

About the nit, I'm sure that "partitions need compacted" sounds wrong -- maybe 
you meant "need compaction" or maybe I misunderstood the concept.  I read it as 
similar to "The dishes in the sink need cleaned" vs. "need to be cleaned" or 
"need cleaning."  Not so?

But I can't cite a grammar rule without doing some research.  So far I've found 
out that "need" can be a regular verb or a modal.  "I need to look it up," but I 
"need not obsess over it," respectively.

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.7.patch, HIVE-5843.patch, 
 HiveTransactionManagerDetailedDesign (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server

2014-02-25 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912370#comment-13912370
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-5099:
-

non-binding +1. cc-ing [~ashutoshc], [~thejas] for review. HIVE-5218 should be 
marked as a duplicate of this jira, since this upgrades datanucleus to an even 
newer version.

 Some partition publish operation cause OOM in metastore backed by SQL Server
 

 Key: HIVE-5099
 URL: https://issues.apache.org/jira/browse/HIVE-5099
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Windows
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-5099-1.patch, HIVE-5099-2.patch


 For certain combinations of metastore operations, the operation hangs and the 
 metastore server eventually fails due to OOM. This happens when the metastore 
 is backed by SQL Server. Here is a test case to reproduce:
 {code}
 CREATE TABLE tbl_repro_oom1 (a STRING, b INT) PARTITIONED BY (c STRING, d 
 STRING);
 CREATE TABLE tbl_repro_oom_2 (a STRING ) PARTITIONED BY (e STRING);
 ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='France', d=4);
 ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='Russia', d=3);
 ALTER TABLE tbl_repro_oom_2 ADD PARTITION (e='Russia');
 ALTER TABLE tbl_repro_oom1 DROP PARTITION (c = 'India'); --failure
 {code}
 The code causing the issue is in ExpressionTree.java:
 {code}
 valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual
     + "\")+" + keyEqualLength + ").substring(0, "
     + "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+"
     + keyEqualLength + ").indexOf(\"/\"))";
 {code}
 The snapshot of table partition before the drop partition statement is:
 {code}
 PART_ID  CREATE_TIME  LAST_ACCESS_TIME  PART_NAME      SD_ID  TBL_ID
 93       1376526718   0                 c=France/d=4   127    33
 94       1376526718   0                 c=Russia/d=3   128    33
 95       1376526718   0                 e=Russia       129    34
 {code}
 The Datanucleus query tries to find the value of a particular key by locating 
 "$key=" as the start and "/" as the end. For example, it finds the value of 
 "c" in "c=France/d=4" by locating "c=" as the start and the following "/" as 
 the end. However, this query fails when finding the value of "e" in 
 "e=Russia", since there is no trailing "/".
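 A hedged plain-Java rendering of that extraction, mirroring what the generated 
 JDOQL asks the datastore to evaluate (values come from the snapshot above):
 {code}
 String partitionName = "e=Russia";
 String keyEqual = "e=";
 String rest = partitionName.substring(
     partitionName.indexOf(keyEqual) + keyEqual.length());  // "Russia"
 // rest.indexOf("/") == -1, so substring(0, -1) fails: pure Java throws
 // StringIndexOutOfBoundsException; translated to SQL Server it becomes the
 // "Invalid length parameter" error shown below.
 String val = rest.substring(0, rest.indexOf("/"));
 {code}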
 Other databases work since their query plans first filter out the partitions 
 not belonging to tbl_repro_oom1; whether this error surfaces or not depends on 
 the query optimizer.
 When this exception happens, the metastore keeps retrying and throwing 
 exceptions. The memory image of the metastore contains a large number of 
 exception objects:
 {code}
 com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter 
 passed to the LEFT or SUBSTRING function.
   at 
 com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
   at 
 com.microsoft.sqlserver.jdbc.SQLServerResultSet$FetchBuffer.nextRow(SQLServerResultSet.java:4762)
   at 
 com.microsoft.sqlserver.jdbc.SQLServerResultSet.fetchBufferNext(SQLServerResultSet.java:1682)
   at 
 com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:955)
   at 
 org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
   at 
 org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
   at 
 org.datanucleus.store.rdbms.query.ForwardQueryResult.init(ForwardQueryResult.java:90)
   at 
 org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:686)
   at org.datanucleus.store.query.Query.executeQuery(Query.java:1791)
   at org.datanucleus.store.query.Query.executeWithMap(Query.java:1694)
   at org.datanucleus.api.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1715)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1590)
   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at 
 org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
   at $Proxy4.getPartitionsByFilter(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2163)
   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at 
 

[jira] [Updated] (HIVE-6507) OrcFile table property names are specified as strings

2014-02-25 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6507:
---

Affects Version/s: 0.13.0

 OrcFile table property names are specified as strings
 -

 Key: HIVE-6507
 URL: https://issues.apache.org/jira/browse/HIVE-6507
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6507.patch


 In HIVE-5504, we had to do some special casing in HCatalog to add a 
 particular set of orc table properties from table properties to job 
 properties.
 In doing so, it's obvious that that is a bit cumbersome, and ideally, the 
 list of all orc file table properties should really be an enum, rather than 
 individual loosely tied constant strings. If we were to clean this up, we can 
 clean up other code that references this to reference the entire enum, and 
 avoid future errors when new table properties are introduced, but other 
 referencing code is not updated.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6507) OrcFile table property names are specified as strings

2014-02-25 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6507:
---

Component/s: Serializers/Deserializers
 HCatalog

 OrcFile table property names are specified as strings
 -

 Key: HIVE-6507
 URL: https://issues.apache.org/jira/browse/HIVE-6507
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Serializers/Deserializers
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6507.patch


 In HIVE-5504, we had to do some special casing in HCatalog to add a 
 particular set of orc table properties from table properties to job 
 properties.
 In doing so, it's obvious that that is a bit cumbersome, and ideally, the 
 list of all orc file table properties should really be an enum, rather than 
 individual loosely tied constant strings. If we were to clean this up, we can 
 clean up other code that references this to reference the entire enum, and 
 avoid future errors when new table properties are introduced, but other 
 referencing code is not updated.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6434) Restrict function create/drop to admin roles

2014-02-25 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912391#comment-13912391
 ] 

Jason Dere commented on HIVE-6434:
--

Ok, I can add the restriction on temp functions/macros back to the patch. 

 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6434) Restrict function create/drop to admin roles

2014-02-25 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6434:
-

Release Note:   (was: Restrict function create/drop to admin roles, if sql 
std auth is enabled. This would include temp/permanent functions, as well as 
macros.
)

 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6434) Restrict function create/drop to admin roles

2014-02-25 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6434:
-

Description: 
Restrict function create/drop to admin roles, if sql std auth is enabled. This 
would include temp/permanent functions, as well as macros.


 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch


 Restrict function create/drop to admin roles, if sql std auth is enabled. 
 This would include temp/permanent functions, as well as macros.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18459: FS based stats.

2014-02-25 Thread Lefty Leverenz


 On Feb. 25, 2014, 9:36 p.m., Gunther Hagleitner wrote:
  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 626
  https://reviews.apache.org/r/18459/diff/1/?file=503283#file503283line626
 
  Do you need to update hive-site template + test hive-site too?

The template file will be generated from HiveConf.java after HIVE-6037 gets 
committed, so updating it would be wasted effort.  But a parameter description 
is needed; it can go in a comment for now, but once HIVE-6037 commits, the 
description has to be part of the parameter definition, like this example:

CLIPROMPT("hive.cli.prompt", "hive",
    "Command line prompt configuration value. Other hiveconf can be used in " +
    "this configuration value. \n" +
    "Variable substitution will only be invoked at the Hive CLI startup."),


- Lefty


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18459/#review35452
---


On Feb. 25, 2014, 8:09 a.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18459/
 ---
 
 (Updated Feb. 25, 2014, 8:09 a.m.)
 
 
 Review request for hive and Navis Ryu.
 
 
 Bugs: HIVE-6500
 https://issues.apache.org/jira/browse/HIVE-6500
 
 
 Repository: hive
 
 
 Description
 ---
 
 FS based stats collection.
 
 
 Diffs
 -
 
   trunk/common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java 
 1571554 
   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 
 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
 1571554 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsAggregator.java 
 1571554 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsAggregatorTez.java
  1571554 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsPublisher.java 
 1571554 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsCollectionTaskIndependent.java
  PRE-CREATION 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsFactory.java 1571554 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsAggregator.java 
 PRE-CREATION 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/fs/FSStatsPublisher.java 
 PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/statsfs.q PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/statsfs.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18459/diff/
 
 
 Testing
 ---
 
 Added new tests.
 
 
 Thanks,
 
 Ashutosh Chauhan
 




[jira] [Commented] (HIVE-4545) HS2 should return describe table results without space padding

2014-02-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13912418#comment-13912418
 ] 

Ashutosh Chauhan commented on HIVE-4545:


+1

 HS2 should return describe table results without space padding
 --

 Key: HIVE-4545
 URL: https://issues.apache.org/jira/browse/HIVE-4545
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch, 
 HIVE-4545.4.patch, HIVE-4545.5.patch


 HIVE-3140 changed the behavior of 'DESCRIBE table;' to be like 'DESCRIBE 
 FORMATTED table;', and it stopped printing the header for 'DESCRIBE table;'. 
 But JDBC/ODBC calls still get fields padded with spaces for the 'DESCRIBE 
 table;' query.
 As the JDBC/ODBC results are not for direct human consumption, the space 
 padding should not be done by HiveServer2.
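
To make the distinction concrete, here is a toy illustration (not Hive's code) 
of the human-oriented padded rendering versus the raw values a JDBC/ODBC client 
should receive:

    // Toy illustration only: fixed-width padding is for human-readable CLI
    // output; drivers should get raw values and format them as needed.
    public class DescribePaddingDemo {
        public static void main(String[] args) {
            // CLI-style rendering: each field padded with spaces to a fixed width.
            System.out.println(String.format("%-20s%-20s%-20s", "id", "int", "user id"));
            // Client-style rendering: raw values, no padding.
            System.out.println(String.join("\t", "id", "int", "user id"));
        }
    }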



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6434) Restrict function create/drop to admin roles

2014-02-25 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6434:
-

Attachment: HIVE-6434.4.patch

Patch v4 adds back the restriction of temp function/macro creation to admin 
roles. This is only in effect if SQL standard authorization is enabled.

 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch, 
 HIVE-6434.4.patch


 Restrict function create/drop to admin roles, if sql std auth is enabled. 
 This would include temp/permanent functions, as well as macros.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18162: HIVE-6434: Restrict function create/drop to admin roles

2014-02-25 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18162/
---

(Updated Feb. 26, 2014, 2:10 a.m.)


Review request for hive and Thejas Nair.


Changes
---

Adds back the restriction of temp function/macro creation to admin roles. This is 
only in effect if SQL standard authorization is enabled.


Bugs: HIVE-6434
https://issues.apache.org/jira/browse/HIVE-6434


Repository: hive-git


Description
---

Add an output entity for the DB object so that only admin roles can add/drop 
functions/macros.
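
As a toy model of the enforcement being added (all names are illustrative, not 
Hive's actual API): function/macro DDL operations map to an admin-only check 
that applies only when SQL standard authorization is enabled.

    import java.util.EnumSet;
    import java.util.Set;

    // Illustrative model only: function/macro DDL requires the ADMIN role when
    // SQL standard authorization is enabled; otherwise it is allowed as before.
    public class AdminOnlyFunctionDdl {
        enum Op { CREATE_FUNCTION, DROP_FUNCTION, CREATE_MACRO, DROP_MACRO }

        static final Set<Op> ADMIN_ONLY = EnumSet.allOf(Op.class);

        static void authorize(Op op, boolean sqlStdAuthEnabled, boolean callerIsAdmin) {
            if (sqlStdAuthEnabled && ADMIN_ONLY.contains(op) && !callerIsAdmin) {
                throw new SecurityException(
                    op + " requires the ADMIN role under SQL standard authorization");
            }
        }

        public static void main(String[] args) {
            authorize(Op.CREATE_FUNCTION, true, true);   // admin user: allowed
            authorize(Op.CREATE_MACRO, false, false);    // auth disabled: allowed
            authorize(Op.DROP_FUNCTION, true, false);    // non-admin: throws
        }
    }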


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 
68a25e0 
  ql/src/java/org/apache/hadoop/hive/ql/parse/MacroSemanticAnalyzer.java 
0ae07e3 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java
 c43bcea 
  ql/src/test/queries/clientnegative/authorization_create_func1.q PRE-CREATION 
  ql/src/test/queries/clientnegative/authorization_create_func2.q PRE-CREATION 
  ql/src/test/queries/clientnegative/authorization_create_macro1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/authorization_create_func1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/authorization_create_macro1.q PRE-CREATION 
  ql/src/test/results/clientnegative/authorization_create_func1.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/authorization_create_func2.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/authorization_create_macro1.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/cluster_tasklog_retrieval.q.out 747aa6a 
  ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
393a3e8 
  ql/src/test/results/clientnegative/create_function_nonexistent_db.q.out 
ebb069e 
  ql/src/test/results/clientnegative/create_function_nonudf_class.q.out dd66afc 
  ql/src/test/results/clientnegative/create_udaf_failure.q.out 3fc3d36 
  ql/src/test/results/clientnegative/create_unknown_genericudf.q.out af3d50b 
  ql/src/test/results/clientnegative/create_unknown_udf_udaf.q.out e138fd0 
  ql/src/test/results/clientnegative/drop_native_udf.q.out 1913df9 
  ql/src/test/results/clientnegative/udf_function_does_not_implement_udf.q.out 
9ea8668 
  ql/src/test/results/clientnegative/udf_local_resource.q.out b6ea77d 
  ql/src/test/results/clientnegative/udf_nonexistent_resource.q.out ad70d54 
  ql/src/test/results/clientnegative/udf_test_error.q.out a788a10 
  ql/src/test/results/clientnegative/udf_test_error_reduce.q.out 98b42e0 
  ql/src/test/results/clientpositive/authorization_create_func1.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/authorization_create_macro1.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/autogen_colalias.q.out a074b96 
  ql/src/test/results/clientpositive/compile_processor.q.out 7e9bb29 
  ql/src/test/results/clientpositive/create_func1.q.out 5a249c3 
  ql/src/test/results/clientpositive/create_genericudaf.q.out 96fe2fa 
  ql/src/test/results/clientpositive/create_genericudf.q.out bf1f4ac 
  ql/src/test/results/clientpositive/create_udaf.q.out 2e86a36 
  ql/src/test/results/clientpositive/create_view.q.out ecc7618 
  ql/src/test/results/clientpositive/drop_udf.q.out 422933a 
  ql/src/test/results/clientpositive/macro.q.out c483029 
  ql/src/test/results/clientpositive/ptf_register_tblfn.q.out 11c9724 
  ql/src/test/results/clientpositive/udaf_sum_list.q.out b1922d9 
  ql/src/test/results/clientpositive/udf_compare_java_string.q.out 8e6e365 
  ql/src/test/results/clientpositive/udf_context_aware.q.out 10414fa 
  ql/src/test/results/clientpositive/udf_logic_java_boolean.q.out 88c1984 
  ql/src/test/results/clientpositive/udf_testlength.q.out 4d75482 
  ql/src/test/results/clientpositive/udf_testlength2.q.out 8a1e03e 
  ql/src/test/results/clientpositive/udf_using.q.out 69e5f3b 
  ql/src/test/results/clientpositive/windowing_udaf2.q.out 5043a45 

Diff: https://reviews.apache.org/r/18162/diff/


Testing
---

positive/negative q files added


Thanks,

Jason Dere



[jira] [Updated] (HIVE-5843) Transaction manager for Hive

2014-02-25 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-5843:
-

Status: Open  (was: Patch Available)

Found an NPE that shows up on a cluster but not in .q file tests.  Will post a 
new version of the patch soon.

 Transaction manager for Hive
 

 Key: HIVE-5843
 URL: https://issues.apache.org/jira/browse/HIVE-5843
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.12.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0

 Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
 HIVE-5843-src-only.patch, HIVE-5843.2.patch, HIVE-5843.3-src.path, 
 HIVE-5843.3.patch, HIVE-5843.4-src.patch, HIVE-5843.4.patch, 
 HIVE-5843.6.patch, HIVE-5843.7.patch, HIVE-5843.patch, 
 HiveTransactionManagerDetailedDesign (1).pdf


 As part of the ACID work proposed in HIVE-5317 a transaction manager is 
 required.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

