[jira] [Commented] (HIVE-3050) JDBC should provide metadata for columns whether a column is a partition column or not

2014-02-21 Thread chandra sekhar gunturi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908102#comment-13908102
 ] 

chandra sekhar gunturi commented on HIVE-3050:
--

This is required even for creating the LOAD command when the Hive table is partitioned. 
When the Hive table is partitioned, the LOAD command should have a PARTITION 
clause. To generate the LOAD command programmatically from the metadata, this 
functionality is useful. 
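
A rough sketch of that use case, assuming the proposed (non-standard) IS_PARTITION_COLUMN 
column is exposed; the helper class and placeholder values are purely illustrative:

{code}
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collect partition columns from the metadata and build a
// LOAD DATA template that includes the PARTITION clause when needed.
public class LoadCommandBuilder {
  public static String buildLoadTemplate(Connection connection, String tableName) throws Exception {
    DatabaseMetaData metaData = connection.getMetaData();
    ResultSet rs = metaData.getColumns(null, null, tableName, null);
    List<String> partitionCols = new ArrayList<String>();
    while (rs.next()) {
      if (rs.getBoolean("IS_PARTITION_COLUMN")) {   // the column this issue proposes
        partitionCols.add(rs.getString("COLUMN_NAME"));
      }
    }
    StringBuilder load = new StringBuilder("LOAD DATA INPATH '<path>' INTO TABLE " + tableName);
    if (!partitionCols.isEmpty()) {
      load.append(" PARTITION (");
      for (int i = 0; i < partitionCols.size(); i++) {
        if (i > 0) load.append(", ");
        load.append(partitionCols.get(i)).append("='<value>'");
      }
      load.append(")");
    }
    return load.toString();
  }
}
{code}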

 JDBC should provide metadata for columns whether a column is a partition 
 column or not
 --

 Key: HIVE-3050
 URL: https://issues.apache.org/jira/browse/HIVE-3050
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Minor

 Trivial request from UI developers. 
 {code}
 DatabaseMetaData databaseMetaData = connection.getMetaData();
 ResultSet rs = databaseMetaData.getColumns(null, null, tableName, null);
 
 while (rs.next()) {
   boolean partitionKey = rs.getBoolean(IS_PARTITION_COLUMN);
 }
 {code}
 It's not a standard JDBC column, but it seemed like it would be useful.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Parquet support (HIVE-5783)

2014-02-21 Thread Brock Noland
Hi,

Storage handlers muddy the waters a bit, IMO. That interface was
written for storage that is not file-based, e.g. HBase, whereas Avro,
Parquet, SequenceFile, etc. are all file-based.
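
To make the distinction concrete, here is a minimal sketch of the two DDL forms
as issued over JDBC (the connection URL, table names, and HBase column mapping
are all illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class NativeVsNonNative {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "user", "");
    Statement stmt = conn.createStatement();
    // Native, file-based table: Hive manages the storage itself (STORED AS).
    stmt.execute("CREATE TABLE native_orc (id INT, name STRING) STORED AS ORC");
    // Non-native table: a storage handler mediates access (STORED BY), e.g. HBase.
    stmt.execute("CREATE TABLE hbase_backed (key INT, value STRING) "
        + "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' "
        + "WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:value')");
    stmt.close();
    conn.close();
  }
}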

I think we have to be practical about confusion. There are so many
Hadoop newbies out there, almost all of them new to Apache as well,
that there is going to be some confusion. For example, one person who
had been using Hadoop and Hive for a few months said to me that Hive moved
*from* Apache to Hortonworks. At the end of the day, regardless of
what we do, some level of confusion is going to persist amongst those
new to the ecosystem.

With that said, I do think that an overview of Hive Storage would be
a great addition to our documentation.

Brock

On Fri, Feb 21, 2014 at 1:27 AM, Lefty Leverenz leftylever...@gmail.com wrote:
 This is in the Terminology section
 (https://cwiki.apache.org/confluence/display/Hive/StorageHandlers#StorageHandlers-Terminology)
 of the Storage Handlers doc:

 Storage handlers introduce a distinction between *native* and
 *non-native* tables.
 A native table is one which Hive knows how to manage and access without a
 storage handler; a non-native table is one which requires a storage handler.


 It goes on to say that non-native tables are created with a STORED BY
 clause (as opposed to a STORED AS clause).

 Does that clarify or muddy the waters?


 -- Lefty


 On Thu, Feb 20, 2014 at 7:37 PM, Lefty Leverenz 
 leftylever...@gmail.comwrote:

 Some of these issues can be addressed in the documentation.  The File
 Formats section of the Language Manual needs an overview, and that might
 be a good place to explain the differences between Hive-owned formats and
 external formats.  Or the SerDe doc could be beefed up:  Built-in SerDes
 (https://cwiki.apache.org/confluence/display/Hive/SerDe#SerDe-Built-inSerDes).

 In the meantime, I've added a link to the Avro doc in the File Formats
 list and mentioned Parquet in DDL's Row Format, Storage Format, and SerDe section
 (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RowFormat,StorageFormat,andSerDe):

 Use STORED AS PARQUET (without ROW FORMAT SERDE) for the Parquet
 (https://cwiki.apache.org/confluence/display/Hive/Parquet) columnar
 storage format in Hive 0.13.0 and later
 (https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Hive0.13andlater);
 or use ROW FORMAT SERDE ... STORED AS INPUTFORMAT ... OUTPUTFORMAT ... in Hive
 0.10, 0.11, or 0.12
 (https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Hive0.10-0.12).


 Does that work?

 -- Lefty


 On Tue, Feb 18, 2014 at 1:31 PM, Brock Noland br...@cloudera.com wrote:

 Hi Alan,

 Response is inline, below:

 On Tue, Feb 18, 2014 at 11:49 AM, Alan Gates ga...@hortonworks.com
 wrote:
  Gunther, is it the case that there is anything extra that needs to be
 done to ship Parquet code with Hive right now?  If I read the patch
 correctly the Parquet jars were added to the pom and thus will be shipped
 as part of Hive.  As long as it works out of the box when a user says
 create table ... stored as parquet why do we care whether the parquet jar
 is owned by Hive or another project?
 
  The concern about feature mismatch in Parquet versus Hive is valid, but
 I'm not sure what to do about it other than assure that there are good
 error messages.  Users will often want to use non-Hive based storage
 formats (Parquet, Avro, etc.).  This means we need a good way to detect at
 SQL compile time that the underlying storage doesn't support the indicated
 data type and throw a good error.

 Agreed, the error messages should absolutely be good. I will ensure
 this is the case via https://issues.apache.org/jira/browse/HIVE-6457

 
  Also, it's important to be clear going forward about what Hive as a
 project is signing up for.  If tomorrow someone decides to add a new
 datatype or feature we need to be clear that we expect the contributor to
 make this work for Hive owned formats (text, RC, sequence, ORC) but not
 necessarily for external formats

 This makes sense to me.

 I'd just like to add that I have a patch available to improve the
 hive-exec uber jar and general query speed:
 https://issues.apache.org/jira/browse/HIVE-860. Additionally I have a
 patch available to finish the generic STORED AS functionality:
 https://issues.apache.org/jira/browse/HIVE-5976

 Brock






-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


[jira] [Commented] (HIVE-6467) metastore upgrade script 016-HIVE-6386.derby.sql uses char rather than varchar

2014-02-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908369#comment-13908369
 ] 

Hive QA commented on HIVE-6467:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629932/HIVE-6467.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5141 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into6
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1433/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1433/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629932

 metastore upgrade script 016-HIVE-6386.derby.sql uses char rather than varchar
 --

 Key: HIVE-6467
 URL: https://issues.apache.org/jira/browse/HIVE-6467
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6467.1.patch


 Trying to tinker with the metastore upgrade scripts and did the following 
 steps on a brand new Derby DB:
 From derby:
 {noformat}
 run 'hive-schema-0.12.0.derby.sql';
 run 'upgrade-0.12.0-to-0.13.0.derby.sql';
 {noformat}
 From Hive:
 {noformat}
 show tables;
 {noformat}
 I then hit the following error below.  It appears that in the metastore DBS 
 table, the row for the "default" db was created with the value "ROLE  ", with 
 spaces at the end, where it was expecting "ROLE".
 {noformat}
 2014-02-19 14:49:19,824 ERROR metastore.RetryingHMSHandler 
 (RetryingHMSHandler.java:invoke(143)) - java.lang.IllegalArgumentException: 
 No enum const class org.apache.hadoop.hive.metastore.api.PrincipalType.ROLE   

   at java.lang.Enum.valueOf(Enum.java:196)
   at 
 org.apache.hadoop.hive.metastore.api.PrincipalType.valueOf(PrincipalType.java:14)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:521)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
   at com.sun.proxy.$Proxy7.getDatabase(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(HiveMetaStore.java:753)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
   at com.sun.proxy.$Proxy8.get_database(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:895)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
   at com.sun.proxy.$Proxy9.getDatabase(Unknown Source)
   at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1150)
   at 
 org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1139)
   at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2372)
   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
   at 

Precommit queue

2014-02-21 Thread Brock Noland
There was an EC2 spot price spike overnight which, combined with
everyone trying to get patches in for the branching, has resulted in a
massive queue:

http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/

~25 builds in the queue

Brock


[jira] [Updated] (HIVE-6467) metastore upgrade script 016-HIVE-6386.derby.sql uses char rather than varchar

2014-02-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6467:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Jason!
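
For reference, the failure described below comes down to Enum.valueOf() rejecting the 
space-padded value read back from a CHAR column; a minimal standalone illustration:

{code}
import org.apache.hadoop.hive.metastore.api.PrincipalType;

public class CharPaddingDemo {
  public static void main(String[] args) {
    System.out.println(PrincipalType.valueOf("ROLE"));  // works
    // A CHAR column pads the stored value with trailing spaces, so the value
    // read back is "ROLE   " and the lookup below throws
    // IllegalArgumentException; VARCHAR (as in the fix) avoids the padding.
    System.out.println(PrincipalType.valueOf("ROLE   "));
  }
}
{code}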

 metastore upgrade script 016-HIVE-6386.derby.sql uses char rather than varchar
 --

 Key: HIVE-6467
 URL: https://issues.apache.org/jira/browse/HIVE-6467
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 0.13.0

 Attachments: HIVE-6467.1.patch


 Trying to tinker with the metastore upgrade scripts and did the following 
 steps on a brand new Derby DB:
 From derby:
 {noformat}
 run 'hive-schema-0.12.0.derby.sql';
 run 'upgrade-0.12.0-to-0.13.0.derby.sql';
 {noformat}
 From Hive:
 {noformat}
 show tables;
 {noformat}
 I then hit the following error below.  It appears that in the metastore DBS 
 table, the row for the "default" db was created with the value "ROLE  ", with 
 spaces at the end, where it was expecting "ROLE".
 {noformat}
 2014-02-19 14:49:19,824 ERROR metastore.RetryingHMSHandler 
 (RetryingHMSHandler.java:invoke(143)) - java.lang.IllegalArgumentException: 
 No enum const class org.apache.hadoop.hive.metastore.api.PrincipalType.ROLE   

   at java.lang.Enum.valueOf(Enum.java:196)
   at 
 org.apache.hadoop.hive.metastore.api.PrincipalType.valueOf(PrincipalType.java:14)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:521)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
   at com.sun.proxy.$Proxy7.getDatabase(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(HiveMetaStore.java:753)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
   at com.sun.proxy.$Proxy8.get_database(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:895)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
   at com.sun.proxy.$Proxy9.getDatabase(Unknown Source)
   at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1150)
   at 
 org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1139)
   at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2372)
   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1566)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1339)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1010)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1000)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:687)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at 

[jira] [Commented] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried

2014-02-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908566#comment-13908566
 ] 

Hive QA commented on HIVE-6464:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629799/HIVE-6464.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5169 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into6
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1435/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1435/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629799

 Test configuration: reduce the duration for which lock attempts are retried
 ---

 Key: HIVE-6464
 URL: https://issues.apache.org/jira/browse/HIVE-6464
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6464.1.patch


 Lock attempts are retried with a 60-second wait, up to 100 times, before giving up. 
 Most tests attempt to disable locking but sometimes don't do it correctly, and 
 changes can cause the locking to kick in. Locking fails (at least in the HS2-related 
 tests) because of problems in creating the ZooKeeper entries in test 
 mode. When a locking attempt kicks in and fails, it can end up waiting for 
 6000 seconds before failing.
 As the tests are not trying to test parallel locking, there is no reason to 
 wait this long in the tests. 
 We should update the hive-site.xml used by tests to use a smaller duration.
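
 A minimal sketch of the kind of override intended for the test hive-site.xml, 
 expressed via HiveConf (assuming the standard lock retry properties; the values 
 are illustrative):
 {code}
 import org.apache.hadoop.hive.conf.HiveConf;

 public class TestLockConfig {
   public static void main(String[] args) {
     HiveConf conf = new HiveConf();
     // Equivalent to overriding these properties in the test hive-site.xml.
     conf.setInt("hive.lock.numretries", 5);             // default is 100
     conf.setInt("hive.lock.sleep.between.retries", 5);  // seconds; default is 60
     System.out.println(conf.get("hive.lock.numretries"));
   }
 }
 {code}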



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18230: HIVE-6429 MapJoinKey has large memory overhead in typical cases

2014-02-21 Thread Sergey Shelukhin


 On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 177
  https://reviews.apache.org/r/18230/diff/3/?file=499105#file499105line177
 
  this seems ugly to me. can't you delegate to the hashtable loader to 
  decide which class to use?

This is in the operator; the hashtable is already loaded, so there's no loader, and 
the key type is already decided.
We want to ensure that 
1) We use the same key type as that table.
2) We don't make the decision for each key separately, rechecking again and again... 
but to decide based on the previous key we need the previous key.


 On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 206
  https://reviews.apache.org/r/18230/diff/3/?file=499105#file499105line206
 
  the old code complies with the coding guidelines... can you change back?

there are coding guidelines? :)


 On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java, 
  line 104
  https://reviews.apache.org/r/18230/diff/3/?file=499107#file499107line104
 
  what causes the warning?

cast to List


 On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBase.java, 
  line 98
  https://reviews.apache.org/r/18230/diff/3/?file=499108#file499108line98
 
  looking for subclasses in the base is um not so nice. can't you avoid 
  this static stuff? just make read a member and override the appropriate 
  stuff. you're already passing in a ref object, so why not call read on that?

these methods actually create a class. Ref can be null (for the first key when 
we determine what to use), and it can be non-reusable (e.g.  when loading 
hashtable because it's the key from table).
Should static stuff be moved to a separate factory class?


 On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java,
   line 64
  https://reviews.apache.org/r/18230/diff/3/?file=499109#file499109line64
 
  this and the next field (vectorized) is going to be nigh impossible to 
  maintain. you really care about fixed length here, is that it? we should add 
  this to ObjectInspectorUtils or something like that. i think there's 
  already some code in hive that returns the size of a datatype to you.

These sizes are actually the ones the DataOutput implementation below writes. The 
thing I care about is that serialized keys are comparable.
Why would they be impossible to maintain? 


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18230/#review35132
---


On Feb. 20, 2014, 7:46 p.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18230/
 ---
 
 (Updated Feb. 20, 2014, 7:46 p.m.)
 
 
 Review request for hive, Gunther Hagleitner and Jitendra Pandey.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 See JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java 
 3cfaacf 
   ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 
 24f1229 
   ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 46e37c2 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java c0f4cd7 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
 61545b5 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
 2ac0928 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBase.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java 
 PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
  9ce0ae6 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
  83ba0f0 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 47f9d21 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java
  581046e 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
 2466a3b 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
  c541ad2 
   ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java 
 a103a51 
   ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java 
 2cb1ac3 
 
 Diff: https://reviews.apache.org/r/18230/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




Re: Review Request 18230: HIVE-6429 MapJoinKey has large memory overhead in typical cases

2014-02-21 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18230/
---

(Updated Feb. 21, 2014, 6:40 p.m.)


Review request for hive, Gunther Hagleitner and Jitendra Pandey.


Repository: hive-git


Description
---

See JIRA


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 24f1229 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 46e37c2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java c0f4cd7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
61545b5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
2ac0928 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyObject.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
 9ce0ae6 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 83ba0f0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 47f9d21 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java
 581046e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
2466a3b 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSMBMapJoinOperator.java 
997202f 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
 c541ad2 
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java 
a103a51 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
 61c5741 
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java 2cb1ac3 

Diff: https://reviews.apache.org/r/18230/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Updated] (HIVE-6429) MapJoinKey has large memory overhead in typical cases

2014-02-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6429:
---

Attachment: HIVE-6429.03.patch

 MapJoinKey has large memory overhead in typical cases
 -

 Key: HIVE-6429
 URL: https://issues.apache.org/jira/browse/HIVE-6429
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, 
 HIVE-6429.03.patch, HIVE-6429.WIP.patch, HIVE-6429.patch


 The only thing that MJK really needs is hashCode and equals (well, and 
 construction), so there's no need to have an array of writables in there. 
 Assuming all the keys for a table have the same structure, for the common 
 case where keys are primitive types, we can store something like a byte-array 
 combination of the keys to reduce the memory usage. It will probably speed up 
 compares too.
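
 A rough illustration of the idea (not the actual patch): a key backed by a single 
 byte array only needs cheap equals/hashCode over the serialized bytes:
 {code}
 import java.util.Arrays;

 // Illustrative sketch only: all key columns serialized into one byte[].
 final class ByteArrayJoinKey {
   private final byte[] key;

   ByteArrayJoinKey(byte[] key) {
     this.key = key;
   }

   @Override
   public int hashCode() {
     return Arrays.hashCode(key);
   }

   @Override
   public boolean equals(Object o) {
     return o instanceof ByteArrayJoinKey && Arrays.equals(key, ((ByteArrayJoinKey) o).key);
   }
 }
 {code}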



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Timeline for the Hive 0.13 release?

2014-02-21 Thread Thejas Nair
Can we wait a few more days for the branching? I have a few more
security fixes that I would like to get in, and we also have a long
pre-commit queue ahead right now. How about branching around Friday next
week? By then Hadoop 2.3 should also be out, as that vote has
concluded, and we can get HIVE-6037 in as well.
-Thejas



On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote:

 I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending
 tests.

 Brock

 On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote:
  HIVE-6037 is for generating hive-default.template file from HiveConf.
 Could
  it be included in this release? If it's not, I'll suspend further
 rebasing
  of it till next release (conflicts too frequently).
 
 
  2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com:
 
  I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time
 for
  the release.  It's a long and growing list, though, so no promises.
 
  Feel free to do your own documentation, or hand it off to a friendly
  in-house writer.
 
  -- Lefty, self-appointed Hive docs maven
 
 
 
  On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com
  wrote:
 
   Sounds good to me.
  
  
   On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani 
 hbut...@hortonworks.com
   wrote:
  
Hi,
   
Its mid feb. Wanted to check if the community is ready to cut a
 branch.
Could we cut the branch in a week , say 5pm PST 2/21/14?
The goal is to keep the release cycle short: couple of weeks; so
 after
   the
branch we go into stabilizing mode for hive 0.13, checking in only
blocker/critical bug fixes.
   
regards,
Harish.
   
   
On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com
 wrote:
   
 Hi,

 I agree that picking a date to branch and then restricting
 commits to
that
 branch would be a less time intensive plan for the RM.

 Brock


 On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani 
   hbut...@hortonworks.com
wrote:

 Yes agree it is time to start planning for the next release.
 I would like to volunteer to do the release management duties for
  this
 release(will be a great experience for me)
 Will be happy to do it, if the community is fine with this.

 regards,
 Harish.

 On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com
 
wrote:

 Yes, I think it is time to start planning for the next release.
 For 0.12 release I created a branch and then accepted patches
 that
 people asked to be included for sometime, before moving a phase
 of
 accepting only critical bug fixes. This turned out to be
 laborious.
 I think we should instead give everyone a few weeks to get any
   patches
 they are working on to be ready, cut the branch, and take in
 only
 critical bug fixes to the branch after that.
 How about cutting the branch around mid-February and targeting
 to
 release in a week or two after that.

 Thanks,
 Thejas


 On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org
 
wrote:
 I was wondering what people think about setting a tentative
 date
  for
the
 Hive 0.13 release? At an old Hive Contrib meeting we agreed
 that
   Hive
 should follow a time-based release model with new releases
 every
   four
 months. If we follow that schedule we're due for the next
 release
  in
 mid-February.

 Thoughts?

 Thanks.

 Carl





 --
 Apache MRUnit - Unit testing MapReduce - 

Re: Review Request 18291: Add support for pluggable authentication modules (PAM) in Hive

2014-02-21 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18291/
---

(Updated Feb. 21, 2014, 7:05 p.m.)


Review request for hive and Thejas Nair.


Bugs: HIVE-6466
https://issues.apache.org/jira/browse/HIVE-6466


Repository: hive-git


Description
---

Refer the jira: https://issues.apache.org/jira/browse/HIVE-6466


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
  pom.xml 9aef665 
  service/pom.xml b1002e2 
  
service/src/java/org/apache/hive/service/auth/AuthenticationProviderFactory.java
 b92fd83 
  
service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/18291/diff/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 18291: Add support for pluggable authentication modules (PAM) in Hive

2014-02-21 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18291/#review35180
---



service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java
https://reviews.apache.org/r/18291/#comment65606

Done. Thanks for taking a look.


- Vaibhav Gumashta


On Feb. 19, 2014, 11:44 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18291/
 ---
 
 (Updated Feb. 19, 2014, 11:44 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-6466
 https://issues.apache.org/jira/browse/HIVE-6466
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Refer the jira: https://issues.apache.org/jira/browse/HIVE-6466
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
   pom.xml 9aef665 
   service/pom.xml b1002e2 
   
 service/src/java/org/apache/hive/service/auth/AuthenticationProviderFactory.java
  b92fd83 
   
 service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18291/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Re: Timeline for the Hive 0.13 release?

2014-02-21 Thread Brock Noland
+1

On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote:
 Can we wait for some few more days for the branching ? I have a few more
 security fixes that I would like to get in, and we also have a long
 pre-commit queue ahead right now. How about branching around Friday next
 week ?  By then hadoop 2.3 should also be out as that vote has been
 concluded, and we can get HIVE-6037 in as well.
 -Thejas



 On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote:

 I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending
 tests.

 Brock

 On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote:
  HIVE-6037 is for generating hive-default.template file from HiveConf.
 Could
  it be included in this release? If it's not, I'll suspend further
 rebasing
  of it till next release (conflicts too frequently).
 
 
  2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com:
 
  I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time
 for
  the release.  It's a long and growing list, though, so no promises.
 
  Feel free to do your own documentation, or hand it off to a friendly
  in-house writer.
 
  -- Lefty, self-appointed Hive docs maven
 
 
 
  On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com
  wrote:
 
   Sounds good to me.
  
  
   On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani 
 hbut...@hortonworks.com
   wrote:
  
Hi,
   
Its mid feb. Wanted to check if the community is ready to cut a
 branch.
Could we cut the branch in a week , say 5pm PST 2/21/14?
The goal is to keep the release cycle short: couple of weeks; so
 after
   the
branch we go into stabilizing mode for hive 0.13, checking in only
blocker/critical bug fixes.
   
regards,
Harish.
   
   
On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com
 wrote:
   
 Hi,

 I agree that picking a date to branch and then restricting
 commits to
that
 branch would be a less time intensive plan for the RM.

 Brock


 On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani 
   hbut...@hortonworks.com
wrote:

 Yes agree it is time to start planning for the next release.
 I would like to volunteer to do the release management duties for
  this
 release(will be a great experience for me)
 Will be happy to do it, if the community is fine with this.

 regards,
 Harish.

 On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com
 
wrote:

 Yes, I think it is time to start planning for the next release.
 For 0.12 release I created a branch and then accepted patches
 that
 people asked to be included for sometime, before moving a phase
 of
 accepting only critical bug fixes. This turned out to be
 laborious.
 I think we should instead give everyone a few weeks to get any
   patches
 they are working on to be ready, cut the branch, and take in
 only
 critical bug fixes to the branch after that.
 How about cutting the branch around mid-February and targeting
 to
 release in a week or two after that.

 Thanks,
 Thejas


 On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org
 
wrote:
 I was wondering what people think about setting a tentative
 date
  for
the
 Hive 0.13 release? At an old Hive Contrib meeting we agreed
 that
   Hive
 should follow a time-based release model with new releases
 every
   four
 months. If we follow that schedule we're due for the next
 release
  in
 mid-February.

 Thoughts?

 Thanks.

 Carl


  

Re: Review Request 18291: Add support for pluggable authentication modules (PAM) in Hive

2014-02-21 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18291/
---

(Updated Feb. 21, 2014, 7:06 p.m.)


Review request for hive, Mohammad Islam and Thejas Nair.


Bugs: HIVE-6466
https://issues.apache.org/jira/browse/HIVE-6466


Repository: hive-git


Description
---

Refer the jira: https://issues.apache.org/jira/browse/HIVE-6466


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
  pom.xml 9aef665 
  service/pom.xml b1002e2 
  
service/src/java/org/apache/hive/service/auth/AuthenticationProviderFactory.java
 b92fd83 
  
service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/18291/diff/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Updated] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive

2014-02-21 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6466:
---

Attachment: HIVE-6466.2.patch

 Add support for pluggable authentication modules (PAM) in Hive
 --

 Key: HIVE-6466
 URL: https://issues.apache.org/jira/browse/HIVE-6466
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6466.1.patch, HIVE-6466.2.patch


 More on PAM in these articles:
 http://www.tuxradar.com/content/how-pam-works
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html
 Usage from JPAM api: http://jpam.sourceforge.net/JPamUserGuide.html#id.s7.1
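
 A minimal sketch of what authenticating through JPAM looks like, going by the user 
 guide above (the class and method names are taken from that guide and should be 
 treated as assumptions; the service name is illustrative):
 {code}
 import net.sf.jpam.Pam;

 public class PamCheck {
   public static void main(String[] args) {
     // "sshd" names the PAM service profile to authenticate against.
     Pam pam = new Pam("sshd");
     boolean ok = pam.authenticateSuccessful("someuser", "somepassword");
     System.out.println("PAM authentication " + (ok ? "succeeded" : "failed"));
   }
 }
 {code}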
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive

2014-02-21 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6466:
---

Status: Patch Available  (was: Open)

 Add support for pluggable authentication modules (PAM) in Hive
 --

 Key: HIVE-6466
 URL: https://issues.apache.org/jira/browse/HIVE-6466
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6466.1.patch, HIVE-6466.2.patch


 More on PAM in these articles:
 http://www.tuxradar.com/content/how-pam-works
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html
 Usage from JPAM api: http://jpam.sourceforge.net/JPamUserGuide.html#id.s7.1
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive

2014-02-21 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908694#comment-13908694
 ] 

Xuefu Zhang commented on HIVE-6466:
---

Thanks for the explanation, [~vaibhavgumashta].

 Add support for pluggable authentication modules (PAM) in Hive
 --

 Key: HIVE-6466
 URL: https://issues.apache.org/jira/browse/HIVE-6466
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6466.1.patch, HIVE-6466.2.patch


 More on PAM in these articles:
 http://www.tuxradar.com/content/how-pam-works
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html
 Usage from JPAM api: http://jpam.sourceforge.net/JPamUserGuide.html#id.s7.1
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18291: Add support for pluggable authentication modules (PAM) in Hive

2014-02-21 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18291/#review35183
---

Ship it!


Ship It!

- Thejas Nair


On Feb. 21, 2014, 7:06 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18291/
 ---
 
 (Updated Feb. 21, 2014, 7:06 p.m.)
 
 
 Review request for hive, Mohammad Islam and Thejas Nair.
 
 
 Bugs: HIVE-6466
 https://issues.apache.org/jira/browse/HIVE-6466
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Refer the jira: https://issues.apache.org/jira/browse/HIVE-6466
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
   pom.xml 9aef665 
   service/pom.xml b1002e2 
   
 service/src/java/org/apache/hive/service/auth/AuthenticationProviderFactory.java
  b92fd83 
   
 service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18291/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




[jira] [Commented] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive

2014-02-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908703#comment-13908703
 ] 

Thejas M Nair commented on HIVE-6466:
-

+1 
Please include documentation in release notes.


 Add support for pluggable authentication modules (PAM) in Hive
 --

 Key: HIVE-6466
 URL: https://issues.apache.org/jira/browse/HIVE-6466
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6466.1.patch, HIVE-6466.2.patch


 More on PAM in these articles:
 http://www.tuxradar.com/content/how-pam-works
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html
 Usage from JPAM api: http://jpam.sourceforge.net/JPamUserGuide.html#id.s7.1
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18230: HIVE-6429 MapJoinKey has large memory overhead in typical cases

2014-02-21 Thread Gunther Hagleitner


 On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java,
   line 64
  https://reviews.apache.org/r/18230/diff/3/?file=499109#file499109line64
 
  this and the next field (vectorized) is going to be nigh impossible to 
  maintain. you really care about fixed length here, is that it? we should add 
  this to ObjectInspectorUtils or something like that. i think there's 
  already some code in hive that returns the size of a datatype to you.
 
 Sergey Shelukhin wrote:
 These sizes are actually the ones the DataOutput implementation below writes. 
 The thing I care about is that serialized keys are comparable.
 Why would they be impossible to maintain?

A different thought: Have you considered using HiveKey/BinarySortableSerDe for 
this? Would this create more overhead? I think that SerDe uses vInts for the 
datatypes you support and the result is binary comparable. There might be some 
fixed overhead you don't want - but if we could reuse some of that code there 
wouldn't be a problem of maintaining this specific key stuff.
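
Roughly what I have in mind, as a sketch from memory of the serde2 setup (the exact 
details may differ):

import java.util.Arrays;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.BytesWritable;

public class BinarySortableKeySketch {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.setProperty("columns", "k1,k2");
    props.setProperty("columns.types", "int,string");
    BinarySortableSerDe serde = new BinarySortableSerDe();
    serde.initialize(new Configuration(), props);

    ObjectInspector rowOI = ObjectInspectorFactory.getStandardStructObjectInspector(
        Arrays.asList("k1", "k2"),
        Arrays.<ObjectInspector>asList(
            PrimitiveObjectInspectorFactory.javaIntObjectInspector,
            PrimitiveObjectInspectorFactory.javaStringObjectInspector));

    // The serialized form is binary comparable, so it could double as the key bytes.
    BytesWritable keyBytes = (BytesWritable) serde.serialize(Arrays.<Object>asList(1, "x"), rowOI);
    System.out.println(keyBytes.getLength() + " key bytes");
  }
}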

The sizes you have in the map look like the java datatype sizes, that's why I 
was suggesting using the Utils. Either way - if you could move that logic 
(partly) to the proper utils there's a better chance that someone 
adding/changing datatypes will catch it.


 On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 177
  https://reviews.apache.org/r/18230/diff/3/?file=499105#file499105line177
 
  this seems ugly to me. can't you delegate to the hashtable loader to 
  decide which class to use?
 
 Sergey Shelukhin wrote:
 This is in the operator; the hashtable is already loaded, so there's no 
 loader, and the key type is already decided.
 We want to ensure that 
 1) We use the same key type as that table.
 2) We don't make the decision for each key separately, rechecking again and 
 again... but to decide based on the previous key we need the previous key.

I don't understand the arguments. The hashtable loader is part of the operator, 
so you can still use it; you don't have to make the decision every time, since it's 
up to you how you implement that in the loader; and the loader knows what table 
it created, so it's a good place to enforce conformity, isn't it?


- Gunther


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18230/#review35132
---


On Feb. 21, 2014, 6:40 p.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18230/
 ---
 
 (Updated Feb. 21, 2014, 6:40 p.m.)
 
 
 Review request for hive, Gunther Hagleitner and Jitendra Pandey.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 See JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
   ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 
 24f1229 
   ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 46e37c2 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java c0f4cd7 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
 61545b5 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
 2ac0928 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java 
 PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyObject.java 
 PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
  9ce0ae6 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
  83ba0f0 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 47f9d21 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java
  581046e 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
 2466a3b 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSMBMapJoinOperator.java
  997202f 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
  c541ad2 
   ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java 
 a103a51 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
  61c5741 
   ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java 
 2cb1ac3 
 
 Diff: https://reviews.apache.org/r/18230/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Commented] (HIVE-5155) Support secure proxy user access to HiveServer2

2014-02-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908714#comment-13908714
 ] 

Thejas M Nair commented on HIVE-5155:
-

Prasad, it would be great to get this patch in for the 0.13 release.
I think just the issue of the proxy user config parameter needs to be addressed, i.e., 
having a specific config for HS2 proxy privileges so that the user does not 
have to be made an HDFS/MR-wide proxy user.


 Support secure proxy user access to HiveServer2
 ---

 Key: HIVE-5155
 URL: https://issues.apache.org/jira/browse/HIVE-5155
 Project: Hive
  Issue Type: Improvement
  Components: Authentication, HiveServer2, JDBC
Affects Versions: 0.12.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-5155-1-nothrift.patch, HIVE-5155-noThrift.2.patch, 
 HIVE-5155-noThrift.4.patch, HIVE-5155-noThrift.5.patch, 
 HIVE-5155-noThrift.6.patch, HIVE-5155.1.patch, HIVE-5155.2.patch, 
 HIVE-5155.3.patch, ProxyAuth.java, ProxyAuth.out, TestKERBEROS_Hive_JDBC.java


 HiveServer2 can authenticate a client via Kerberos and impersonate 
 the connecting user on the underlying secure Hadoop cluster. This becomes a gateway for 
 a remote client to access a secure Hadoop cluster. This works fine when 
 the client obtains a Kerberos ticket and directly connects to HiveServer2. 
 There's another big use case for middleware tools where the end user wants to 
 access Hive via another server, for example an Oozie action or Hue submitting 
 queries, or a BI tool server connecting to HiveServer2. In these cases, the 
 third-party server doesn't have the end user's Kerberos credentials and hence it 
 can't submit queries to HiveServer2 on behalf of the end user.
 This ticket is for enabling proxy access to HiveServer2 for third party tools 
 on behalf of end users. There are two parts of the solution proposed in this 
 ticket:
 1) Delegation token based connection for Oozie (OOZIE-1457)
 This is the common mechanism for Hadoop ecosystem components. Hive Remote 
 Metastore and HCatalog already support this. This is suitable for tool like 
 Oozie that submits the MR jobs as actions on behalf of its client. Oozie 
 already uses similar mechanism for Metastore/HCatalog access.
 2) Direct proxy access for privileged hadoop users
 The delegation token implementation can be a challenge for non-hadoop 
 (especially non-java) components. This second part enables a privileged user 
 to directly specify an alternate session user during the connection. If the 
 connecting user has hadoop level privilege to impersonate the requested 
 userid, then HiveServer2 will run the session as that requested user. For 
 example, user Hue is allowed to impersonate user Bob (via core-site.xml proxy 
 user configuration). Then user Hue can connect to HiveServer2 and specify Bob 
 as session user via a session property. HiveServer2 will verify Hue's proxy 
 user privilege and then impersonate user Bob instead of Hue. This will enable 
 any third-party tool to impersonate an alternate userid without having to 
 implement a delegation token connection.
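
 As a sketch of what the direct proxy access could look like from a JDBC client (the 
 session property name and connection details are assumptions for illustration; the 
 final names are up to the patch):
 {code}
 import java.sql.Connection;
 import java.sql.DriverManager;

 public class ProxyConnect {
   public static void main(String[] args) throws Exception {
     Class.forName("org.apache.hive.jdbc.HiveDriver");
     // "hue" authenticates with its own Kerberos ticket but asks HiveServer2 to
     // run the session as "bob"; HS2 checks hue's proxy-user privilege first.
     Connection conn = DriverManager.getConnection(
         "jdbc:hive2://hs2host:10000/default;principal=hive/_HOST@EXAMPLE.COM;"
         + "hive.server2.proxy.user=bob");
     conn.close();
   }
 }
 {code}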



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5176) Wincompat : Changes for allowing various path compatibilities with Windows

2014-02-21 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908737#comment-13908737
 ] 

Jason Dere commented on HIVE-5176:
--

Taking a closer look at this patch, it's a mix of several patches done for 
Windows work.  I'm going to try to split this into smaller patches, one per 
specific change.

 Wincompat : Changes for allowing various path compatibilities with Windows
 --

 Key: HIVE-5176
 URL: https://issues.apache.org/jira/browse/HIVE-5176
 Project: Hive
  Issue Type: Sub-task
  Components: Windows
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-5176.2.patch, HIVE-5176.patch


 We need to make certain changes across the board to allow us to read/parse 
 Windows paths. Some are escaping changes, some are about being strict in how we 
 read paths (through URL.encode/decode, etc.).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6480) Metastore server startup script ignores ENV settings

2014-02-21 Thread Adam Faris (JIRA)
Adam Faris created HIVE-6480:


 Summary: Metastore server startup script ignores ENV settings
 Key: HIVE-6480
 URL: https://issues.apache.org/jira/browse/HIVE-6480
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Adam Faris
Priority: Minor


This is a minor issue with hcat_server.sh.  Currently the startup script has 
HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the 
script.  As hcat_server.sh reads hcat-env.sh, it makes sense to allow an 
administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location like 
/etc/profile).

Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh.  If 
METASTORE_PORT is missing, the metastore server fails to start.

I will attach a patch in my next update, once this jira is opened.





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6480) Metastore server startup script ignores ENV settings

2014-02-21 Thread Adam Faris (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Faris updated HIVE-6480:
-

Attachment: HIVE-6480.01.patch

Attaching 'git diff' output against trunk.

 Metastore server startup script ignores ENV settings
 

 Key: HIVE-6480
 URL: https://issues.apache.org/jira/browse/HIVE-6480
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Adam Faris
Priority: Minor
 Attachments: HIVE-6480.01.patch


 This is a minor issue with hcat_server.sh.  Currently the startup script has 
 HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the 
 script.  As hcat_server.sh reads hcat-env.sh, it makes sense to allow an 
 administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location 
 like /etc/profile).
 Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh.  
 If METASTORE_PORT is missing, the metastore server fails to start.
 I will attach a patch in my next update, once this jira is opened.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6480) Metastore server startup script ignores ENV settings

2014-02-21 Thread Adam Faris (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Faris updated HIVE-6480:
-

Status: Patch Available  (was: Open)

 Metastore server startup script ignores ENV settings
 

 Key: HIVE-6480
 URL: https://issues.apache.org/jira/browse/HIVE-6480
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Adam Faris
Priority: Minor
 Attachments: HIVE-6480.01.patch


 This is a minor issue with hcat_server.sh.  Currently the startup script has 
 HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the 
 script.  As hcat_server.sh reads hcat-env.sh, it makes sense to allow an 
 administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location 
 like /etc/profile).
 Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh.  
 If METASTORE_PORT is missing, the metastore server fails to start.
 I will attach a patch in my next update, once this jira is opened.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Timeline for the Hive 0.13 release?

2014-02-21 Thread Harish Butani
Yes, that makes sense.
How about we postpone the branching until 10am PST March 3rd, which is the 
following Monday?
I don't see the point of setting the branch time to a Friday evening.
Do people agree?

regards,
Harish.

On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote:

 +1
 
 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote:
 Can we wait for some few more days for the branching ? I have a few more
 security fixes that I would like to get in, and we also have a long
 pre-commit queue ahead right now. How about branching around Friday next
 week ?  By then hadoop 2.3 should also be out as that vote has been
 concluded, and we can get HIVE-6037 in as well.
 -Thejas
 
 
 
 On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote:
 
 I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending
 tests.
 
 Brock
 
 On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote:
 HIVE-6037 is for generating hive-default.template file from HiveConf.
 Could
 it be included in this release? If it's not, I'll suspend further
 rebasing
 of it till next release (conflicts too frequently).
 
 
 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com:
 
 I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time
 for
 the release.  It's a long and growing list, though, so no promises.
 
 Feel free to do your own documentation, or hand it off to a friendly
 in-house writer.
 
 -- Lefty, self-appointed Hive docs maven
 
 
 
 On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com
 wrote:
 
 Sounds good to me.
 
 
 On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani 
 hbut...@hortonworks.com
 wrote:
 
 Hi,
 
 Its mid feb. Wanted to check if the community is ready to cut a
 branch.
 Could we cut the branch in a week , say 5pm PST 2/21/14?
 The goal is to keep the release cycle short: couple of weeks; so
 after
 the
 branch we go into stabilizing mode for hive 0.13, checking in only
 blocker/critical bug fixes.
 
 regards,
 Harish.
 
 
 On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com
 wrote:
 
 Hi,
 
 I agree that picking a date to branch and then restricting
 commits to
 that
 branch would be a less time intensive plan for the RM.
 
 Brock
 
 
 On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani 
 hbut...@hortonworks.com
 wrote:
 
 Yes agree it is time to start planning for the next release.
 I would like to volunteer to do the release management duties for
 this
 release(will be a great experience for me)
 Will be happy to do it, if the community is fine with this.
 
 regards,
 Harish.
 
 On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com
 
 wrote:
 
 Yes, I think it is time to start planning for the next release.
 For 0.12 release I created a branch and then accepted patches
 that
 people asked to be included for sometime, before moving a phase
 of
 accepting only critical bug fixes. This turned out to be
 laborious.
 I think we should instead give everyone a few weeks to get any
 patches
 they are working on to be ready, cut the branch, and take in
 only
 critical bug fixes to the branch after that.
 How about cutting the branch around mid-February and targeting
 to
 release in a week or two after that.
 
 Thanks,
 Thejas
 
 
 On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org
 
 wrote:
 I was wondering what people think about setting a tentative
 date
 for
 the
 Hive 0.13 release? At an old Hive Contrib meeting we agreed
 that
 Hive
 should follow a time-based release model with new releases
 every
 four
 months. If we follow that schedule we're due for the next
 release
 in
 mid-February.
 
 Thoughts?
 
 Thanks.
 
 Carl
 
 
 
 
 
 --
 Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
 
 
 --
 

Re: Timeline for the Hive 0.13 release?

2014-02-21 Thread Brock Noland
Might as well make it March 4th or 5th. Otherwise folks will burn
weekend time to get patches in.

On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com wrote:
 Yes makes sense.
 How about we postpone the branching until 10am PST March 3rd, which is the 
 following Monday.
 Don’t see a point of setting the branch time to a Friday evening.
 Do people agree?

 regards,
 Harish.

 On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote:

 +1

 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote:
 Can we wait for some few more days for the branching ? I have a few more
 security fixes that I would like to get in, and we also have a long
 pre-commit queue ahead right now. How about branching around Friday next
 week ?  By then hadoop 2.3 should also be out as that vote has been
 concluded, and we can get HIVE-6037 in as well.
 -Thejas



 On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote:

 I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending
 tests.

 Brock

 On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote:
 HIVE-6037 is for generating hive-default.template file from HiveConf.
 Could
 it be included in this release? If it's not, I'll suspend further
 rebasing
 of it till next release (conflicts too frequently).


 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com:

 I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time
 for
 the release.  It's a long and growing list, though, so no promises.

 Feel free to do your own documentation, or hand it off to a friendly
 in-house writer.

 -- Lefty, self-appointed Hive docs maven



 On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com
 wrote:

 Sounds good to me.


 On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani 
 hbut...@hortonworks.com
 wrote:

 Hi,

 Its mid feb. Wanted to check if the community is ready to cut a
 branch.
 Could we cut the branch in a week , say 5pm PST 2/21/14?
 The goal is to keep the release cycle short: couple of weeks; so
 after
 the
 branch we go into stabilizing mode for hive 0.13, checking in only
 blocker/critical bug fixes.

 regards,
 Harish.


 On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com
 wrote:

 Hi,

 I agree that picking a date to branch and then restricting
 commits to
 that
 branch would be a less time intensive plan for the RM.

 Brock


 On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani 
 hbut...@hortonworks.com
 wrote:

 Yes agree it is time to start planning for the next release.
 I would like to volunteer to do the release management duties for
 this
 release(will be a great experience for me)
 Will be happy to do it, if the community is fine with this.

 regards,
 Harish.

 On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com

 wrote:

 Yes, I think it is time to start planning for the next release.
 For 0.12 release I created a branch and then accepted patches
 that
 people asked to be included for sometime, before moving a phase
 of
 accepting only critical bug fixes. This turned out to be
 laborious.
 I think we should instead give everyone a few weeks to get any
 patches
 they are working on to be ready, cut the branch, and take in
 only
 critical bug fixes to the branch after that.
 How about cutting the branch around mid-February and targeting
 to
 release in a week or two after that.

 Thanks,
 Thejas


 On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org

 wrote:
 I was wondering what people think about setting a tentative
 date
 for
 the
 Hive 0.13 release? At an old Hive Contrib meeting we agreed
 that
 Hive
 should follow a time-based release model with new releases
 every
 four
 months. If we follow that schedule we're due for the next
 release
 in
 mid-February.

 Thoughts?

 Thanks.

 Carl

 

[jira] [Commented] (HIVE-6479) Few .q.out files need to be updated post HIVE-5958

2014-02-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908800#comment-13908800
 ] 

Thejas M Nair commented on HIVE-6479:
-

+1. I don't think we need to wait for the full unit test suite to kick in or 
for 24 hours for this one, as it just updates 2 .q files. I will commit it after 
verifying that these two tests pass with this change.


 Few .q.out files need to be updated post HIVE-5958
 --

 Key: HIVE-6479
 URL: https://issues.apache.org/jira/browse/HIVE-6479
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6479.patch


 See my comment 
 https://issues.apache.org/jira/browse/HIVE-6433?focusedCommentId=13907782page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13907782



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization

2014-02-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908801#comment-13908801
 ] 

Hive QA commented on HIVE-6455:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630239/HIVE-6455.6.patch

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 5170 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into6
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input7
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input9
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample3
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample7
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1436/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1436/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630239

 Scalable dynamic partitioning and bucketing optimization
 

 Key: HIVE-6455
 URL: https://issues.apache.org/jira/browse/HIVE-6455
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: optimization
 Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.2.patch, 
 HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, 
 HIVE-6455.6.patch


 The current implementation of dynamic partitioning works by keeping at least 
 one record writer open per dynamic partition directory. In the case of 
 bucketing there can be multi-spray file writers, which further adds to the 
 number of open record writers. The record writers of column-oriented file 
 formats (like ORC, RCFile, etc.) keep in-memory buffers (value buffers or 
 compression buffers) open all the time to buffer up the rows and compress 
 them before flushing them to disk. Since these buffers are maintained on a 
 per-column basis, the amount of memory required at runtime grows as the 
 number of partitions and the number of columns per partition increase. This 
 often leads to OutOfMemory (OOM) exceptions in mappers or reducers, depending 
 on the number of open record writers. Users often tune the JVM heap size 
 (runtime memory) to get past such OOM issues. 
 With this optimization, the dynamic partition columns and bucketing columns 
 (in the case of bucketed tables) are sorted before being fed to the reducers. 
 Since the partitioning and bucketing columns are sorted, each reducer can 
 keep only one record writer open at any time, thereby reducing the memory 
 pressure on the reducers. This optimization scales as the number of 
 partitions and the number of columns per partition increase, at the cost of 
 sorting on those columns.
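 To make the memory argument concrete, here is a minimal, self-contained Java
 sketch of the reducer-side pattern the description implies. It is not the
 actual Hive implementation; the RecordWriter interface and the partition-key
 extraction are hypothetical stand-ins. Because rows arrive sorted by the
 partition/bucket key, the previous writer can be closed before the next one
 is opened, so at most one writer's buffers are live at a time.
{code}
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;

// Hypothetical stand-in for illustration only; Hive's real writer classes differ.
interface RecordWriter extends Closeable {
  void write(String row) throws IOException;
}

public class SortedDynamicPartitionSketch {
  // Assumption: rows arrive sorted by partition key, e.g. "2014-02-21\u0001rowdata".
  static String partitionKeyOf(String row) {
    return row.substring(0, row.indexOf('\u0001'));
  }

  static RecordWriter openWriterFor(String partitionKey) {
    // Would create e.g. an ORC writer for the partition's directory.
    return new RecordWriter() {
      public void write(String row) { /* buffer + compress */ }
      public void close() { /* flush buffers to disk */ }
    };
  }

  public static void reduce(Iterator<String> sortedRows) throws IOException {
    String currentKey = null;
    RecordWriter currentWriter = null;   // at most one open writer at a time
    while (sortedRows.hasNext()) {
      String row = sortedRows.next();
      String key = partitionKeyOf(row);
      if (!key.equals(currentKey)) {
        if (currentWriter != null) {
          currentWriter.close();         // previous partition is complete
        }
        currentWriter = openWriterFor(key);
        currentKey = key;
      }
      currentWriter.write(row);
    }
    if (currentWriter != null) {
      currentWriter.close();
    }
  }
}
{code}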



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6480) Metastore server startup script ignores ENV settings

2014-02-21 Thread Adam Faris (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908807#comment-13908807
 ] 

Adam Faris commented on HIVE-6480:
--

Reviewboard link https://reviews.apache.org/r/18373/

 Metastore server startup script ignores ENV settings
 

 Key: HIVE-6480
 URL: https://issues.apache.org/jira/browse/HIVE-6480
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Adam Faris
Priority: Minor
 Attachments: HIVE-6480.01.patch


 This is a minor issue with hcat_server.sh.  Currently the startup script has 
 HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the 
 script.  As hcat_server.sh reads hcat-env.sh, it makes sense to allow an 
 administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location 
 like /etc/profile).
 Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh.  
 If METASTORE_PORT is missing, the metastore server fails to start.
 I will attach a patch in my next update, once this jira is opened.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (HIVE-6479) Few .q.out files need to be updated post HIVE-5958

2014-02-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908806#comment-13908806
 ] 

Thejas M Nair edited comment on HIVE-6479 at 2/21/14 8:59 PM:
--

Verified that the 2 tests pass with this .q.out file update. I am planning to 
commit this in another 1/2 hour. It will get rid of false alarms for the 
pending precommit tests.



was (Author: thejas):
Verified that the 2 tests pass with this .q.out file update . I am planning to 
commit this in another 1/2 hour.


 Few .q.out files need to be updated post HIVE-5958
 --

 Key: HIVE-6479
 URL: https://issues.apache.org/jira/browse/HIVE-6479
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6479.patch


 See my comment 
 https://issues.apache.org/jira/browse/HIVE-6433?focusedCommentId=13907782page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13907782



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6479) Few .q.out files need to be updated post HIVE-5958

2014-02-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908806#comment-13908806
 ] 

Thejas M Nair commented on HIVE-6479:
-

Verified that the 2 tests pass with this .q.out file update . I am planning to 
commit this in another 1/2 hour.


 Few .q.out files need to be updated post HIVE-5958
 --

 Key: HIVE-6479
 URL: https://issues.apache.org/jira/browse/HIVE-6479
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6479.patch


 See my comment 
 https://issues.apache.org/jira/browse/HIVE-6433?focusedCommentId=13907782page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13907782



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried

2014-02-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6464:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

The two test failures are unrelated. See HIVE-6479 .

Patch committed to trunk. Thanks for the review [~navis]!


 Test configuration: reduce the duration for which lock attempts are retried
 ---

 Key: HIVE-6464
 URL: https://issues.apache.org/jira/browse/HIVE-6464
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6464.1.patch


 Lock attempts are retried with a 60-second wait, up to 100 times (6000 
 seconds in total), before giving up. Most tests attempt to disable locking 
 but sometimes don't do it correctly, and changes can cause the locking to 
 kick in. Locking fails (at least in the HS2-related tests) because of 
 problems creating the ZooKeeper entries in test mode. When the locking 
 attempt kicks in and fails, a test can end up waiting for 6000 seconds before 
 failing.
 As the tests are not trying to test parallel locking, there is no reason to 
 wait this long in the tests. 
 We should update the hive-site.xml used by tests to use a smaller duration.
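 For illustration, a small Java sketch of the retry budget involved. This is a
 toy model, not Hive's lock manager code, and the attempt/sleep numbers simply
 mirror the defaults described above (100 retries x 60 seconds) versus a
 test-friendly setting.
{code}
public class LockRetryBudget {
  interface LockAttempt { boolean tryLock(); }

  static boolean acquireWithRetries(LockAttempt attempt, int maxRetries, long sleepMillis)
      throws InterruptedException {
    for (int i = 0; i < maxRetries; i++) {
      if (attempt.tryLock()) {
        return true;
      }
      Thread.sleep(sleepMillis);          // worst case: maxRetries * sleepMillis
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    // Production-like budget: 100 * 60_000 ms = 6000 s (~100 minutes).
    System.out.println("default worst case (s): " + (100 * 60_000L) / 1000);
    // A test-friendly budget, e.g. 5 * 1000 ms = 5 s, fails fast when the
    // ZooKeeper entries cannot be created in test mode.
    boolean locked = acquireWithRetries(() -> false, 5, 1000);
    System.out.println("acquired: " + locked);
  }
}
{code}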



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried

2014-02-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6464:


Fix Version/s: 0.13.0

 Test configuration: reduce the duration for which lock attempts are retried
 ---

 Key: HIVE-6464
 URL: https://issues.apache.org/jira/browse/HIVE-6464
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.13.0

 Attachments: HIVE-6464.1.patch


 Lock attempts are being done for 60 seconds * 100 before it gives up. Most 
 tests attempt to disable locking but sometimes don't do it correctly and 
 changes can cause the locking to kick in. Locking fails, (at least in the HS2 
 related tests) because of problems in creating the zookeeper entries in test 
 mode. When locking attempt kicks in and that fails, it can end up waiting for 
 6000 seconds before failing.
 As the tests are not trying to test parallel locking, there is no reason to 
 wait this long in the tests. 
 We should update hive-site.xml used by tests for smaller duration.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Timeline for the Hive 0.13 release?

2014-02-21 Thread Harish Butani
Ok,let’s set it for March 4th .

regards,
Harish.

On Feb 21, 2014, at 12:14 PM, Brock Noland br...@cloudera.com wrote:

 Might as well make it March 4th or 5th. Otherwise folks will burn
 weekend time to get patches in.
 
 On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com 
 wrote:
 Yes makes sense.
 How about we postpone the branching until 10am PST March 3rd, which is the 
 following Monday.
 Don’t see a point of setting the branch time to a Friday evening.
 Do people agree?
 
 regards,
 Harish.
 
 On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote:
 
 +1
 
 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote:
 Can we wait for some few more days for the branching ? I have a few more
 security fixes that I would like to get in, and we also have a long
 pre-commit queue ahead right now. How about branching around Friday next
 week ?  By then hadoop 2.3 should also be out as that vote has been
 concluded, and we can get HIVE-6037 in as well.
 -Thejas
 
 
 
 On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote:
 
 I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending
 tests.
 
 Brock
 
 On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote:
 HIVE-6037 is for generating hive-default.template file from HiveConf.
 Could
 it be included in this release? If it's not, I'll suspend further
 rebasing
 of it till next release (conflicts too frequently).
 
 
 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com:
 
 I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time
 for
 the release.  It's a long and growing list, though, so no promises.
 
 Feel free to do your own documentation, or hand it off to a friendly
 in-house writer.
 
 -- Lefty, self-appointed Hive docs maven
 
 
 
 On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com
 wrote:
 
 Sounds good to me.
 
 
 On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani 
 hbut...@hortonworks.com
 wrote:
 
 Hi,
 
 Its mid feb. Wanted to check if the community is ready to cut a
 branch.
 Could we cut the branch in a week , say 5pm PST 2/21/14?
 The goal is to keep the release cycle short: couple of weeks; so
 after
 the
 branch we go into stabilizing mode for hive 0.13, checking in only
 blocker/critical bug fixes.
 
 regards,
 Harish.
 
 
 On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com
 wrote:
 
 Hi,
 
 I agree that picking a date to branch and then restricting
 commits to
 that
 branch would be a less time intensive plan for the RM.
 
 Brock
 
 
 On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani 
 hbut...@hortonworks.com
 wrote:
 
 Yes agree it is time to start planning for the next release.
 I would like to volunteer to do the release management duties for
 this
 release(will be a great experience for me)
 Will be happy to do it, if the community is fine with this.
 
 regards,
 Harish.
 
 On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com
 
 wrote:
 
 Yes, I think it is time to start planning for the next release.
 For 0.12 release I created a branch and then accepted patches
 that
 people asked to be included for sometime, before moving a phase
 of
 accepting only critical bug fixes. This turned out to be
 laborious.
 I think we should instead give everyone a few weeks to get any
 patches
 they are working on to be ready, cut the branch, and take in
 only
 critical bug fixes to the branch after that.
 How about cutting the branch around mid-February and targeting
 to
 release in a week or two after that.
 
 Thanks,
 Thejas
 
 
 On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org
 
 wrote:
 I was wondering what people think about setting a tentative
 date
 for
 the
 Hive 0.13 release? At an old Hive Contrib meeting we agreed
 that
 Hive
 should follow a time-based release model with new releases
 every
 four
 months. If we follow that schedule we're due for the next
 release
 in
 mid-February.
 
 Thoughts?
 
 Thanks.
 
 Carl
 

[jira] [Created] (HIVE-6481) Add .reviewboardrc file

2014-02-21 Thread Carl Steinbach (JIRA)
Carl Steinbach created HIVE-6481:


 Summary: Add .reviewboardrc file
 Key: HIVE-6481
 URL: https://issues.apache.org/jira/browse/HIVE-6481
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach


We should add a .reviewboardrc file to trunk in order to streamline the review 
process.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6482) Fix NOTICE file: pre release task

2014-02-21 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6482:


Attachment: HIVE-6482.1.patch

 Fix NOTICE file: pre release task
 -

 Key: HIVE-6482
 URL: https://issues.apache.org/jira/browse/HIVE-6482
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
Priority: Trivial
 Attachments: HIVE-6482.1.patch


 As per steps in Release doc: 
 https://cwiki.apache.org/confluence/display/Hive/HowToRelease
 Removed projects with Apache license as per [~thejas] suggestion.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6482) Fix NOTICE file: pre release task

2014-02-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908884#comment-13908884
 ] 

Thejas M Nair commented on HIVE-6482:
-

All these libraries except for jersey are under MIT/BSD licenses or derivatives. 
The only one I am not sure of is the jersey library license (CDDL); it's too 
long to read. Might as well put it in the NOTICE section to be safe.

An interesting note: the JSON license also adds that The Software shall be used 
for Good, not Evil !! :)


 Fix NOTICE file: pre release task
 -

 Key: HIVE-6482
 URL: https://issues.apache.org/jira/browse/HIVE-6482
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
Priority: Trivial
 Attachments: HIVE-6482.1.patch


 As per steps in Release doc: 
 https://cwiki.apache.org/confluence/display/Hive/HowToRelease
 Removed projects with Apache license as per [~thejas] suggestion.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6482) Fix NOTICE file: pre release task

2014-02-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908887#comment-13908887
 ] 

Thejas M Nair commented on HIVE-6482:
-

To clarify, we can remove all the notices from the NOTICE file except maybe 
the one for jersey. If someone can verify that it also does not require such 
an attribution, we can remove that from the NOTICE file as well.

cc [~cwsteinbach]

 Fix NOTICE file: pre release task
 -

 Key: HIVE-6482
 URL: https://issues.apache.org/jira/browse/HIVE-6482
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
Priority: Trivial
 Attachments: HIVE-6482.1.patch


 As per steps in Release doc: 
 https://cwiki.apache.org/confluence/display/Hive/HowToRelease
 Removed projects with Apache license as per [~thejas] suggestion.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6468) HS2 out of memory error when curl sends a get request

2014-02-21 Thread Abin Shahab (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abin Shahab updated HIVE-6468:
--

Description: 
We see an out of memory error when we run simple beeline calls.
(The hive.server2.transport.mode is binary)

curl localhost:1

Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap 
space
at 
org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
at 
org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
at 
org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at 
org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

  was:
We see an out of memory error when we run simple beeline calls.
(The hive.server2.transport.mode is binary)
beeline -u jdbc:hive2://localhost:1 -n user1 -d 
org.apache.hive.jdbc.HiveDriver -e create table test1 (id) int;

Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap 
space
at 
org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
at 
org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
at 
org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at 
org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

Summary: HS2 out of memory error when curl sends a get request  (was: 
HS2 out of memory error with Beeline)

 HS2 out of memory error when curl sends a get request
 -

 Key: HIVE-6468
 URL: https://issues.apache.org/jira/browse/HIVE-6468
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
 Environment: Centos 6.3, hive 12, hadoop-2.2
Reporter: Abin Shahab

 We see an out of memory error when we run simple beeline calls.
 (The hive.server2.transport.mode is binary)
 curl localhost:1
 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap 
 space
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
   at 
 org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
   at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
   at 
 org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
   at 
 org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
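A plausible explanation (an assumption on my part, not confirmed in this
thread) is that the SASL transport interprets the first bytes of the plain
HTTP request as a frame length and tries to allocate a buffer of that size.
The standalone Java sketch below is not Thrift's actual code; it only shows
how four ASCII bytes from a request line decode to a gigabyte-sized length.
{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch only: models how a transport that expects
// [1-byte status][4-byte payload length] would misread a plain HTTP request.
public class BogusFrameLengthDemo {
  public static void main(String[] args) {
    byte[] request = "GET / HTTP/1.1\r\n".getBytes(StandardCharsets.US_ASCII);
    // The first byte ('G') would be taken as the status; the next four bytes
    // ("ET /") would be taken as a big-endian payload length.
    int bogusLength = ByteBuffer.wrap(request, 1, 4).getInt();
    System.out.println("requested allocation: " + bogusLength + " bytes"); // ~1.16e9
    // Allocating new byte[bogusLength] at this point is what would trigger the
    // java.lang.OutOfMemoryError seen in the stack trace above.
  }
}
{code}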



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6461) Run Release Audit tool, fix missing license issues

2014-02-21 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6461:


   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk.
Thanks Harish!


 Run Release Audit tool, fix missing license issues
 --

 Key: HIVE-6461
 URL: https://issues.apache.org/jira/browse/HIVE-6461
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
Priority: Trivial
 Fix For: 0.13.0

 Attachments: HIVE-6461.1.patch


 run mvn apache-rat:check and add apache license in flagged files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-3635) allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type

2014-02-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-3635:
--

Status: Patch Available  (was: Open)

  allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for 
 the boolean hive type
 ---

 Key: HIVE-3635
 URL: https://issues.apache.org/jira/browse/HIVE-3635
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.9.0
Reporter: Alexander Alten-Lorenz
Assignee: Xuefu Zhang
 Attachments: HIVE-3635.1.patch, HIVE-3635.patch


 interpret t as true and f as false for boolean types. PostgreSQL exports 
 represent it that way.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-3635) allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type

2014-02-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-3635:
--

Attachment: HIVE-3635.1.patch

Patch #1 is based on #0, but provides configuration parameter and test.

  allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for 
 the boolean hive type
 ---

 Key: HIVE-3635
 URL: https://issues.apache.org/jira/browse/HIVE-3635
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.9.0
Reporter: Alexander Alten-Lorenz
Assignee: Xuefu Zhang
 Attachments: HIVE-3635.1.patch, HIVE-3635.patch


 interpret t as true and f as false for boolean types. PostgreSQL exports 
 represent it that way.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6483) Hive on Tez - Hive should create different payloads for inputs and outputs

2014-02-21 Thread Bikas Saha (JIRA)
Bikas Saha created HIVE-6483:


 Summary: Hive on Tez - Hive should create different payloads for 
inputs and outputs
 Key: HIVE-6483
 URL: https://issues.apache.org/jira/browse/HIVE-6483
 Project: Hive
  Issue Type: Bug
Reporter: Bikas Saha


Currently, Hive creates a single vertex payload that is implicitly shared with 
Inputs and Outputs. This creates confusion in the Tez API and configuration. 
Tracked by TEZ-696 and TEZ-872 which are blocked by this jira.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18377: HIVE-6481. Add .reviewboardrc file

2014-02-21 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18377/#review35201
---


Looks good. Minor question: do we need to add an Apache license header to this 
new file?

- Xuefu Zhang


On Feb. 21, 2014, 9:36 p.m., Carl Steinbach wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18377/
 ---
 
 (Updated Feb. 21, 2014, 9:36 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Bugs: HIVE-6481
 https://issues.apache.org/jira/browse/HIVE-6481
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-6481. Add .reviewboardrc file
 
 
 Diffs
 -
 
   .reviewboardrc PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18377/diff/
 
 
 Testing
 ---
 
 I was able to post this review request with the command rbt post after 
 committing my changes locally.
 
 
 Thanks,
 
 Carl Steinbach
 




[jira] [Updated] (HIVE-5950) ORC SARG creation fails with NPE for predicate conditions with decimal/date/char/varchar datatypes

2014-02-21 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-5950:
-

Attachment: HIVE-5950.4.patch

Refreshed the patch to trunk.

 ORC SARG creation fails with NPE for predicate conditions with 
 decimal/date/char/varchar datatypes
 --

 Key: HIVE-5950
 URL: https://issues.apache.org/jira/browse/HIVE-5950
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-5950.1.patch, HIVE-5950.2.patch, HIVE-5950.3.patch, 
 HIVE-5950.4.patch


 When a decimal or date column is used, the type field in PredicateLeafImpl 
 will be set to null. This results in an NPE during predicate leaf generation 
 because of a null dereference in the hashCode computation. SARG creation 
 should be extended to handle decimal and date data types.
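 A hedged sketch of the kind of null dereference described, plus the usual
 null-safe fix. The class, enum, and field names mirror the description but
 are simplified placeholders, not the actual Hive sources.
{code}
import java.util.Objects;

// Simplified model of a predicate leaf whose 'type' can be null for
// decimal/date/char/varchar columns that SARG creation does not yet map.
public class PredicateLeafSketch {
  enum Type { INTEGER, STRING, FLOAT }   // no DECIMAL/DATE entry -> null type

  private final Type type;               // may be null
  private final String columnName;

  PredicateLeafSketch(Type type, String columnName) {
    this.type = type;
    this.columnName = columnName;
  }

  @Override
  public int hashCode() {
    // Buggy version: type.hashCode() throws an NPE when type == null.
    // return type.hashCode() * 31 + columnName.hashCode();
    return Objects.hashCode(type) * 31 + Objects.hashCode(columnName); // null-safe
  }
}
{code}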



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18185: Support Kerberos HTTP authentication for HiveServer2 running in http mode

2014-02-21 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18185/#review35196
---



jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
https://reviews.apache.org/r/18185/#comment65649

I believe using the hadoop classes here will also require hadoop-common*.jar 
to be copied to the JDBC client machine. We should move the use of these classes 
into a separate class that gets loaded only when JDBC + Kerberos is used.
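
For what it's worth, a minimal sketch of the isolation being suggested; the
class and method names here are hypothetical, not the actual patch. The idea
is that only the helper class references Hadoop types, so hadoop-common is
needed on the client classpath only if that class is ever loaded.
{code}
// Hypothetical names for illustration; the real patch may structure this differently.
final class KerberosHttpSupport {
  // Only this class would reference org.apache.hadoop.* types, so hadoop-common
  // is needed on the client classpath only if this class is ever loaded.
  static String buildAuthHeader(String principal, String host) {
    // ... would use Hadoop/JAAS security classes here ...
    return "Negotiate <token>";
  }
}

class ConnectionSketch {
  void open(boolean useKerberos) {
    if (useKerberos) {
      // KerberosHttpSupport is loaded lazily, only on this branch.
      String header = KerberosHttpSupport.buildAuthHeader("hive/_HOST", "gateway");
      // ... attach header to the HTTP request ...
    } else {
      // The plain/LDAP path never triggers loading of the Hadoop-dependent class.
    }
  }
}
{code}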




jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
https://reviews.apache.org/r/18185/#comment65650

The flow will be clearer with an else-if instead of an if.




jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
https://reviews.apache.org/r/18185/#comment65652

If useSsl == true, I think we should throw an exception (when http-kerberos 
is used) with an appropriate error message. Otherwise people would have a 
false sense of security that SSL is being used.




jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
https://reviews.apache.org/r/18185/#comment65653

can you get rid of these trailing white spaces



- Thejas Nair


On Feb. 17, 2014, 9:24 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18185/
 ---
 
 (Updated Feb. 17, 2014, 9:24 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-4764
 https://issues.apache.org/jira/browse/HIVE-4764
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Support Kerberos HTTP authentication for HiveServer2 running in http mode
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 13fc19b 
   jdbc/src/java/org/apache/hive/jdbc/HttpBasicAuthInterceptor.java 66eba1b 
   jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java 
 PRE-CREATION 
   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa 
   service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java 
 PRE-CREATION 
   
 service/src/java/org/apache/hive/service/auth/HttpAuthenticationException.java
  PRE-CREATION 
   service/src/java/org/apache/hive/service/cli/CLIService.java 56b357a 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java
  6fbc847 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java 
 a6ff6ce 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java 
 e77f043 
 
 Diff: https://reviews.apache.org/r/18185/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Review Request 18382: HIVE-3635: allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type

2014-02-21 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18382/
---

Review request for hive.


Bugs: HIVE-3635
https://issues.apache.org/jira/browse/HIVE-3635


Repository: hive-git


Description
---

1. Implemented the functionality, allowing LazyBoolean to accept these 
literals.
2. Added a configuration to control the functionality. Off by default.
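
A compact Java sketch of the extended literal handling follows. It assumes a
hypothetical helper rather than the actual LazyBoolean patch; the accepted
spellings follow the JIRA title ('t', 'T', '1', 'f', 'F', '0') and the strict
mode mirrors the off-by-default configuration mentioned above.
{code}
// Illustrative only; LazyBoolean in the real patch works on byte arrays and
// is wired to a HiveConf flag, both of which are omitted here.
public final class BooleanLiterals {
  private BooleanLiterals() {}

  /** Returns TRUE/FALSE, or null if the literal is not recognized. */
  static Boolean parse(String s, boolean extendedLiterals) {
    if (s.equalsIgnoreCase("true")) return Boolean.TRUE;
    if (s.equalsIgnoreCase("false")) return Boolean.FALSE;
    if (!extendedLiterals) return null;           // default: only TRUE/FALSE accepted
    if (s.equals("t") || s.equals("T") || s.equals("1")) return Boolean.TRUE;
    if (s.equals("f") || s.equals("F") || s.equals("0")) return Boolean.FALSE;
    return null;                                  // becomes NULL in the table
  }

  public static void main(String[] args) {
    System.out.println(parse("t", false));  // null: extended literals disabled
    System.out.println(parse("t", true));   // true
    System.out.println(parse("0", true));   // false
  }
}
{code}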


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 237b669 
  conf/hive-default.xml.template f7f50e3 
  ql/src/test/queries/clientpositive/bool_literal.q PRE-CREATION 
  ql/src/test/results/clientpositive/bool_literal.q.out PRE-CREATION 
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBoolean.java c741c3a 
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java 66f79ed 
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 
606208c 
  
serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBooleanObjectInspector.java
 2cf7362 
  
serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java
 5f64697 

Diff: https://reviews.apache.org/r/18382/diff/


Testing
---

Added a .q test which verifies the behavior both with the functionality turned 
on and with it turned off.


Thanks,

Xuefu Zhang



Re: Review Request 18382: HIVE-3635: allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type

2014-02-21 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18382/
---

(Updated Feb. 21, 2014, 11:09 p.m.)


Review request for hive.


Bugs: HIVE-3635
https://issues.apache.org/jira/browse/HIVE-3635


Repository: hive-git


Description
---

1. Implemented the functionality, allowing LazyBoolean to accept these 
literals.
2. Added a configuration to control the functionality. Off by default.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 237b669 
  conf/hive-default.xml.template f7f50e3 
  data/files/bool_literal.txt PRE-CREATION 
  ql/src/test/queries/clientpositive/bool_literal.q PRE-CREATION 
  ql/src/test/results/clientpositive/bool_literal.q.out PRE-CREATION 
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBoolean.java c741c3a 
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java 66f79ed 
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 
606208c 
  
serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBooleanObjectInspector.java
 2cf7362 
  
serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java
 5f64697 

Diff: https://reviews.apache.org/r/18382/diff/


Testing
---

Added a .q test which verifies the behavior both with the functionality turned 
on and with it turned off.


Thanks,

Xuefu Zhang



[jira] [Updated] (HIVE-3635) allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type

2014-02-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-3635:
--

Attachment: HIVE-3635.2.patch

Patch #2 removed trailing spaces and added missing data file.

  allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for 
 the boolean hive type
 ---

 Key: HIVE-3635
 URL: https://issues.apache.org/jira/browse/HIVE-3635
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.9.0
Reporter: Alexander Alten-Lorenz
Assignee: Xuefu Zhang
 Attachments: HIVE-3635.1.patch, HIVE-3635.2.patch, HIVE-3635.patch


 interpret t as true and f as false for boolean types. PostgreSQL exports 
 represent it that way.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6380) Specify jars/files when creating permanent UDFs

2014-02-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909018#comment-13909018
 ] 

Hive QA commented on HIVE-6380:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12630225/HIVE-6380.4.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5175 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into6
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1437/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1437/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12630225

 Specify jars/files when creating permanent UDFs
 ---

 Key: HIVE-6380
 URL: https://issues.apache.org/jira/browse/HIVE-6380
 Project: Hive
  Issue Type: Sub-task
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6380.1.patch, HIVE-6380.2.patch, HIVE-6380.3.patch, 
 HIVE-6380.4.patch


 Need a way for a permanent UDF to reference jars/files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6380) Specify jars/files when creating permanent UDFs

2014-02-21 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909026#comment-13909026
 ] 

Jason Dere commented on HIVE-6380:
--

It looks like those 2 failures are due to HIVE-6479 and have been failing for 
the last several precommit runs.

 Specify jars/files when creating permanent UDFs
 ---

 Key: HIVE-6380
 URL: https://issues.apache.org/jira/browse/HIVE-6380
 Project: Hive
  Issue Type: Sub-task
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6380.1.patch, HIVE-6380.2.patch, HIVE-6380.3.patch, 
 HIVE-6380.4.patch


 Need a way for a permanent UDF to reference jars/files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6484) HiveServer2 doAs should be session aware both for secured and unsecured session implementation.

2014-02-21 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-6484:
--

 Summary: HiveServer2 doAs should be session aware both for secured 
and unsecured session implementation.
 Key: HIVE-6484
 URL: https://issues.apache.org/jira/browse/HIVE-6484
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


Currently, in the unsecured case, doAs is performed by decorating the 
TProcessor.process method. This has been causing cleanup issues, as we end up 
creating a new clientUgi for each request rather than for each session. Fixing 
this also cleans up the code.

[~thejas] Probably you can add more if you've seen other issues related to this.
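
A hedged sketch of the per-session alternative: the proxy UGI is created once
when the session opens and reused for every request, instead of being
recreated inside a decorated TProcessor.process call. The session wrapper
class below is hypothetical; UserGroupInformation.createProxyUser, doAs, and
FileSystem.closeAllForUGI are real Hadoop APIs.
{code}
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical session wrapper; only the UGI/FileSystem calls are real Hadoop APIs.
public class SessionDoAsSketch {
  private final UserGroupInformation sessionUgi;   // created once per session

  public SessionDoAsSketch(String endUser) throws IOException {
    this.sessionUgi = UserGroupInformation.createProxyUser(
        endUser, UserGroupInformation.getLoginUser());
  }

  /** Every request in the session reuses the same proxy UGI. */
  public <T> T runAs(PrivilegedExceptionAction<T> action)
      throws IOException, InterruptedException {
    return sessionUgi.doAs(action);
  }

  /** Called once on session close; also releases cached FileSystem objects. */
  public void close() throws IOException {
    FileSystem.closeAllForUGI(sessionUgi);
  }
}
{code}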



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6380) Specify jars/files when creating permanent UDFs

2014-02-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6380:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Jason!

 Specify jars/files when creating permanent UDFs
 ---

 Key: HIVE-6380
 URL: https://issues.apache.org/jira/browse/HIVE-6380
 Project: Hive
  Issue Type: Sub-task
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 0.13.0

 Attachments: HIVE-6380.1.patch, HIVE-6380.2.patch, HIVE-6380.3.patch, 
 HIVE-6380.4.patch


 Need a way for a permanent UDF to reference jars/files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6484) HiveServer2 doAs should be session aware both for secured and unsecured session implementation.

2014-02-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909041#comment-13909041
 ] 

Thejas M Nair commented on HIVE-6484:
-

The FS instance leak described in HIVE-4501 can be fixed with this change.


 HiveServer2 doAs should be session aware both for secured and unsecured 
 session implementation.
 ---

 Key: HIVE-6484
 URL: https://issues.apache.org/jira/browse/HIVE-6484
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


 Currently in unsecured case, the doAs is performed by decorating 
 TProcessor.process method. This has been causing cleanup issues as we end up 
 creating a new clientUgi for each request rather than for each session. This 
 also cleans up the code.
 [~thejas] Probably you can add more if you've seen other issues related to 
 this.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6439) Introduce CBO step in Semantic Analyzer

2014-02-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909040#comment-13909040
 ] 

Ashutosh Chauhan commented on HIVE-6439:


Is the plan to get the CBO optimizer working in the 0.13 timeframe? If not, I 
think we may want to delay this until after branching 0.13. Otherwise, this 
will go into the 0.13 release with a config which doesn't do anything and thus 
will be confusing for end users.

 Introduce CBO step in Semantic Analyzer
 ---

 Key: HIVE-6439
 URL: https://issues.apache.org/jira/browse/HIVE-6439
 Project: Hive
  Issue Type: Sub-task
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6439.1.patch, HIVE-6439.2.patch, HIVE-6439.4.patch, 
 HIVE-6439.5.patch


 This patch introduces a CBO step in SemanticAnalyzer. For now the 
 CostBasedOptimizer is an empty shell. 
 The contract between SemAly and CBO is:
 - The CBO step is controlled by the 'hive.enable.cbo.flag'. 
 - When true, the Hive SemAly will hand CBO a Hive operator tree (with 
 operators annotated with stats). If it can, CBO will return a better plan in 
 Hive AST form.
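 Restating the contract as a hedged Java sketch; the interface shapes and the
 flag wiring are illustrative only, since HIVE-6439 itself introduces an empty
 shell. When the flag is on, the semantic analyzer hands the optimizer a
 stats-annotated operator tree and uses the returned AST only if the optimizer
 produced one.
{code}
// Illustrative shapes only; the real Hive classes (Operator, ASTNode, HiveConf)
// are richer than these placeholders.
interface OperatorTree {}   // operator tree annotated with statistics
interface ASTNode {}        // Hive AST form of a query plan

interface CostBasedOptimizer {
  /** Returns a better plan as an AST, or null if it cannot improve the plan. */
  ASTNode optimize(OperatorTree annotatedTree);
}

class SemanticAnalyzerSketch {
  private final boolean cboEnabled;            // e.g. the 'hive.enable.cbo.flag'
  private final CostBasedOptimizer cbo;

  SemanticAnalyzerSketch(boolean cboEnabled, CostBasedOptimizer cbo) {
    this.cboEnabled = cboEnabled;
    this.cbo = cbo;
  }

  ASTNode maybeOptimize(OperatorTree treeWithStats, ASTNode originalAst) {
    if (!cboEnabled) {
      return originalAst;                      // CBO step skipped entirely
    }
    ASTNode improved = cbo.optimize(treeWithStats);
    return improved != null ? improved : originalAst;  // fall back on failure
  }
}
{code}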



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-5176) Wincompat : Changes for allowing various path compatibilities with Windows

2014-02-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-5176:
-

Attachment: HIVE-5176.3.patch

Patch v3. Pulled out many changes which look like they do the equivalent of 
HIVE-6343; I will add a patch to that JIRA. The patch should now be pretty much 
equivalent to HIVE-4448.

 Wincompat : Changes for allowing various path compatibilities with Windows
 --

 Key: HIVE-5176
 URL: https://issues.apache.org/jira/browse/HIVE-5176
 Project: Hive
  Issue Type: Sub-task
  Components: Windows
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-5176.2.patch, HIVE-5176.3.patch, HIVE-5176.patch


 We need to make certain changes across the board to allow us to read/parse 
 Windows paths. Some are escaping changes; others involve being strict about 
 how we read paths (through URL.encode/decode, etc.).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6434) Restrict function create/drop to admin roles

2014-02-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6434:
-

Attachment: HIVE-6434.3.patch

rebased with trunk - patch v3

 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6434) Restrict function create/drop to admin roles

2014-02-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6434:
-

Release Note: 
Restrict function create/drop to admin roles, if sql std auth is enabled. This 
would include temp/permanent functions, as well as macros.


  was:
Restrict function create/drop to admin roles, if sql std auth is enabled. This 
would include temp/permanent functions, as well as macros.

NO PRECOMMIT TESTS - dependent on HIVE-6330.


 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18162: HIVE-6434: Restrict function create/drop to admin roles

2014-02-21 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18162/
---

(Updated Feb. 22, 2014, 12:54 a.m.)


Review request for hive and Thejas Nair.


Changes
---

Only restrict create/drop of metastore functions; temp functions/macros are not 
affected.


Bugs: HIVE-6434
https://issues.apache.org/jira/browse/HIVE-6434


Repository: hive-git


Description
---

Add an output entity for the DB object to make sure only admin roles can 
add/drop functions/macros.
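
As a hedged illustration of what the output entity buys, the sketch below uses
stand-in names, not Hive's actual hooks or authorization-plugin classes: once
the database shows up in the statement's outputs, the SQL-standard authorizer
can require an admin role before allowing CREATE/DROP FUNCTION.
{code}
import java.util.Set;

// Stand-in types for illustration; Hive's WriteEntity / authorization plugin
// interfaces are more involved than this.
class DatabaseEntity {
  final String dbName;
  DatabaseEntity(String dbName) { this.dbName = dbName; }
}

class CreateFunctionAuthSketch {
  /** Semantic analysis: record the database as an output of the statement. */
  static void analyzeCreateFunction(String dbName, Set<DatabaseEntity> outputs) {
    outputs.add(new DatabaseEntity(dbName));   // the authorizer now sees the operation
  }

  /** Authorization: only admin roles may create/drop metastore functions. */
  static void authorize(Set<String> currentRoles, Set<DatabaseEntity> outputs) {
    if (!outputs.isEmpty() && !currentRoles.contains("admin")) {
      throw new SecurityException(
          "CREATE/DROP FUNCTION requires the admin role under SQL standard auth");
    }
  }
}
{code}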


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 
68a25e0 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java
 7dfd574 
  ql/src/test/queries/clientnegative/authorization_create_func1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/authorization_create_func1.q PRE-CREATION 
  ql/src/test/results/clientnegative/authorization_create_func1.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
393a3e8 
  ql/src/test/results/clientnegative/create_function_nonexistent_db.q.out 
ebb069e 
  ql/src/test/results/clientnegative/create_function_nonudf_class.q.out dd66afc 
  ql/src/test/results/clientnegative/udf_local_resource.q.out b6ea77d 
  ql/src/test/results/clientnegative/udf_nonexistent_resource.q.out ad70d54 
  ql/src/test/results/clientpositive/authorization_create_func1.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/create_func1.q.out 5a249c3 
  ql/src/test/results/clientpositive/udf_using.q.out 69e5f3b 

Diff: https://reviews.apache.org/r/18162/diff/


Testing
---

positive/negative q files added


Thanks,

Jason Dere



[jira] [Commented] (HIVE-6434) Restrict function create/drop to admin roles

2014-02-21 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909068#comment-13909068
 ] 

Jason Dere commented on HIVE-6434:
--

patch v3 also only restricts create/drop of metastore functions

 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6434) Restrict function create/drop to admin roles

2014-02-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6434:
-

Status: Patch Available  (was: Open)

 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6458) Add schema upgrade scripts for metastore changes related to permanent functions

2014-02-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6458:
-

Attachment: HIVE-6458.1.patch

 Add schema upgrade scripts for metastore changes related to permanent 
 functions
 ---

 Key: HIVE-6458
 URL: https://issues.apache.org/jira/browse/HIVE-6458
 Project: Hive
  Issue Type: Sub-task
  Components: UDF
Reporter: Jason Dere
 Attachments: HIVE-6458.1.patch


 Since HIVE-6330 has metastore changes, there need to be schema upgrade 
 scripts.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6439) Introduce CBO step in Semantic Analyzer

2014-02-21 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909075#comment-13909075
 ] 

Harish Butani commented on HIVE-6439:
-

yes agreed, let's add this post hive 0.13 branching.

 Introduce CBO step in Semantic Analyzer
 ---

 Key: HIVE-6439
 URL: https://issues.apache.org/jira/browse/HIVE-6439
 Project: Hive
  Issue Type: Sub-task
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6439.1.patch, HIVE-6439.2.patch, HIVE-6439.4.patch, 
 HIVE-6439.5.patch


 This patch introduces a CBO step in the SemanticAnalyzer. For now the 
 CostBasedOptimizer is an empty shell. 
 The contract between SemAly and CBO is:
 - The CBO step is controlled by the 'hive.enable.cbo.flag' setting. 
 - When it is true, the Hive SemAly will hand CBO a Hive operator tree (with 
 operators annotated with stats). If it can, CBO will return a better plan in 
 Hive AST form.
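
As a hedged illustration of the contract described above (not part of the 
patch): a client could toggle the flag per session and inspect the resulting 
plan. The connection URL and the 'src' table are assumptions; the flag name is 
taken verbatim from the description.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CboPlanCheck {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      stmt.execute("set hive.enable.cbo.flag=true");  // flag name from the description above
      // 'src' is a placeholder table; EXPLAIN shows whether CBO produced a different plan.
      try (ResultSet rs = stmt.executeQuery("explain select count(*) from src")) {
        while (rs.next()) {
          System.out.println(rs.getString(1));
        }
      }
    }
  }
}
{code}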



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6458) Add schema upgrade scripts for metastore changes related to permanent functions

2014-02-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6458:
-

Assignee: Jason Dere
  Status: Patch Available  (was: Open)

 Add schema upgrade scripts for metastore changes related to permanent 
 functions
 ---

 Key: HIVE-6458
 URL: https://issues.apache.org/jira/browse/HIVE-6458
 Project: Hive
  Issue Type: Sub-task
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6458.1.patch


 Since HIVE-6330 has metastore changes, there need to be schema upgrade 
 scripts.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6485) Downgrade to httpclient-4.2.5 in JDBC from httpclient-4.3.2

2014-02-21 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-6485:
--

 Summary: Downgrade to httpclient-4.2.5 in JDBC from 
httpclient-4.3.2
 Key: HIVE-6485
 URL: https://issues.apache.org/jira/browse/HIVE-6485
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


I had upgraded to the new version while adding SSL support for HTTP mode in 
HiveServer2, but that conflicts with httpclient-4.2.5, which is on the Hadoop 
classpath. I don't have a good reason to use httpclient-4.3.2, so it's better 
to match Hadoop.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Timeline for the Hive 0.13 release?

2014-02-21 Thread Lefty Leverenz
That's appropriate -- let the Hive release march forth on March 4th.


-- Lefty


On Fri, Feb 21, 2014 at 4:04 PM, Harish Butani hbut...@hortonworks.comwrote:

 Ok,let’s set it for March 4th .

 regards,
 Harish.

 On Feb 21, 2014, at 12:14 PM, Brock Noland br...@cloudera.com wrote:

  Might as well make it March 4th or 5th. Otherwise folks will burn
  weekend time to get patches in.
 
  On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com
 wrote:
  Yes makes sense.
  How about we postpone the branching until 10am PST March 3rd, which is
 the following Monday.
  Don’t see a point of setting the branch time to a Friday evening.
  Do people agree?
 
  regards,
  Harish.
 
  On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote:
 
  +1
 
  On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com
 wrote:
   Can we wait for a few more days for the branching? I have a few
 more
  security fixes that I would like to get in, and we also have a long
  pre-commit queue ahead right now. How about branching around Friday
 next
  week ?  By then hadoop 2.3 should also be out as that vote has been
  concluded, and we can get HIVE-6037 in as well.
  -Thejas
 
 
 
  On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com
 wrote:
 
  I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it
 pending
  tests.
 
  Brock
 
  On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com
 wrote:
  HIVE-6037 is for generating hive-default.template file from
 HiveConf.
  Could
  it be included in this release? If it's not, I'll suspend further
  rebasing
  of it till next release (conflicts too frequently).
 
 
  2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com
 :
 
  I'll try to catch up on the wikidocs backlog for 0.13.0 patches in
 time
  for
  the release.  It's a long and growing list, though, so no promises.
 
  Feel free to do your own documentation, or hand it off to a
 friendly
  in-house writer.
 
  -- Lefty, self-appointed Hive docs maven
 
 
 
  On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair 
 the...@hortonworks.com
  wrote:
 
  Sounds good to me.
 
 
  On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani 
  hbut...@hortonworks.com
  wrote:
 
  Hi,
 
   It's mid-Feb. Wanted to check if the community is ready to cut a
  branch.
  Could we cut the branch in a week , say 5pm PST 2/21/14?
  The goal is to keep the release cycle short: couple of weeks; so
  after
  the
  branch we go into stabilizing mode for hive 0.13, checking in
 only
  blocker/critical bug fixes.
 
  regards,
  Harish.
 
 
  On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com
  wrote:
 
  Hi,
 
  I agree that picking a date to branch and then restricting
  commits to
  that
  branch would be a less time intensive plan for the RM.
 
  Brock
 
 
  On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani 
  hbut...@hortonworks.com
  wrote:
 
  Yes agree it is time to start planning for the next release.
  I would like to volunteer to do the release management duties
 for
  this
  release(will be a great experience for me)
  Will be happy to do it, if the community is fine with this.
 
  regards,
  Harish.
 
  On Jan 17, 2014, at 7:05 PM, Thejas Nair 
 the...@hortonworks.com
 
  wrote:
 
  Yes, I think it is time to start planning for the next
 release.
  For 0.12 release I created a branch and then accepted patches
  that
  people asked to be included for sometime, before moving a
 phase
  of
  accepting only critical bug fixes. This turned out to be
  laborious.
  I think we should instead give everyone a few weeks to get any
  patches
  they are working on to be ready, cut the branch, and take in
  only
  critical bug fixes to the branch after that.
  How about cutting the branch around mid-February and targeting
  to
  release in a week or two after that.
 
  Thanks,
  Thejas
 
 
  On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach 
 c...@apache.org
 
  wrote:
  I was wondering what people think about setting a tentative
  date
  for
  the
  Hive 0.13 release? At an old Hive Contrib meeting we agreed
  that
  Hive
  should follow a time-based release model with new releases
  every
  four
  months. If we follow that schedule we're due for the next
  release
  in
  mid-February.
 
  Thoughts?
 
  Thanks.
 
  Carl
 

Re: Review Request 18208: Support LDAP authentication for HiveServer2 in http mode

2014-02-21 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18208/#review35224
---



service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java
https://reviews.apache.org/r/18208/#comment65680

I assume these classes are not actually required for the LDAP changes. Let's 
include only the LDAP-relevant changes in this jira.




service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java
https://reviews.apache.org/r/18208/#comment65676

It will be better to use 
AuthenticationProviderFactory.getAuthenticationProvider here.
That way custom auth (and PAM) support also gets automatically added.
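
A rough sketch of what that suggestion could look like in the servlet. The 
AuthenticationProviderFactory.getAuthenticationProvider(AuthMethods) call, the 
AuthMethods.getValidAuthMethod helper, and the provider's Authenticate(user, 
password) signature are assumptions about the existing auth package, not quoted 
from the patch.

{code}
import org.apache.hive.service.auth.AuthenticationProviderFactory;
import org.apache.hive.service.auth.AuthenticationProviderFactory.AuthMethods;
import org.apache.hive.service.auth.PasswdAuthenticationProvider;

public class HttpPasswordAuth {
  // Run the decoded Basic-auth credentials through the configured provider so
  // that LDAP, CUSTOM and PAM all take the same code path.
  static void authenticate(String authTypeStr, String user, String password)
      throws Exception {
    PasswdAuthenticationProvider provider =
        AuthenticationProviderFactory.getAuthenticationProvider(
            AuthMethods.getValidAuthMethod(authTypeStr));
    provider.Authenticate(user, password);  // throws on failure
  }
}
{code}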




service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java
https://reviews.apache.org/r/18208/#comment65678

doLdapAuth returning the username is not intuitive. I think it is better to 
pass the username and password to the function.




service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java
https://reviews.apache.org/r/18208/#comment65677

This is not really an error in terms of the server's operation, i.e. it does 
not affect the uptime of the server or indicate other server problems.
I think we should just use info-level logs if the client failed to authorize 
correctly. The client is the one that should get error messages for this.



service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java
https://reviews.apache.org/r/18208/#comment65679

I think we should just call this function with username and password as 
arguments.



- Thejas Nair


On Feb. 18, 2014, 10:29 a.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18208/
 ---
 
 (Updated Feb. 18, 2014, 10:29 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-6350
 https://issues.apache.org/jira/browse/HIVE-6350
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Support LDAP authentication for HiveServer2 in http mode
 
 
 Diffs
 -
 
   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa 
   service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java 
 PRE-CREATION 
   
 service/src/java/org/apache/hive/service/auth/HttpAuthenticationException.java
  PRE-CREATION 
   service/src/java/org/apache/hive/service/auth/HttpCLIServiceProcessor.java 
 PRE-CREATION 
   
 service/src/java/org/apache/hive/service/auth/HttpCLIServiceUGIProcessor.java 
 PRE-CREATION 
   
 service/src/java/org/apache/hive/service/auth/LdapAuthenticationProviderImpl.java
  5342214 
   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
 bfe0e7b 
   
 service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java 
 a6ff6ce 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java 
 e77f043 
   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
 9e9a60d 
 
 Diff: https://reviews.apache.org/r/18208/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Review Request 18390: HS2 should return describe table results without space padding

2014-02-21 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18390/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-4545
https://issues.apache.org/jira/browse/HIVE-4545


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-4545


Diffs
-

  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
4df4dd5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 29f1e57 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java
 7fceb65 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
 de788f7 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatter.java
 b9be932 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java
 8173200 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 99b6d77 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
445c858 

Diff: https://reviews.apache.org/r/18390/diff/


Testing
---

TestJdbcDriver2


Thanks,

Vaibhav Gumashta



[jira] [Updated] (HIVE-4545) HS2 should return describe table results without space padding

2014-02-21 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-4545:
---

Attachment: HIVE-4545.5.patch

New Rb link: https://reviews.apache.org/r/18390/ (couldn't update the previous 
one as Thejas was the creator).

This patch gets rid of the new config that was introduced in the previous patch 
(per [~hagleitn]'s feedback) by adding a way to detect whether the query is 
being served from HiveServer2.
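
For illustration, a minimal sketch of how a formatter could branch on that 
detection; the SessionState accessor name below is an assumption, not 
necessarily what the patch adds.

{code}
import org.apache.hadoop.hive.ql.session.SessionState;

public class DescribePadding {
  // Pad column names only for human-facing CLI output; HS2/JDBC clients get raw values.
  static String formatName(String name) {
    return servedFromHiveServer2() ? name : String.format("%-24s", name);
  }

  static boolean servedFromHiveServer2() {
    SessionState ss = SessionState.get();
    return ss != null && ss.isHiveServerQuery();  // accessor name is an assumption
  }
}
{code}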

 HS2 should return describe table results without space padding
 --

 Key: HIVE-4545
 URL: https://issues.apache.org/jira/browse/HIVE-4545
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch, 
 HIVE-4545.4.patch, HIVE-4545.5.patch


 HIVE-3140 changed the behavior of 'DESCRIBE table;' to be like 'DESCRIBE 
 FORMATTED table;'. HIVE-3140 introduced changes to not print the header in 
 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with spaces 
 for the 'DESCRIBE table;' query.
 As the jdbc/odbc results are not for direct human consumption, the space 
 padding should not be done for HiveServer2.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-4545) HS2 should return describe table results without space padding

2014-02-21 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-4545:
---

Status: Patch Available  (was: Open)

 HS2 should return describe table results without space padding
 --

 Key: HIVE-4545
 URL: https://issues.apache.org/jira/browse/HIVE-4545
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch, 
 HIVE-4545.4.patch, HIVE-4545.5.patch


 HIVE-3140 changed the behavior of 'DESCRIBE table;' to be like 'DESCRIBE 
 FORMATTED table;'. HIVE-3140 introduced changes to not print the header in 
 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with spaces 
 for the 'DESCRIBE table;' query.
 As the jdbc/odbc results are not for direct human consumption, the space 
 padding should not be done for HiveServer2.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18390: HS2 should return describe table results without space padding

2014-02-21 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18390/
---

(Updated Feb. 22, 2014, 2:22 a.m.)


Review request for hive, Gunther Hagleitner and Thejas Nair.


Bugs: HIVE-4545
https://issues.apache.org/jira/browse/HIVE-4545


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-4545


Diffs (updated)
-

  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
4df4dd5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 29f1e57 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java
 7fceb65 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
 de788f7 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatter.java
 b9be932 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java
 8173200 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 99b6d77 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
445c858 

Diff: https://reviews.apache.org/r/18390/diff/


Testing
---

TestJdbcDriver2


Thanks,

Vaibhav Gumashta



[jira] [Commented] (HIVE-4545) HS2 should return describe table results without space padding

2014-02-21 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909164#comment-13909164
 ] 

Vaibhav Gumashta commented on HIVE-4545:


[~hagleitn] Whenever you get time, the jira is up for review. Thanks in advance!

 HS2 should return describe table results without space padding
 --

 Key: HIVE-4545
 URL: https://issues.apache.org/jira/browse/HIVE-4545
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch, 
 HIVE-4545.4.patch, HIVE-4545.5.patch


 HIVE-3140 changed the behavior of 'DESCRIBE table;' to be like 'DESCRIBE 
 FORMATTED table;'. HIVE-3140 introduced changes to not print the header in 
 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with spaces 
 for the 'DESCRIBE table;' query.
 As the jdbc/odbc results are not for direct human consumption, the space 
 padding should not be done for HiveServer2.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6343) Allow Hive client to load hdfs paths from hive aux jars

2014-02-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6343:
-

Attachment: HIVE-6343.1.patch

Attaching patch v1. One thing I'm unsure about is how many times the remote 
file will get downloaded. I would assume that any local processes spun off by 
the client would attempt to load the jars in hive.aux.jars.path (and thus 
download the remote jars again), but Hadoop tasks should not try to load 
hive.aux.jars.path, since they use -libjars, right?

 Allow Hive client to load hdfs paths from hive aux jars
 ---

 Key: HIVE-6343
 URL: https://issues.apache.org/jira/browse/HIVE-6343
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6343.1.patch


 Hive client will add local aux jars to class loader but will ignore hdfs 
 paths. We could have the client download hdfs files, similar to how ADD JAR 
 does.
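
Not the patch itself, just a sketch of the approach under the stated 
assumption that the client localizes each hdfs:// entry before handing it to a 
classloader; the class and method names below are illustrative.

{code}
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AuxJarLocalizer {
  // Copy an hdfs:// jar into a local directory and return a URL usable by a
  // classloader, roughly what ADD JAR already does for session resources.
  static URL localize(String hdfsJar, Configuration conf, File localDir) throws Exception {
    Path src = new Path(hdfsJar);
    File local = new File(localDir, src.getName());
    FileSystem.get(src.toUri(), conf).copyToLocalFile(src, new Path(local.toURI()));
    return local.toURI().toURL();
  }

  static ClassLoader extend(ClassLoader parent, URL... jars) {
    return new URLClassLoader(jars, parent);
  }
}
{code}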



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6084) WebHCat TestStreaming_2 e2e test should return FAILURE after HIVE-5511

2014-02-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909177#comment-13909177
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-6084:
-

Patch attached. The patch is based on the tests run over Hadoop 2.

 WebHCat TestStreaming_2 e2e test should return FAILURE after HIVE-5511
 --

 Key: HIVE-6084
 URL: https://issues.apache.org/jira/browse/HIVE-6084
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Hari Sankar Sivarama Subramaniyan





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6429) MapJoinKey has large memory overhead in typical cases

2014-02-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909178#comment-13909178
 ] 

Sergey Shelukhin commented on HIVE-6429:


Grafting this onto BinarySortableSerDe will take a little bit of effort... will 
attach a patch late this evening, or on Sunday evening.

 MapJoinKey has large memory overhead in typical cases
 -

 Key: HIVE-6429
 URL: https://issues.apache.org/jira/browse/HIVE-6429
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, 
 HIVE-6429.03.patch, HIVE-6429.WIP.patch, HIVE-6429.patch


 The only thing that MJK really needs is hashCode and equals (well, and 
 construction), so there's no need to have array of writables in there. 
 Assuming all the keys for a table have the same structure, for the common 
 case where keys are primitive types, we can store something like a byte array 
 combination of keys to reduce the memory usage. Will probably speed up 
 compares too.
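
A minimal sketch of that idea (not the actual MapJoinKey implementation in the 
patch): hold the serialized key bytes and base hashCode/equals on them instead 
of on an array of writables.

{code}
import java.util.Arrays;

public final class BytesKey {
  private final byte[] bytes;  // serialized representation of all key columns

  public BytesKey(byte[] bytes) {
    this.bytes = bytes;
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(bytes);
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof BytesKey && Arrays.equals(bytes, ((BytesKey) o).bytes);
  }
}
{code}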



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6084) WebHCat TestStreaming_2 e2e test should return FAILURE after HIVE-5511

2014-02-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6084:


Attachment: HIVE-6084.1.patch

cc [~sushanth] and [~ekoifman] for review

 WebHCat TestStreaming_2 e2e test should return FAILURE after HIVE-5511
 --

 Key: HIVE-6084
 URL: https://issues.apache.org/jira/browse/HIVE-6084
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6084.1.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization

2014-02-21 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-6455:
-

Attachment: HIVE-6455.7.patch

Added a fix that solves an issue with stats aggregation when the stats 
aggregation key exceeds the max key prefix length. Fixed other failing tests.

 Scalable dynamic partitioning and bucketing optimization
 

 Key: HIVE-6455
 URL: https://issues.apache.org/jira/browse/HIVE-6455
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: optimization
 Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.2.patch, 
 HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, 
 HIVE-6455.6.patch, HIVE-6455.7.patch


 The current implementation of dynamic partitioning works by keeping at least 
 one record writer open per dynamic partition directory. In case of bucketing 
 there can be multispray file writers, which further adds to the number of 
 open record writers. The record writers of column-oriented file formats (like 
 ORC, RCFile etc.) keep some sort of in-memory buffers (value buffers or 
 compression buffers) open all the time to buffer up the rows and compress 
 them before flushing them to disk. Since these buffers are maintained on a 
 per-column basis, the amount of constant memory required at runtime increases 
 as the number of partitions and the number of columns per partition increase. 
 This often leads to OutOfMemory (OOM) exceptions in mappers or reducers, 
 depending on the number of open record writers. Users often tune the JVM heap 
 size (runtime memory) to get past such OOM issues. 
 With this optimization, the dynamic partition columns and bucketing columns 
 (in the case of bucketed tables) are sorted before being fed to the reducers. 
 Since the partitioning and bucketing columns are sorted, each reducer can 
 keep only one record writer open at any time, thereby reducing the memory 
 pressure on the reducers. This optimization scales well as the number of 
 partitions and the number of columns per partition increase, at the cost of 
 sorting the columns.
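
A minimal sketch of the reducer-side pattern implied above, under the 
assumption that rows arrive grouped by their (partition, bucket) key; the class 
and method names are illustrative, not from the patch.

{code}
import java.io.IOException;

// Because input is sorted on the partition/bucket key, at most one writer is open at a time.
abstract class SortedPartitionWriter<R> {
  private String currentKey;

  void process(String partitionKey, R row) throws IOException {
    if (!partitionKey.equals(currentKey)) {
      closeCurrentWriter();        // no-op when nothing is open yet
      openWriter(partitionKey);    // single open writer for the new partition
      currentKey = partitionKey;
    }
    write(row);
  }

  abstract void openWriter(String partitionKey) throws IOException;
  abstract void closeCurrentWriter() throws IOException;
  abstract void write(R row) throws IOException;
}
{code}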



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.

2014-02-21 Thread Shivaraju Gowda (JIRA)
Shivaraju Gowda created HIVE-6486:
-

 Summary: Support secure Subject.doAs() in HiveServer2 JDBC client.
 Key: HIVE-6486
 URL: https://issues.apache.org/jira/browse/HIVE-6486
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.12.0, 0.11.0
Reporter: Shivaraju Gowda


HIVE-5155 addresses the problem of Kerberos authentication in a multi-user 
middleware server using a proxy user. In this mode the principal used by the 
middleware server has privileges to impersonate selected users in Hive/Hadoop. 

This enhancement is to support Subject.doAs() authentication in the Hive JDBC 
layer so that the end user's Kerberos Subject is passed through in the 
middleware server. With this improvement there won't be any additional setup 
in the server to grant proxy privileges to some users, and there won't be a 
need to specify a proxy user in the JDBC client. This version should also be 
more secure since it won't require principals with the privileges to 
impersonate other users in the Hive/Hadoop setup.
 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.

2014-02-21 Thread Shivaraju Gowda (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivaraju Gowda updated HIVE-6486:
--

Status: Patch Available  (was: Open)

Usage:
 Add identityContext=fromKerberosSubject to the URL to enable it.
Ex: 
jdbc:hive2://hive.example.com:1/default;principal=hive/localhost.localdom...@example.com;identityContext=fromKerberosSubject;


 Support secure Subject.doAs() in HiveServer2 JDBC client.
 -

 Key: HIVE-6486
 URL: https://issues.apache.org/jira/browse/HIVE-6486
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.12.0, 0.11.0
Reporter: Shivaraju Gowda

 HIVE-5155 addresses the problem of Kerberos authentication in a multi-user 
 middleware server using a proxy user. In this mode the principal used by the 
 middleware server has privileges to impersonate selected users in 
 Hive/Hadoop. 
 This enhancement is to support Subject.doAs() authentication in the Hive JDBC 
 layer so that the end user's Kerberos Subject is passed through in the 
 middleware server. With this improvement there won't be any additional setup 
 in the server to grant proxy privileges to some users, and there won't be a 
 need to specify a proxy user in the JDBC client. This version should also be 
 more secure since it won't require principals with the privileges to 
 impersonate other users in the Hive/Hadoop setup.
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.

2014-02-21 Thread Shivaraju Gowda (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivaraju Gowda updated HIVE-6486:
--

Status: Open  (was: Patch Available)

 Support secure Subject.doAs() in HiveServer2 JDBC client.
 -

 Key: HIVE-6486
 URL: https://issues.apache.org/jira/browse/HIVE-6486
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.12.0, 0.11.0
Reporter: Shivaraju Gowda
 Attachments: Hive_011_Support-Subject_doAS.patch


 HIVE-5155 addresses the problem of Kerberos authentication in a multi-user 
 middleware server using a proxy user. In this mode the principal used by the 
 middleware server has privileges to impersonate selected users in 
 Hive/Hadoop. 
 This enhancement is to support Subject.doAs() authentication in the Hive JDBC 
 layer so that the end user's Kerberos Subject is passed through in the 
 middleware server. With this improvement there won't be any additional setup 
 in the server to grant proxy privileges to some users, and there won't be a 
 need to specify a proxy user in the JDBC client. This version should also be 
 more secure since it won't require principals with the privileges to 
 impersonate other users in the Hive/Hadoop setup.
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.

2014-02-21 Thread Shivaraju Gowda (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivaraju Gowda updated HIVE-6486:
--

Status: Patch Available  (was: Open)

The attached patch Hive_011_Support-Subject_doAS.patch contains a fix on top of 
 Hive 0.11 head. 

To enable the feature, add identityContext=fromKerberosSubject to the JDBC 
URL. 
Ex: 
jdbc:hive2://hive.example.com:1/default;principal=hive/localhost.localdom...@example.com;identityContext=fromKerberosSubject;

The patch will affect only two jars.
hive-jdbc-0.11.0.jar
hive-service-0.11.0.jar
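
A hedged sketch of how a middleware caller might use this; the connection URL 
below is illustrative, and only the identityContext=fromKerberosSubject 
parameter is taken from the patch description.

{code}
import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;

import javax.security.auth.Subject;

public class DoAsJdbc {
  // Open the HiveServer2 connection inside the end user's Kerberos Subject,
  // so no proxy-user setup is needed on the server side.
  static Connection connectAs(Subject endUserSubject, final String url) throws Exception {
    return Subject.doAs(endUserSubject, new PrivilegedExceptionAction<Connection>() {
      @Override
      public Connection run() throws Exception {
        // e.g. "jdbc:hive2://host:10000/default;principal=...;identityContext=fromKerberosSubject"
        return DriverManager.getConnection(url);
      }
    });
  }
}
{code}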

 Support secure Subject.doAs() in HiveServer2 JDBC client.
 -

 Key: HIVE-6486
 URL: https://issues.apache.org/jira/browse/HIVE-6486
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.12.0, 0.11.0
Reporter: Shivaraju Gowda
 Attachments: Hive_011_Support-Subject_doAS.patch


 HIVE-5155 addresses the problem of Kerberos authentication in a multi-user 
 middleware server using a proxy user. In this mode the principal used by the 
 middleware server has privileges to impersonate selected users in 
 Hive/Hadoop. 
 This enhancement is to support Subject.doAs() authentication in the Hive JDBC 
 layer so that the end user's Kerberos Subject is passed through in the 
 middleware server. With this improvement there won't be any additional setup 
 in the server to grant proxy privileges to some users, and there won't be a 
 need to specify a proxy user in the JDBC client. This version should also be 
 more secure since it won't require principals with the privileges to 
 impersonate other users in the Hive/Hadoop setup.
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.

2014-02-21 Thread Shivaraju Gowda (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909225#comment-13909225
 ] 

Shivaraju Gowda commented on HIVE-6486:
---

Other than the use case in the Description of this issue, the attached patch 
will also enhance Kerberos support in the Hive JDBC driver by allowing the 
user to log in to Kerberos programmatically (i.e. without a keytab, ticket 
cache, etc.). Furthermore, this is done without a dependency on other 
components' jars (hadoop-core*.jar).


 Support secure Subject.doAs() in HiveServer2 JDBC client.
 -

 Key: HIVE-6486
 URL: https://issues.apache.org/jira/browse/HIVE-6486
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.11.0, 0.12.0
Reporter: Shivaraju Gowda
 Attachments: Hive_011_Support-Subject_doAS.patch


 HIVE-5155 addresses the problem of Kerberos authentication in a multi-user 
 middleware server using a proxy user. In this mode the principal used by the 
 middleware server has privileges to impersonate selected users in 
 Hive/Hadoop. 
 This enhancement is to support Subject.doAs() authentication in the Hive JDBC 
 layer so that the end user's Kerberos Subject is passed through in the 
 middleware server. With this improvement there won't be any additional setup 
 in the server to grant proxy privileges to some users, and there won't be a 
 need to specify a proxy user in the JDBC client. This version should also be 
 more secure since it won't require principals with the privileges to 
 impersonate other users in the Hive/Hadoop setup.
  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Precommit queue

2014-02-21 Thread Thejas Nair
Hi Brock,
Do you know why the tests are taking almost twice as long in recent runs?
Is it related to the EC2 spot price spikes?
Thanks,
Thejas


On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote:

 There was a ec2 spot price spike overnight which combined with
 everyone trying to get patches in for the branching has resulted in a
 massive queue:

 http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/

 ~25 builds in the queue

 Brock




[jira] [Commented] (HIVE-6393) Support unqualified column references in Joining conditions

2014-02-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909267#comment-13909267
 ] 

Hive QA commented on HIVE-6393:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629966/HIVE-6393.1.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5180 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1438/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1438/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629966

 Support unqualified column references in Joining conditions
 ---

 Key: HIVE-6393
 URL: https://issues.apache.org/jira/browse/HIVE-6393
 Project: Hive
  Issue Type: Improvement
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6393.1.patch


 Support queries of the form:
 {noformat}
 create table r1(a int);
 create table r2(b int);
 select a, b
 from r1 join r2 on a = b
 {noformat}
 This becomes more useful in old style syntax:
 {noformat}
 select a, b
 from r1, r2
 where a = b
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6429) MapJoinKey has large memory overhead in typical cases

2014-02-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909268#comment-13909268
 ] 

Lefty Leverenz commented on HIVE-6429:
--

*hive.mapjoin.optimized.keys* needs a definition ... but I'm not sure where, 
because that depends on the state of HIVE-6037 which will put the config param 
definitions into HiveConf.java and then generate hive-default.xml.template from 
HiveConf.java.

See comment on HIVE-6455 for details (but note that HIVE-6037 has been 
reopened):  [17 Feb 2014 22:26 comment 
|https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13903744#comment-13903744].

So if this commits before HIVE-6037, *hive.mapjoin.optimized.keys* should be 
documented in hive-default.xml.template as usual but if it commits after 
HIVE-6037 a definition should be added to the patched version of HiveConf.java. 
 In any case, I'll add it to the wiki with a release note.



 MapJoinKey has large memory overhead in typical cases
 -

 Key: HIVE-6429
 URL: https://issues.apache.org/jira/browse/HIVE-6429
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, 
 HIVE-6429.03.patch, HIVE-6429.WIP.patch, HIVE-6429.patch


 The only thing that MJK really needs is hashCode and equals (well, and 
 construction), so there's no need to have array of writables in there. 
 Assuming all the keys for a table have the same structure, for the common 
 case where keys are primitive types, we can store something like a byte array 
 combination of keys to reduce the memory usage. Will probably speed up 
 compares too.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf

2014-02-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909274#comment-13909274
 ] 

Lefty Leverenz commented on HIVE-6037:
--

I'm keeping a list of new configuration parameters that were committed after 
the 2/17 patch or might be committed in time for 0.13.0.  Here's the current 
list, not guaranteed to be complete:

* HIVE-860 :  hive.cache.runtime.jars
* HIVE-6325:  hive.server2.tez.default.queues, 
hive.server2.tez.sessions.per.default.queue, 
hive.server2.tez.initialize.default.sessions
* HIVE-6382:  hive.exec.orc.skip.corrupt.data  (committed 2/20)
* HIVE-6429:  hive.mapjoin.optimized.keys
* HIVE-6455:  hive.optimize.sort.dynamic.partition

 Synchronize HiveConf with hive-default.xml.template and support show conf
 -

 Key: HIVE-6037
 URL: https://issues.apache.org/jira/browse/HIVE-6037
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.13.0

 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037.1.patch.txt, 
 HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, HIVE-6037.12.patch.txt, 
 HIVE-6037.14.patch.txt, HIVE-6037.15.patch.txt, HIVE-6037.2.patch.txt, 
 HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, HIVE-6037.6.patch.txt, 
 HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, HIVE-6037.9.patch.txt, 
 HIVE-6037.patch


 see HIVE-5879



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18230: HIVE-6429 MapJoinKey has large memory overhead in typical cases

2014-02-21 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18230/
---

(Updated Feb. 22, 2014, 7:34 a.m.)


Review request for hive, Gunther Hagleitner and Jitendra Pandey.


Repository: hive-git


Description
---

See JIRA


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 237b669 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java 988cc57 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 24f1229 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 46e37c2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java c0f4cd7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java 5cf347b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
61545b5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
2ac0928 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyObject.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
 9ce0ae6 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 83ba0f0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 47f9d21 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java
 581046e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
2466a3b 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSMBMapJoinOperator.java 
997202f 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
 c541ad2 
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java 
a103a51 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
 61c5741 
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java 2cb1ac3 

Diff: https://reviews.apache.org/r/18230/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Updated] (HIVE-6429) MapJoinKey has large memory overhead in typical cases

2014-02-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6429:
---

Attachment: HIVE-6429.04.patch

For now this addresses the other feedback... I will have a separate patch to 
use BinarySortableSerDe; I just need to hack around the vectorized path, but I 
don't think it's worth it: it's convoluted and still has to keep the type 
array and a separate path for vectorization. There are also additional changes 
because, for example, hasAnyNulls would be complicated and expensive with the 
BSSD format, so it has to be additionally retrieved at key creation time for 
the big table key in MJO.

 MapJoinKey has large memory overhead in typical cases
 -

 Key: HIVE-6429
 URL: https://issues.apache.org/jira/browse/HIVE-6429
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, 
 HIVE-6429.03.patch, HIVE-6429.04.patch, HIVE-6429.WIP.patch, HIVE-6429.patch


 The only thing that MJK really needs is hashCode and equals (well, and 
 construction), so there's no need to have array of writables in there. 
 Assuming all the keys for a table have the same structure, for the common 
 case where keys are primitive types, we can store something like a byte array 
 combination of keys to reduce the memory usage. Will probably speed up 
 compares too.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6380) Specify jars/files when creating permanent UDFs

2014-02-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909295#comment-13909295
 ] 

Lefty Leverenz commented on HIVE-6380:
--

This needs documentation.  If you add a release note, I can put the information 
in the wiki.  Or you can edit the wiki yourself, of course.

In the example syntax (2nd comment) do the commas mean exclusive or?

{code}
CREATE FUNCTION udfname AS 'my.udf.class'
USING JAR '/path/to/myjar.jar', FILE '/path/to/file', ARCHIVE 
'/path/to/archive.tgz';
{code}

Doc locations in the wiki:

*  [CREATE FUNCTION syntax 
|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateFunction]
*  [Creating Custom UDFs 
|https://cwiki.apache.org/confluence/display/Hive/HivePlugins]

 Specify jars/files when creating permanent UDFs
 ---

 Key: HIVE-6380
 URL: https://issues.apache.org/jira/browse/HIVE-6380
 Project: Hive
  Issue Type: Sub-task
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Fix For: 0.13.0

 Attachments: HIVE-6380.1.patch, HIVE-6380.2.patch, HIVE-6380.3.patch, 
 HIVE-6380.4.patch


 Need a way for a permanent UDF to reference jars/files.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)