[jira] [Commented] (HIVE-3050) JDBC should provide metadata for columns whether a column is a partition column or not
[ https://issues.apache.org/jira/browse/HIVE-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908102#comment-13908102 ]

chandra sekhar gunturi commented on HIVE-3050:
----------------------------------------------

This is required even for creating the load command when a Hive table is partitioned. When the Hive table is partitioned, the load command must include a PARTITION clause. To generate the load command programmatically based on metadata, this functionality is useful.

JDBC should provide metadata for columns whether a column is a partition column or not
--------------------------------------------------------------------------------------

                Key: HIVE-3050
                URL: https://issues.apache.org/jira/browse/HIVE-3050
            Project: Hive
         Issue Type: Improvement
         Components: JDBC
   Affects Versions: 0.10.0
           Reporter: Navis
           Assignee: Navis
           Priority: Minor

Trivial request from UI developers.

{code}
DatabaseMetaData databaseMetaData = connection.getMetaData();
ResultSet rs = databaseMetaData.getColumns(null, null, tableName, null);
boolean partitionKey = rs.getBoolean(IS_PARTITION_COLUMN);
{code}

It's not a standard JDBC column, but it seemed useful.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
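As a sketch of how such column metadata could drive programmatic LOAD generation (the `Column` descriptor, method names, and sample column names here are illustrative assumptions, not Hive's actual JDBC result layout — in practice each `Column` would be populated from one row of `DatabaseMetaData.getColumns()` plus the proposed `IS_PARTITION_COLUMN` flag):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

public class LoadCommandBuilder {
    // Minimal stand-in for one row of DatabaseMetaData.getColumns()
    // plus the proposed IS_PARTITION_COLUMN flag.
    static final class Column {
        final String name;
        final boolean isPartitionKey;
        Column(String name, boolean isPartitionKey) {
            this.name = name;
            this.isPartitionKey = isPartitionKey;
        }
    }

    // Build a LOAD DATA statement, appending a PARTITION clause only when
    // the table has partition columns. Partition values are caller-supplied
    // and must line up with the partition columns in order.
    static String buildLoad(String path, String table,
                            List<Column> cols, List<String> partValues) {
        StringBuilder sb = new StringBuilder();
        sb.append("LOAD DATA INPATH '").append(path)
          .append("' INTO TABLE ").append(table);
        List<Column> partCols = new ArrayList<>();
        for (Column c : cols) {
            if (c.isPartitionKey) {
                partCols.add(c);
            }
        }
        if (!partCols.isEmpty()) {
            StringJoiner joiner = new StringJoiner(", ", " PARTITION (", ")");
            for (int i = 0; i < partCols.size(); i++) {
                joiner.add(partCols.get(i).name + "='" + partValues.get(i) + "'");
            }
            sb.append(joiner);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Column> cols = new ArrayList<>();
        cols.add(new Column("id", false));  // regular column
        cols.add(new Column("ds", true));   // partition column
        System.out.println(buildLoad("/tmp/data", "events", cols,
                List.of("2014-02-21")));
        // LOAD DATA INPATH '/tmp/data' INTO TABLE events PARTITION (ds='2014-02-21')
    }
}
```

Without the partition-column flag in the metadata, a tool has no way to know whether the PARTITION clause is required, which is the gap this JIRA asks to close.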
Re: Parquet support (HIVE-5783)
Hi,

Storage handlers muddy the waters a bit IMO. That interface was written for storage that is not file-based, e.g. HBase, whereas Avro, Parquet, Sequence File, etc. are all file-based.

I think we have to be practical about confusion. There are so many Hadoop newbies out there, almost all of them new to Apache as well, that there is going to be some confusion. For example, one person who had been using Hadoop and Hive for a few months said to me that Hive moved *from* Apache to Hortonworks. At the end of the day, regardless of what we do, some level of confusion is going to persist amongst those new to the ecosystem.

With that said, I do think that an overview of Hive storage would be a great addition to our documentation.

Brock

On Fri, Feb 21, 2014 at 1:27 AM, Lefty Leverenz leftylever...@gmail.com wrote:

This is in the Terminology section (https://cwiki.apache.org/confluence/display/Hive/StorageHandlers#StorageHandlers-Terminology) of the Storage Handlers doc:

Storage handlers introduce a distinction between *native* and *non-native* tables. A native table is one which Hive knows how to manage and access without a storage handler; a non-native table is one which requires a storage handler.

It goes on to say that non-native tables are created with a STORED BY clause (as opposed to a STORED AS clause). Does that clarify or muddy the waters?

-- Lefty

On Thu, Feb 20, 2014 at 7:37 PM, Lefty Leverenz leftylever...@gmail.com wrote:

Some of these issues can be addressed in the documentation. The File Formats section of the Language Manual needs an overview, and that might be a good place to explain the differences between Hive-owned formats and external formats. Or the SerDe doc could be beefed up: Built-In SerDes (https://cwiki.apache.org/confluence/display/Hive/SerDe#SerDe-Built-inSerDes).
In the meantime, I've added a link to the Avro doc in the File Formats list and mentioned Parquet in DDL's Row Format, Storage Format, and SerDe section (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RowFormat,StorageFormat,andSerDe):

Use STORED AS PARQUET (without ROW FORMAT SERDE) for the Parquet (https://cwiki.apache.org/confluence/display/Hive/Parquet) columnar storage format in Hive 0.13.0 and later (https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Hive0.13andlater); or use ROW FORMAT SERDE ... STORED AS INPUTFORMAT ... OUTPUTFORMAT ... in Hive 0.10, 0.11, or 0.12 (https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Hive0.10-0.12).

Does that work?

-- Lefty

On Tue, Feb 18, 2014 at 1:31 PM, Brock Noland br...@cloudera.com wrote:

Hi Alan,

Response is inline, below:

On Tue, Feb 18, 2014 at 11:49 AM, Alan Gates ga...@hortonworks.com wrote:

Gunther, is it the case that there is anything extra that needs to be done to ship Parquet code with Hive right now? If I read the patch correctly, the Parquet jars were added to the pom and thus will be shipped as part of Hive. As long as it works out of the box when a user says "create table ... stored as parquet", why do we care whether the parquet jar is owned by Hive or another project?

The concern about feature mismatch in Parquet versus Hive is valid, but I'm not sure what to do about it other than ensure that there are good error messages. Users will often want to use non-Hive-based storage formats (Parquet, Avro, etc.). This means we need a good way to detect at SQL compile time that the underlying storage doesn't support the indicated data type and throw a good error.

Agreed, the error messages should absolutely be good. I will ensure this is the case via https://issues.apache.org/jira/browse/HIVE-6457

Also, it's important to be clear going forward about what Hive as a project is signing up for.
If tomorrow someone decides to add a new datatype or feature, we need to be clear that we expect the contributor to make this work for Hive-owned formats (text, RC, sequence, ORC) but not necessarily for external formats.

This makes sense to me.

I'd just like to add that I have a patch available to improve the hive-exec uber jar and general query speed: https://issues.apache.org/jira/browse/HIVE-860. Additionally, I have a patch available to finish the generic STORED AS functionality: https://issues.apache.org/jira/browse/HIVE-5976

Brock

--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Commented] (HIVE-6467) metastore upgrade script 016-HIVE-6386.derby.sql uses char rather than varchar
[ https://issues.apache.org/jira/browse/HIVE-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908369#comment-13908369 ]

Hive QA commented on HIVE-6467:
-------------------------------

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629932/HIVE-6467.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5141 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into6
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1433/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1433/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629932

metastore upgrade script 016-HIVE-6386.derby.sql uses char rather than varchar
------------------------------------------------------------------------------

                Key: HIVE-6467
                URL: https://issues.apache.org/jira/browse/HIVE-6467
            Project: Hive
         Issue Type: Bug
         Components: Metastore
           Reporter: Jason Dere
           Assignee: Jason Dere
        Attachments: HIVE-6467.1.patch

Trying to tinker with the metastore upgrade scripts, I did the following steps on a brand-new Derby DB.

From Derby:
{noformat}
run 'hive-schema-0.12.0.derby.sql';
run 'upgrade-0.12.0-to-0.13.0.derby.sql';
{noformat}

From Hive:
{noformat}
show tables;
{noformat}

I then hit the following error below.
It appears that in the metastore DBS table, the row for the default db was created with the value 'ROLE  ' (with trailing spaces) where 'ROLE' was expected.

{noformat}
2014-02-19 14:49:19,824 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - java.lang.IllegalArgumentException: No enum const class org.apache.hadoop.hive.metastore.api.PrincipalType.ROLE
	at java.lang.Enum.valueOf(Enum.java:196)
	at org.apache.hadoop.hive.metastore.api.PrincipalType.valueOf(PrincipalType.java:14)
	at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:521)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
	at com.sun.proxy.$Proxy7.getDatabase(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(HiveMetaStore.java:753)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
	at com.sun.proxy.$Proxy8.get_database(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:895)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
	at com.sun.proxy.$Proxy9.getDatabase(Unknown Source)
	at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1150)
	at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1139)
	at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2372)
	at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
{noformat}
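A minimal sketch of why CHAR padding breaks this enum lookup, using a local stand-in for the metastore's PrincipalType enum (the enum declared here is illustrative, not the real `org.apache.hadoop.hive.metastore.api.PrincipalType`): `Enum.valueOf` requires an exact string match, so a value read back from a fixed-width CHAR column, which Derby pads with trailing spaces, no longer matches any constant.

```java
public class CharPaddingDemo {
    // Local stand-in for org.apache.hadoop.hive.metastore.api.PrincipalType.
    enum PrincipalType { USER, ROLE, GROUP }

    public static void main(String[] args) {
        // A CHAR(10) column pads the stored value with spaces on read-back.
        String fromCharColumn = "ROLE      ";

        boolean failed = false;
        try {
            PrincipalType.valueOf(fromCharColumn); // exact match required
        } catch (IllegalArgumentException e) {
            failed = true; // the "No enum const ..." error from the trace
        }
        System.out.println("padded lookup failed: " + failed);

        // A VARCHAR column (or trimming the value) avoids the problem,
        // which is why the fix switches the upgrade script to varchar.
        PrincipalType t = PrincipalType.valueOf(fromCharColumn.trim());
        System.out.println("trimmed lookup: " + t);
    }
}
```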
Precommit queue
There was an EC2 spot price spike overnight which, combined with everyone trying to get patches in for the branching, has resulted in a massive queue:

http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/

~25 builds in the queue

Brock
[jira] [Updated] (HIVE-6467) metastore upgrade script 016-HIVE-6386.derby.sql uses char rather than varchar
[ https://issues.apache.org/jira/browse/HIVE-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-6467:
-----------------------------------
     Resolution: Fixed
  Fix Version/s: 0.13.0
         Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Jason!

metastore upgrade script 016-HIVE-6386.derby.sql uses char rather than varchar
------------------------------------------------------------------------------

                Key: HIVE-6467
                URL: https://issues.apache.org/jira/browse/HIVE-6467
            Project: Hive
         Issue Type: Bug
         Components: Metastore
           Reporter: Jason Dere
           Assignee: Jason Dere
            Fix For: 0.13.0
        Attachments: HIVE-6467.1.patch

Trying to tinker with the metastore upgrade scripts, I did the following steps on a brand-new Derby DB.

From Derby:
{noformat}
run 'hive-schema-0.12.0.derby.sql';
run 'upgrade-0.12.0-to-0.13.0.derby.sql';
{noformat}

From Hive:
{noformat}
show tables;
{noformat}

This failed: in the metastore DBS table, the row for the default db had been created with the value 'ROLE  ' (with trailing spaces) where 'ROLE' was expected.
[jira] [Commented] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried
[ https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908566#comment-13908566 ]

Hive QA commented on HIVE-6464:
-------------------------------

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629799/HIVE-6464.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5169 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into6
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1435/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1435/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629799

Test configuration: reduce the duration for which lock attempts are retried
---------------------------------------------------------------------------

                Key: HIVE-6464
                URL: https://issues.apache.org/jira/browse/HIVE-6464
            Project: Hive
         Issue Type: Bug
           Reporter: Thejas M Nair
           Assignee: Thejas M Nair
        Attachments: HIVE-6464.1.patch

Lock attempts are retried up to 100 times with a 60-second wait between attempts before giving up. Most tests attempt to disable locking but sometimes don't do so correctly, and changes can cause the locking to kick in. Locking fails (at least in the HS2-related tests) because of problems in creating the ZooKeeper entries in test mode. When a locking attempt kicks in and fails, it can end up waiting up to 6000 seconds before failing. As the tests are not trying to test parallel locking, there is no reason to wait this long in the tests. We should update the hive-site.xml used by the tests to use a smaller duration.
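A sketch of what such a test-only override in hive-site.xml could look like. The property names (hive.lock.numretries and hive.lock.sleep.between.retries), the seconds-based value format, and the chosen values are assumptions to verify against HiveConf for the Hive version in use:

```xml
<!-- Hypothetical test-only overrides for the tests' hive-site.xml -->
<property>
  <name>hive.lock.numretries</name>
  <value>5</value>
  <description>Retry locking only a few times in tests instead of the default 100.</description>
</property>
<property>
  <name>hive.lock.sleep.between.retries</name>
  <value>5</value>
  <description>Sleep briefly (seconds) between retries instead of 60.</description>
</property>
```

With values like these, a misbehaving lock attempt fails in under a minute rather than tying up a test run for 6000 seconds.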
Re: Review Request 18230: HIVE-6429 MapJoinKey has large memory overhead in typical cases
On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
> ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 177
> https://reviews.apache.org/r/18230/diff/3/?file=499105#file499105line177
>
> this seems ugly to me. can't you delegate to the hashtable loader to decide which class to use?

This is in the operator; the hashtable is already loaded, so there's no loader, and the key type is already decided. We want to ensure that 1) we use the same key type as that table, and 2) we don't make the decision for each key separately, rechecking again and again... but to decide based on the previous key we need the previous key.

On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
> ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 206
> https://reviews.apache.org/r/18230/diff/3/?file=499105#file499105line206
>
> the old code complies with the coding guidelines... can you change back?

There are coding guidelines? :)

On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java, line 104
> https://reviews.apache.org/r/18230/diff/3/?file=499107#file499107line104
>
> what causes the warning?

The cast to List.

On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBase.java, line 98
> https://reviews.apache.org/r/18230/diff/3/?file=499108#file499108line98
>
> looking for subclasses in the base is um not so nice. can't you avoid this static stuff? just make read a member and override the appropriate stuff. you're already passing in a ref object, so why not call read on that?

These methods actually create a class. Ref can be null (for the first key, when we determine what to use), and it can be non-reusable (e.g. when loading the hashtable, because it's the key from the table). Should the static stuff be moved to a separate factory class?

On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote:
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java, line 64
> https://reviews.apache.org/r/18230/diff/3/?file=499109#file499109line64
>
> this and the next field (vectorized) is going to be nigh impossible to maintain. you really care about fixed length here, is that it? we should add this to ObjectInspectorUtils or something like that. i think there's already some code in hive that returns the size of a datatype to you.

These sizes are actually the ones the DataOutput implementation below writes. The thing I care about is that serialized keys are comparable. Why would they be impossible to maintain?

- Sergey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18230/#review35132
---

On Feb. 20, 2014, 7:46 p.m., Sergey Shelukhin wrote:

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18230/
---

(Updated Feb. 20, 2014, 7:46 p.m.)

Review request for hive, Gunther Hagleitner and Jitendra Pandey.
Repository: hive-git

Description
---
See JIRA

Diffs
-----
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java 3cfaacf
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 24f1229
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 46e37c2
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java c0f4cd7
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 61545b5
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 2ac0928
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBase.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 9ce0ae6
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 83ba0f0
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 47f9d21
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java 581046e
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 2466a3b
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java c541ad2
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java a103a51
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java 2cb1ac3

Diff: https://reviews.apache.org/r/18230/diff/

Testing
---

Thanks,

Sergey Shelukhin
Re: Review Request 18230: HIVE-6429 MapJoinKey has large memory overhead in typical cases
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18230/
---

(Updated Feb. 21, 2014, 6:40 p.m.)

Review request for hive, Gunther Hagleitner and Jitendra Pandey.

Repository: hive-git

Description
---
See JIRA

Diffs (updated)
-----
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 24f1229
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 46e37c2
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java c0f4cd7
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 61545b5
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 2ac0928
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyObject.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 9ce0ae6
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 83ba0f0
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 47f9d21
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java 581046e
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 2466a3b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSMBMapJoinOperator.java 997202f
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java c541ad2
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java a103a51
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 61c5741
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java 2cb1ac3

Diff: https://reviews.apache.org/r/18230/diff/

Testing
---

Thanks,

Sergey Shelukhin
[jira] [Updated] (HIVE-6429) MapJoinKey has large memory overhead in typical cases
[ https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-6429:
-----------------------------------
    Attachment: HIVE-6429.03.patch

MapJoinKey has large memory overhead in typical cases
-----------------------------------------------------

                Key: HIVE-6429
                URL: https://issues.apache.org/jira/browse/HIVE-6429
            Project: Hive
         Issue Type: Improvement
           Reporter: Sergey Shelukhin
           Assignee: Sergey Shelukhin
        Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, HIVE-6429.03.patch, HIVE-6429.WIP.patch, HIVE-6429.patch

The only things that MapJoinKey really needs are hashCode and equals (well, and construction), so there's no need to keep an array of writables in there. Assuming all the keys for a table have the same structure, for the common case where keys are primitive types, we can store something like a byte-array combination of the keys to reduce memory usage. It will probably speed up compares too.
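The idea in the description, collapsing primitive key columns into a single byte array whose equals and hashCode operate on raw bytes, can be sketched as follows. This is an illustrative stand-in, not the actual MapJoinKeyBytes implementation; the fixed-width encoding and field layout are assumptions for the sketch:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class ByteBackedJoinKey {
    private final byte[] bytes;

    private ByteBackedJoinKey(byte[] bytes) {
        this.bytes = bytes;
    }

    // Serialize primitive key fields into one contiguous byte array instead of
    // holding an Object[] of writables per key. One small object + one array
    // per key, rather than a wrapper object per field.
    static ByteBackedJoinKey of(int intField, long longField) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(intField);   // 4 bytes, fixed width
        out.writeLong(longField); // 8 bytes, fixed width
        out.flush();
        return new ByteBackedJoinKey(bos.toByteArray());
    }

    // equals/hashCode on the raw bytes: a single array comparison replaces
    // per-field object comparisons, which is also why compares get cheaper.
    @Override
    public boolean equals(Object o) {
        return o instanceof ByteBackedJoinKey
            && Arrays.equals(bytes, ((ByteBackedJoinKey) o).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }

    public static void main(String[] args) throws IOException {
        ByteBackedJoinKey a = of(42, 7L);
        ByteBackedJoinKey b = of(42, 7L);
        ByteBackedJoinKey c = of(43, 7L);
        System.out.println(a.equals(b)); // true
        System.out.println(a.equals(c)); // false
    }
}
```

The precondition from the description matters here: this only works if every key for a table has the same structure, so that identical logical keys always serialize to identical bytes.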
Re: Timeline for the Hive 0.13 release?
Can we wait a few more days for the branching? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue right now. How about branching around Friday next week? By then Hadoop 2.3 should also be out, as that vote has concluded, and we can get HIVE-6037 in as well.

-Thejas

On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote:

I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending tests.

Brock

On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote:

HIVE-6037 is for generating the hive-default.template file from HiveConf. Could it be included in this release? If not, I'll suspend further rebasing of it until the next release (it conflicts too frequently).

2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com:

I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time for the release. It's a long and growing list, though, so no promises. Feel free to do your own documentation, or hand it off to a friendly in-house writer.

-- Lefty, self-appointed Hive docs maven

On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com wrote:

Sounds good to me.

On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani hbut...@hortonworks.com wrote:

Hi,

It's mid-Feb. I wanted to check if the community is ready to cut a branch. Could we cut the branch in a week, say 5pm PST 2/21/14? The goal is to keep the release cycle short (a couple of weeks), so after the branch we go into stabilizing mode for Hive 0.13, checking in only blocker/critical bug fixes.

regards,
Harish.

On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com wrote:

Hi,

I agree that picking a date to branch and then restricting commits to that branch would be a less time-intensive plan for the RM.

Brock

On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani hbut...@hortonworks.com wrote:

Yes, I agree it is time to start planning for the next release. I would like to volunteer to do the release management duties for this release (it will be a great experience for me). I will be happy to do it, if the community is fine with this.

regards,
Harish.

On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com wrote:

Yes, I think it is time to start planning for the next release. For the 0.12 release I created a branch and then accepted patches that people asked to be included for some time, before moving to a phase of accepting only critical bug fixes. This turned out to be laborious. I think we should instead give everyone a few weeks to get any patches they are working on ready, cut the branch, and take in only critical bug fixes to the branch after that. How about cutting the branch around mid-February and targeting a release a week or two after that?

Thanks,
Thejas

On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org wrote:

I was wondering what people think about setting a tentative date for the Hive 0.13 release. At an old Hive Contrib meeting we agreed that Hive should follow a time-based release model with new releases every four months. If we follow that schedule, we're due for the next release in mid-February. Thoughts?

Thanks.

Carl

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.

--
Apache MRUnit - Unit testing MapReduce -
Re: Review Request 18291: Add support for pluggable authentication modules (PAM) in Hive
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18291/
---

(Updated Feb. 21, 2014, 7:05 p.m.)

Review request for hive and Thejas Nair.

Bugs: HIVE-6466
    https://issues.apache.org/jira/browse/HIVE-6466

Repository: hive-git

Description
---
Refer to the jira: https://issues.apache.org/jira/browse/HIVE-6466

Diffs (updated)
-----
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7
  pom.xml 9aef665
  service/pom.xml b1002e2
  service/src/java/org/apache/hive/service/auth/AuthenticationProviderFactory.java b92fd83
  service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java PRE-CREATION

Diff: https://reviews.apache.org/r/18291/diff/

Testing
---

Thanks,

Vaibhav Gumashta
Re: Review Request 18291: Add support for pluggable authentication modules (PAM) in Hive
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18291/#review35180
---

service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java
https://reviews.apache.org/r/18291/#comment65606

    Done. Thanks for taking a look.

- Vaibhav Gumashta

On Feb. 19, 2014, 11:44 p.m., Vaibhav Gumashta wrote:

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18291/
---

(Updated Feb. 19, 2014, 11:44 p.m.)

Review request for hive and Thejas Nair.

Bugs: HIVE-6466
    https://issues.apache.org/jira/browse/HIVE-6466

Repository: hive-git

Description
---
Refer to the jira: https://issues.apache.org/jira/browse/HIVE-6466

Diffs
-----
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7
  pom.xml 9aef665
  service/pom.xml b1002e2
  service/src/java/org/apache/hive/service/auth/AuthenticationProviderFactory.java b92fd83
  service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java PRE-CREATION

Diff: https://reviews.apache.org/r/18291/diff/

Testing
---

Thanks,

Vaibhav Gumashta
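The diff above adds a PAM provider behind an AuthenticationProviderFactory, i.e. a class that validates a user/password pair and signals failure by throwing. A minimal self-contained sketch of that shape (the interface and exception here are local stand-ins, not Hive's actual service.auth classes, and the in-memory credential map is purely illustrative — a real PAM provider would delegate to the operating system's PAM stack):

```java
import java.util.HashMap;
import java.util.Map;

public class PamStyleAuthDemo {
    // Local stand-in for a password-based authentication-provider contract:
    // return normally on success, throw on failure.
    interface PasswdAuthenticationProvider {
        void authenticate(String user, String password) throws SecurityException;
    }

    // Illustrative provider backed by an in-memory map, standing in for
    // whatever backend (PAM, LDAP, etc.) a real implementation would call.
    static final class MapBackedProvider implements PasswdAuthenticationProvider {
        private final Map<String, String> credentials = new HashMap<>();

        MapBackedProvider() {
            credentials.put("hive", "s3cret"); // hypothetical test account
        }

        @Override
        public void authenticate(String user, String password) {
            String expected = credentials.get(user);
            if (expected == null || !expected.equals(password)) {
                throw new SecurityException("Authentication failed for " + user);
            }
        }
    }

    public static void main(String[] args) {
        PasswdAuthenticationProvider provider = new MapBackedProvider();
        provider.authenticate("hive", "s3cret"); // succeeds silently
        try {
            provider.authenticate("hive", "wrong");
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A factory (like the AuthenticationProviderFactory touched in the diff) would then pick which provider implementation to instantiate based on the configured authentication mode.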
Re: Timeline for the Hive 0.13 release?
+1 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote: Can we wait for some few more days for the branching ? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue ahead right now. How about branching around Friday next week ? By then hadoop 2.3 should also be out as that vote has been concluded, and we can get HIVE-6037 in as well. -Thejas On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote: I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending tests. Brock On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote: HIVE-6037 is for generating hive-default.template file from HiveConf. Could it be included in this release? If it's not, I'll suspend further rebasing of it till next release (conflicts too frequently). 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com: I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time for the release. It's a long and growing list, though, so no promises. Feel free to do your own documentation, or hand it off to a friendly in-house writer. -- Lefty, self-appointed Hive docs maven On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com wrote: Sounds good to me. On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani hbut...@hortonworks.com wrote: Hi, Its mid feb. Wanted to check if the community is ready to cut a branch. Could we cut the branch in a week , say 5pm PST 2/21/14? The goal is to keep the release cycle short: couple of weeks; so after the branch we go into stabilizing mode for hive 0.13, checking in only blocker/critical bug fixes. regards, Harish. On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com wrote: Hi, I agree that picking a date to branch and then restricting commits to that branch would be a less time intensive plan for the RM. 
Brock On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani hbut...@hortonworks.com wrote: Yes, agree it is time to start planning for the next release. I would like to volunteer to do the release management duties for this release (it will be a great experience for me). Happy to do it, if the community is fine with this. regards, Harish. On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com wrote: Yes, I think it is time to start planning for the next release. For the 0.12 release I created a branch and then accepted patches that people asked to be included for some time, before moving to a phase of accepting only critical bug fixes. This turned out to be laborious. I think we should instead give everyone a few weeks to get any patches they are working on ready, cut the branch, and take in only critical bug fixes to the branch after that. How about cutting the branch around mid-February and targeting a release a week or two after that? Thanks, Thejas On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org wrote: I was wondering what people think about setting a tentative date for the Hive 0.13 release? At an old Hive Contrib meeting we agreed that Hive should follow a time-based release model with new releases every four months. If we follow that schedule we're due for the next release in mid-February. Thoughts? Thanks. Carl -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. 
Re: Review Request 18291: Add support for pluggable authentication modules (PAM) in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18291/ --- (Updated Feb. 21, 2014, 7:06 p.m.) Review request for hive, Mohammad Islam and Thejas Nair. Bugs: HIVE-6466 https://issues.apache.org/jira/browse/HIVE-6466 Repository: hive-git Description --- Refer the jira: https://issues.apache.org/jira/browse/HIVE-6466 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 pom.xml 9aef665 service/pom.xml b1002e2 service/src/java/org/apache/hive/service/auth/AuthenticationProviderFactory.java b92fd83 service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java PRE-CREATION Diff: https://reviews.apache.org/r/18291/diff/ Testing --- Thanks, Vaibhav Gumashta
[jira] [Updated] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive
[ https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-6466: --- Attachment: HIVE-6466.2.patch Add support for pluggable authentication modules (PAM) in Hive -- Key: HIVE-6466 URL: https://issues.apache.org/jira/browse/HIVE-6466 Project: Hive Issue Type: New Feature Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-6466.1.patch, HIVE-6466.2.patch More on PAM in these articles: http://www.tuxradar.com/content/how-pam-works https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html Usage from JPAM api: http://jpam.sourceforge.net/JPamUserGuide.html#id.s7.1 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
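For context, a hedged sketch of how this feature would be enabled in hive-site.xml once the patch lands. The property names follow the HiveConf changes in the patch; the PAM service list shown (sshd, sudo) is purely illustrative and depends on which PAM service files exist on the host:

```
<property>
  <name>hive.server2.authentication</name>
  <value>PAM</value>
</property>
<property>
  <name>hive.server2.authentication.pam.services</name>
  <value>sshd,sudo</value>
</property>
```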
[jira] [Updated] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive
[ https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-6466: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive
[ https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908694#comment-13908694 ] Xuefu Zhang commented on HIVE-6466: --- Thanks for the explanation, [~vaibhavgumashta]. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18291: Add support for pluggable authentication modules (PAM) in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18291/#review35183 --- Ship it! - Thejas Nair
[jira] [Commented] (HIVE-6466) Add support for pluggable authentication modules (PAM) in Hive
[ https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908703#comment-13908703 ] Thejas M Nair commented on HIVE-6466: - +1 Please include documentation in release notes. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18230: HIVE-6429 MapJoinKey has large memory overhead in typical cases
On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java, line 64 https://reviews.apache.org/r/18230/diff/3/?file=499109#file499109line64 This and the next field (vectorized) are going to be nigh impossible to maintain. You really care about fixed length here, is that it? We should add this to ObjectInspectorUtils or something like that. I think there's already some code in Hive that returns the size of a datatype. Sergey Shelukhin wrote: These sizes are actually the ones the DataOutput implementation below writes. The thing I care about is that serialized keys are comparable. Why would they be impossible to maintain? A different thought: Have you considered using HiveKey/BinarySortableSerDe for this? Would this create more overhead? I think that SerDe uses vInts for the datatypes you support and the result is binary comparable. There might be some fixed overhead you don't want - but if we could reuse some of that code there wouldn't be a problem of maintaining this specific key stuff. The sizes you have in the map look like the Java datatype sizes, which is why I was suggesting using the Utils. Either way - if you could move that logic (at least partly) to the proper utils, there's a better chance that someone adding/changing datatypes will catch it. On Feb. 21, 2014, 4:44 a.m., Gunther Hagleitner wrote: ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 177 https://reviews.apache.org/r/18230/diff/3/?file=499105#file499105line177 This seems ugly to me. Can't you delegate to the hashtable loader to decide which class to use? Sergey Shelukhin wrote: This is in the operator; the hashtable is already loaded, so there's no loader, and the key type is already decided. We want to ensure that 1) we use the same key type as that table, and 2) we don't re-make the decision for each key separately, again and again - but to decide based on the previous key we need the previous key. I don't understand the arguments. 
The hashtable loader is part of the operator, so you can still use it. You don't have to make the decision every time; it's up to you how you implement that in the loader. The loader knows what table it created, so it's a good place to enforce conformity, isn't it? - Gunther --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18230/#review35132 --- On Feb. 21, 2014, 6:40 p.m., Sergey Shelukhin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18230/ --- (Updated Feb. 21, 2014, 6:40 p.m.) Review request for hive, Gunther Hagleitner and Jitendra Pandey. Repository: hive-git Description --- See JIRA Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 24f1229 ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 46e37c2 ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java c0f4cd7 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 61545b5 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 2ac0928 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyObject.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 9ce0ae6 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 83ba0f0 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 47f9d21 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java 581046e ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 2466a3b ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSMBMapJoinOperator.java 997202f ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java c541ad2 
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java a103a51 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 61c5741 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java 2cb1ac3 Diff: https://reviews.apache.org/r/18230/diff/ Testing --- Thanks, Sergey Shelukhin
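The "binary comparable" property debated above can be illustrated outside Hive. The sketch below is not Hive code; it demonstrates the standard trick (the one BinarySortableSerDe uses for integers) of encoding a long big-endian with the sign bit flipped, so that unsigned byte-wise comparison of the encodings agrees with numeric comparison of the values:

```java
public class SortableLongDemo {
    // Encode a long so that unsigned byte-wise comparison of the
    // encodings matches numeric comparison of the original values.
    static byte[] encode(long v) {
        long x = v ^ Long.MIN_VALUE; // flip sign bit: negatives sort below positives
        byte[] b = new byte[8];
        for (int i = 7; i >= 0; i--) { // big-endian: most significant byte first
            b[i] = (byte) (x & 0xFF);
            x >>>= 8;
        }
        return b;
    }

    // Unsigned lexicographic comparison of two 8-byte encodings.
    static int compareBytes(byte[] a, byte[] b) {
        for (int i = 0; i < 8; i++) {
            int ai = a[i] & 0xFF, bi = b[i] & 0xFF;
            if (ai != bi) return ai < bi ? -1 : 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        long[] vals = {Long.MIN_VALUE, -42, -1, 0, 1, 42, Long.MAX_VALUE};
        for (int i = 0; i + 1 < vals.length; i++) {
            // byte order must agree with numeric order for every adjacent pair
            if (compareBytes(encode(vals[i]), encode(vals[i + 1])) >= 0)
                throw new AssertionError("ordering broken at " + vals[i]);
        }
        System.out.println("byte order matches numeric order");
    }
}
```

Without the sign-bit flip, two's-complement negatives would compare above positives byte-wise, which is exactly the kind of subtlety a hand-rolled key format has to get right.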
[jira] [Commented] (HIVE-5155) Support secure proxy user access to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908714#comment-13908714 ] Thejas M Nair commented on HIVE-5155: - Prasad, It would be great to get this patch in for the 0.13 release. I think just the issue of the proxy user config parameter needs to be addressed, i.e., having a specific config for HS2 proxy privileges so that the user does not have to be made an HDFS/MR-wide proxy user. Support secure proxy user access to HiveServer2 --- Key: HIVE-5155 URL: https://issues.apache.org/jira/browse/HIVE-5155 Project: Hive Issue Type: Improvement Components: Authentication, HiveServer2, JDBC Affects Versions: 0.12.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-5155-1-nothrift.patch, HIVE-5155-noThrift.2.patch, HIVE-5155-noThrift.4.patch, HIVE-5155-noThrift.5.patch, HIVE-5155-noThrift.6.patch, HIVE-5155.1.patch, HIVE-5155.2.patch, HIVE-5155.3.patch, ProxyAuth.java, ProxyAuth.out, TestKERBEROS_Hive_JDBC.java HiveServer2 can authenticate a client via Kerberos and impersonate the connecting user on the underlying secure Hadoop cluster. This makes it a gateway for a remote client to access a secure Hadoop cluster. This works fine when the client obtains a Kerberos ticket and directly connects to HiveServer2. There's another big use case for middleware tools, where the end user wants to access Hive via another server: for example, an Oozie action, Hue submitting queries, or a BI tool server accessing HiveServer2. In these cases, the third-party server doesn't have the end user's Kerberos credentials and hence can't submit queries to HiveServer2 on behalf of the end user. This ticket is for enabling proxy access to HiveServer2 for third-party tools on behalf of end users. There are two parts to the solution proposed in this ticket: 1) Delegation token based connection for Oozie (OOZIE-1457) This is the common mechanism for Hadoop ecosystem components. 
Hive Remote Metastore and HCatalog already support this. This is suitable for tool like Oozie that submits the MR jobs as actions on behalf of its client. Oozie already uses similar mechanism for Metastore/HCatalog access. 2) Direct proxy access for privileged hadoop users The delegation token implementation can be a challenge for non-hadoop (especially non-java) components. This second part enables a privileged user to directly specify an alternate session user during the connection. If the connecting user has hadoop level privilege to impersonate the requested userid, then HiveServer2 will run the session as that requested user. For example, user Hue is allowed to impersonate user Bob (via core-site.xml proxy user configuration). Then user Hue can connect to HiveServer2 and specify Bob as session user via a session property. HiveServer2 will verify Hue's proxy user privilege and then impersonate user Bob instead of Hue. This will enable any third party tool to impersonate alternate userid without having to implement delegation token connection. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
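To illustrate part 2 above: the alternate session user is passed as a session property on the JDBC connection URL. The sketch below is illustrative; the host name and Kerberos principal are placeholders, and the property name follows the patch's session-conf approach:

```
jdbc:hive2://hs2host:10000/default;principal=hive/hs2host@EXAMPLE.COM;hive.server2.proxy.user=bob
```

Here the connecting (Kerberos-authenticated) user would be Hue's service principal, and HiveServer2 runs the session as bob after checking Hue's proxy privilege.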
[jira] [Commented] (HIVE-5176) Wincompat : Changes for allowing various path compatibilities with Windows
[ https://issues.apache.org/jira/browse/HIVE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908737#comment-13908737 ] Jason Dere commented on HIVE-5176: -- Taking a closer look at this patch, it's a mix of several patches done for Windows work. I'm going to try to split this into smaller patches, each with a specific change. Wincompat : Changes for allowing various path compatibilities with Windows -- Key: HIVE-5176 URL: https://issues.apache.org/jira/browse/HIVE-5176 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5176.2.patch, HIVE-5176.patch We need to make certain changes across the board to allow us to read/parse Windows paths. Some are escaping changes, some are being strict about how we read paths (through URL.encode/decode, etc.) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6480) Metastore server startup script ignores ENV settings
Adam Faris created HIVE-6480: Summary: Metastore server startup script ignores ENV settings Key: HIVE-6480 URL: https://issues.apache.org/jira/browse/HIVE-6480 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Adam Faris Priority: Minor This is a minor issue with hcat_server.sh. Currently the startup script has HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand edit the script. As hcat_server.sh reads hcat-env.sh, it makes sense to allow an administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or other location like /etc/profile). Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh. If METASTORE_PORT is missing, the metastore server fails to start. I will attach a patch in my next update, once this jira is opened. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6480) Metastore server startup script ignores ENV settings
[ https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Faris updated HIVE-6480: - Attachment: HIVE-6480.01.patch Attaching 'git diff' output against trunk. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6480) Metastore server startup script ignores ENV settings
[ https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Faris updated HIVE-6480: - Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
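A minimal sketch of the kind of fix this JIRA describes (I have not seen the attached patch; the 9083 default port is an assumption matching the usual metastore default): use the shell's default-value expansion so hcat_server.sh honors values already exported from hcat-env.sh or /etc/profile instead of hardcoding them.

```shell
#!/bin/sh
# Respect HADOOP_HEAPSIZE if already set in the environment
# (e.g. from hcat-env.sh); otherwise fall back to the old hardcoded default.
HADOOP_HEAPSIZE=${HADOOP_HEAPSIZE:-2048}

# Give METASTORE_PORT a default so startup no longer fails when it is unset.
# 9083 is an assumed default here.
METASTORE_PORT=${METASTORE_PORT:-9083}

echo "heap=${HADOOP_HEAPSIZE} port=${METASTORE_PORT}"
```

With nothing exported, this prints the fallbacks; an administrator's exported values win without any script editing.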
Re: Timeline for the Hive 0.13 release?
Yes, makes sense. How about we postpone the branching until 10am PST March 3rd, which is the following Monday? I don't see a point in setting the branch time to a Friday evening. Do people agree? regards, Harish. On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote: +1 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote: Can we wait a few more days for the branching? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue ahead right now. How about branching around Friday next week? By then Hadoop 2.3 should also be out, as that vote has concluded, and we can get HIVE-6037 in as well. -Thejas
Re: Timeline for the Hive 0.13 release?
Might as well make it March 4th or 5th. Otherwise folks will burn weekend time to get patches in. On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com wrote: Yes, makes sense. How about we postpone the branching until 10am PST March 3rd, which is the following Monday? I don't see a point in setting the branch time to a Friday evening. Do people agree? regards, Harish.
[jira] [Commented] (HIVE-6479) Few .q.out files need to be updated post HIVE-5958
[ https://issues.apache.org/jira/browse/HIVE-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908800#comment-13908800 ] Thejas M Nair commented on HIVE-6479: - +1. I don't think we need to wait for the full unit test suite to kick in, or for 24 hours, for this one, as it just updates 2 .q.out files. I will commit it after verifying that these two tests pass with this change. Few .q.out files need to be updated post HIVE-5958 -- Key: HIVE-6479 URL: https://issues.apache.org/jira/browse/HIVE-6479 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6479.patch See my comment https://issues.apache.org/jira/browse/HIVE-6433?focusedCommentId=13907782page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13907782 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization
[ https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908801#comment-13908801 ] Hive QA commented on HIVE-6455: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12630239/HIVE-6455.6.patch {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 5170 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into5 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into6 org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input7 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input9 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample7 org.apache.hadoop.hive.ql.parse.TestParse.testParse_union {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1436/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1436/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing 
org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12630239 Scalable dynamic partitioning and bucketing optimization Key: HIVE-6455 URL: https://issues.apache.org/jira/browse/HIVE-6455 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: optimization Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.2.patch, HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, HIVE-6455.6.patch The current implementation of dynamic partitioning works by keeping at least one record writer open per dynamic partition directory. In the case of bucketing there can be multispray file writers, which further adds to the number of open record writers. The record writers of column-oriented file formats (such as ORC, RCFile, etc.) keep in-memory buffers (value buffers or compression buffers) open all the time to buffer up rows and compress them before flushing to disk. Since these buffers are maintained on a per-column basis, the amount of memory required at runtime increases with the number of partitions and the number of columns per partition. This often leads to OutOfMemory (OOM) exceptions in mappers or reducers, depending on the number of open record writers. Users often tune the JVM heap size (runtime memory) to get past such OOM issues. With this optimization, the dynamic partition columns and bucketing columns (in the case of bucketed tables) are sorted before being fed to the reducers. Since the partitioning and bucketing columns are sorted, each reducer can keep only one record writer open at any time, thereby reducing memory pressure on the reducers. 
This optimization remains scalable as the number of partitions and the number of columns per partition increase, at the cost of sorting those columns. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
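The memory effect described above can be sketched with a small hypothetical simulation (not Hive code): a record writer must stay open from a partition key's first row to its last, so sorted input lets each reducer hold only one writer at a time, while round-robin input forces one open writer per distinct partition.

```python
def peak_open_writers(rows):
    """A writer for a partition key must stay open from the key's first
    row to its last; return the maximum number of overlapping spans."""
    first, last = {}, {}
    for i, key in enumerate(rows):
        first.setdefault(key, i)
        last[key] = i
    peak = open_now = 0
    for i, key in enumerate(rows):
        if first[key] == i:
            open_now += 1
        peak = max(peak, open_now)
        if last[key] == i:
            open_now -= 1
    return peak

unsorted = ["p1", "p2", "p3", "p1", "p2", "p3"]   # round-robin arrival
print(peak_open_writers(unsorted))          # 3: one writer per partition
print(peak_open_writers(sorted(unsorted)))  # 1: a single writer at a time
```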
[jira] [Commented] (HIVE-6480) Metastore server startup script ignores ENV settings
[ https://issues.apache.org/jira/browse/HIVE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908807#comment-13908807 ] Adam Faris commented on HIVE-6480: -- Reviewboard link https://reviews.apache.org/r/18373/ Metastore server startup script ignores ENV settings Key: HIVE-6480 URL: https://issues.apache.org/jira/browse/HIVE-6480 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Adam Faris Priority: Minor Attachments: HIVE-6480.01.patch This is a minor issue with hcat_server.sh. Currently the startup script has HADOOP_HEAPSIZE hardcoded to 2048, causing administrators to hand-edit the script. As hcat_server.sh reads hcat-env.sh, it makes sense to allow an administrator to define HADOOP_HEAPSIZE in hcat-env.sh (or another location such as /etc/profile). Secondly, there is no defined default for METASTORE_PORT in hcat_server.sh. If METASTORE_PORT is missing, the metastore server fails to start. I will attach a patch in my next update, once this jira is opened. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
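The fix being requested is the standard respect-the-environment default pattern (in the shell script itself this would be the `${HADOOP_HEAPSIZE:-2048}` idiom); sketched here in Python for illustration:

```python
import os

def resolve_heapsize(default="2048"):
    # Honor an administrator-provided value (e.g. exported from
    # hcat-env.sh or /etc/profile); fall back to the default otherwise.
    return os.environ.get("HADOOP_HEAPSIZE", default)

os.environ.pop("HADOOP_HEAPSIZE", None)
print(resolve_heapsize())    # 2048: nothing exported, the default wins
os.environ["HADOOP_HEAPSIZE"] = "4096"
print(resolve_heapsize())    # 4096: the environment wins
```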
[jira] [Comment Edited] (HIVE-6479) Few .q.out files need to be updated post HIVE-5958
[ https://issues.apache.org/jira/browse/HIVE-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908806#comment-13908806 ] Thejas M Nair edited comment on HIVE-6479 at 2/21/14 8:59 PM: -- Verified that the 2 tests pass with this .q.out file update. I am planning to commit this in another 1/2 hour. It will get rid of false alarms for the pending precommit tests. was (Author: thejas): Verified that the 2 tests pass with this .q.out file update. I am planning to commit this in another 1/2 hour. Few .q.out files need to be updated post HIVE-5958 -- Key: HIVE-6479 URL: https://issues.apache.org/jira/browse/HIVE-6479 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6479.patch See my comment https://issues.apache.org/jira/browse/HIVE-6433?focusedCommentId=13907782page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13907782 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried
[ https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6464: Resolution: Fixed Status: Resolved (was: Patch Available) The two test failures are unrelated. See HIVE-6479. Patch committed to trunk. Thanks for the review [~navis]! Test configuration: reduce the duration for which lock attempts are retried --- Key: HIVE-6464 URL: https://issues.apache.org/jira/browse/HIVE-6464 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6464.1.patch Lock attempts are retried for 60 seconds * 100 before giving up. Most tests attempt to disable locking but sometimes don't do it correctly, and changes can cause the locking to kick in. Locking fails (at least in the HS2-related tests) because of problems in creating the ZooKeeper entries in test mode. When a locking attempt kicks in and fails, it can end up waiting for 6000 seconds before failing. As the tests are not trying to test parallel locking, there is no reason to wait this long in the tests. We should update the hive-site.xml used by tests to use a smaller duration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried
[ https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6464: Fix Version/s: 0.13.0 Test configuration: reduce the duration for which lock attempts are retried --- Key: HIVE-6464 URL: https://issues.apache.org/jira/browse/HIVE-6464 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 Attachments: HIVE-6464.1.patch Lock attempts are retried for 60 seconds * 100 before giving up. Most tests attempt to disable locking but sometimes don't do it correctly, and changes can cause the locking to kick in. Locking fails (at least in the HS2-related tests) because of problems in creating the ZooKeeper entries in test mode. When a locking attempt kicks in and fails, it can end up waiting for 6000 seconds before failing. As the tests are not trying to test parallel locking, there is no reason to wait this long in the tests. We should update the hive-site.xml used by tests to use a smaller duration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
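The change described above amounts to shrinking the retry budget in the test hive-site.xml. A sketch of such an override (property names follow Hive's locking configuration; the values are illustrative, not necessarily what the patch chose):

```xml
<!-- Illustrative test-only override: 5 retries x 1 s instead of 100 x 60 s -->
<property>
  <name>hive.lock.numretries</name>
  <value>5</value>
</property>
<property>
  <name>hive.lock.sleep.between.retries</name>
  <value>1</value>
</property>
```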
Re: Timeline for the Hive 0.13 release?
Ok, let's set it for March 4th. regards, Harish. On Feb 21, 2014, at 12:14 PM, Brock Noland br...@cloudera.com wrote: Might as well make it March 4th or 5th. Otherwise folks will burn weekend time to get patches in. On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com wrote: Yes, makes sense. How about we postpone the branching until 10am PST March 3rd, which is the following Monday. Don't see a point of setting the branch time to a Friday evening. Do people agree? regards, Harish. On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote: +1 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote: Can we wait a few more days for the branching? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue right now. How about branching around Friday next week? By then hadoop 2.3 should also be out, as that vote has concluded, and we can get HIVE-6037 in as well. -Thejas On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote: I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending tests. Brock On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote: HIVE-6037 is for generating the hive-default.template file from HiveConf. Could it be included in this release? If it's not, I'll suspend further rebasing of it till the next release (it conflicts too frequently). 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com: I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time for the release. It's a long and growing list, though, so no promises. Feel free to do your own documentation, or hand it off to a friendly in-house writer. -- Lefty, self-appointed Hive docs maven On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com wrote: Sounds good to me. On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani hbut...@hortonworks.com wrote: Hi, It's mid-Feb. Wanted to check if the community is ready to cut a branch. 
Could we cut the branch in a week, say 5pm PST 2/21/14? The goal is to keep the release cycle short: a couple of weeks; so after the branch we go into stabilizing mode for Hive 0.13, checking in only blocker/critical bug fixes. regards, Harish. On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com wrote: Hi, I agree that picking a date to branch and then restricting commits to that branch would be a less time-intensive plan for the RM. Brock On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani hbut...@hortonworks.com wrote: Yes, agree it is time to start planning for the next release. I would like to volunteer to do the release management duties for this release (it will be a great experience for me). Will be happy to do it, if the community is fine with this. regards, Harish. On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com wrote: Yes, I think it is time to start planning for the next release. For the 0.12 release I created a branch and then accepted patches that people asked to be included for some time, before moving to a phase of accepting only critical bug fixes. This turned out to be laborious. I think we should instead give everyone a few weeks to get any patches they are working on ready, cut the branch, and take in only critical bug fixes to the branch after that. How about cutting the branch around mid-February and targeting a release a week or two after that. Thanks, Thejas On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org wrote: I was wondering what people think about setting a tentative date for the Hive 0.13 release? At an old Hive Contrib meeting we agreed that Hive should follow a time-based release model with new releases every four months. If we follow that schedule we're due for the next release in mid-February. Thoughts? Thanks. 
Carl -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Created] (HIVE-6481) Add .reviewboardrc file
Carl Steinbach created HIVE-6481: Summary: Add .reviewboardrc file Key: HIVE-6481 URL: https://issues.apache.org/jira/browse/HIVE-6481 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach We should add a .reviewboardrc file to trunk in order to streamline the review process. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
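For context, a .reviewboardrc is a small RBTools configuration file read by commands like `rbt post`. A minimal sketch of what such a file typically contains (the exact values committed to Hive's trunk may differ):

```
# .reviewboardrc -- read by RBTools; values here are illustrative
REPOSITORY = "hive-git"
TARGET_GROUPS = "hive"
GUESS_FIELDS = True
```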
[jira] [Updated] (HIVE-6482) Fix NOTICE file: pre release task
[ https://issues.apache.org/jira/browse/HIVE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6482: Attachment: HIVE-6482.1.patch Fix NOTICE file: pre release task - Key: HIVE-6482 URL: https://issues.apache.org/jira/browse/HIVE-6482 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Attachments: HIVE-6482.1.patch As per steps in Release doc: https://cwiki.apache.org/confluence/display/Hive/HowToRelease Removed projects with Apache license as per [~thejas] suggestion. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6482) Fix NOTICE file: pre release task
[ https://issues.apache.org/jira/browse/HIVE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908884#comment-13908884 ] Thejas M Nair commented on HIVE-6482: - All these libraries except for jersey are under MIT/BSD license or derivatives. The only one that I am not sure of is the jersey library license (CDDL); it's too long to read. Might as well put it in the NOTICE section to be safe. An interesting note: the JSON license also adds that The Software shall be used for Good, not Evil !! :) Fix NOTICE file: pre release task - Key: HIVE-6482 URL: https://issues.apache.org/jira/browse/HIVE-6482 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Attachments: HIVE-6482.1.patch As per steps in Release doc: https://cwiki.apache.org/confluence/display/Hive/HowToRelease Removed projects with Apache license as per [~thejas] suggestion. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6482) Fix NOTICE file: pre release task
[ https://issues.apache.org/jira/browse/HIVE-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13908887#comment-13908887 ] Thejas M Nair commented on HIVE-6482: - I mean to say that we can remove all the notices from the NOTICE file except for maybe jersey. If someone can verify that it also does not have such an attribution required, we can remove that from NOTICE file as well. cc [~cwsteinbach] Fix NOTICE file: pre release task - Key: HIVE-6482 URL: https://issues.apache.org/jira/browse/HIVE-6482 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Attachments: HIVE-6482.1.patch As per steps in Release doc: https://cwiki.apache.org/confluence/display/Hive/HowToRelease Removed projects with Apache license as per [~thejas] suggestion. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6468) HS2 out of memory error when curl sends a get request
[ https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abin Shahab updated HIVE-6468: -- Description: We see an out of memory error when we run simple beeline calls. (The hive.server2.transport.mode is binary) curl localhost:1 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap space at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) was: We see an out of memory error when we run simple beeline calls. 
(The hive.server2.transport.mode is binary) beeline -u jdbc:hive2://localhost:1 -n user1 -d org.apache.hive.jdbc.HiveDriver -e create table test1 (id) int; Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap space at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Summary: HS2 out of memory error when curl sends a get request (was: HS2 out of memory error with Beeline) HS2 out of memory error when curl sends a get request - Key: HIVE-6468 URL: https://issues.apache.org/jira/browse/HIVE-6468 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Environment: Centos 6.3, hive 12, hadoop-2.2 Reporter: Abin Shahab We see an out of memory error when we run simple beeline calls. 
(The hive.server2.transport.mode is binary) curl localhost:1 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap space at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
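The failure mode in the stack trace above is a classic one for length-prefixed wire protocols: bytes of a plain-HTTP request get interpreted as a frame length, and the server tries to allocate a buffer of that size. A small sketch of the arithmetic (illustrative, not the actual Thrift code path):

```python
import struct

def frame_length(prefix: bytes) -> int:
    # A framed/SASL transport reads a 4-byte big-endian length prefix
    # and then allocates a buffer of that size for the message body.
    return struct.unpack(">I", prefix)[0]

# curl's request starts with "GET "; read as a length prefix, that is
# an allocation request of roughly 1.1 GB -- hence the immediate OOM.
print(frame_length(b"GET "))  # 1195725856
```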
[jira] [Updated] (HIVE-6461) Run Release Audit tool, fix missing license issues
[ https://issues.apache.org/jira/browse/HIVE-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6461: Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks Harish! Run Release Audit tool, fix missing license issues -- Key: HIVE-6461 URL: https://issues.apache.org/jira/browse/HIVE-6461 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Fix For: 0.13.0 Attachments: HIVE-6461.1.patch run mvn apache-rat:check and add apache license in flagged files. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-3635) allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type
[ https://issues.apache.org/jira/browse/HIVE-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-3635: -- Status: Patch Available (was: Open) allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type --- Key: HIVE-3635 URL: https://issues.apache.org/jira/browse/HIVE-3635 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.9.0 Reporter: Alexander Alten-Lorenz Assignee: Xuefu Zhang Attachments: HIVE-3635.1.patch, HIVE-3635.patch interpret t as true and f as false for boolean types. PostgreSQL exports represent it that way. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-3635) allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type
[ https://issues.apache.org/jira/browse/HIVE-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-3635: -- Attachment: HIVE-3635.1.patch Patch #1 is based on #0, but provides configuration parameter and test. allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type --- Key: HIVE-3635 URL: https://issues.apache.org/jira/browse/HIVE-3635 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.9.0 Reporter: Alexander Alten-Lorenz Assignee: Xuefu Zhang Attachments: HIVE-3635.1.patch, HIVE-3635.patch interpret t as true and f as false for boolean types. PostgreSQL exports represent it that way. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
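A sketch in Python of the lenient parsing being added, gated behind an off-by-default flag as the patch description indicates (names here are illustrative, not Hive's actual configuration):

```python
_TRUE, _FALSE = {"t", "1"}, {"f", "0"}

def parse_boolean(text, extended_literals=False):
    """Return True/False, or None for unparseable values
    (Hive surfaces those as NULL)."""
    s = text.strip().lower()
    if s in ("true", "false"):
        return s == "true"
    if extended_literals and s in _TRUE:
        return True
    if extended_literals and s in _FALSE:
        return False
    return None

print(parse_boolean("t"))                          # None: off by default
print(parse_boolean("T", extended_literals=True))  # True
print(parse_boolean("0", extended_literals=True))  # False
```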
[jira] [Created] (HIVE-6483) Hive on Tez - Hive should create different payloads for inputs and outputs
Bikas Saha created HIVE-6483: Summary: Hive on Tez - Hive should create different payloads for inputs and outputs Key: HIVE-6483 URL: https://issues.apache.org/jira/browse/HIVE-6483 Project: Hive Issue Type: Bug Reporter: Bikas Saha Currently, Hive creates a single vertex payload that is implicitly shared with Inputs and Outputs. This creates confusion in the Tez API and configuration. Tracked by TEZ-696 and TEZ-872 which are blocked by this jira. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18377: HIVE-6481. Add .reviewboardrc file
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18377/#review35201 --- Looks good. Minor question: do we need to put an Apache license header in this new file? - Xuefu Zhang On Feb. 21, 2014, 9:36 p.m., Carl Steinbach wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18377/ --- (Updated Feb. 21, 2014, 9:36 p.m.) Review request for hive and Ashutosh Chauhan. Bugs: HIVE-6481 https://issues.apache.org/jira/browse/HIVE-6481 Repository: hive-git Description --- HIVE-6481. Add .reviewboardrc file Diffs - .reviewboardrc PRE-CREATION Diff: https://reviews.apache.org/r/18377/diff/ Testing --- I was able to post this review request with the command rbt post after committing my changes locally. Thanks, Carl Steinbach
[jira] [Updated] (HIVE-5950) ORC SARG creation fails with NPE for predicate conditions with decimal/date/char/varchar datatypes
[ https://issues.apache.org/jira/browse/HIVE-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5950: - Attachment: HIVE-5950.4.patch Refreshed the patch to trunk. ORC SARG creation fails with NPE for predicate conditions with decimal/date/char/varchar datatypes -- Key: HIVE-5950 URL: https://issues.apache.org/jira/browse/HIVE-5950 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-5950.1.patch, HIVE-5950.2.patch, HIVE-5950.3.patch, HIVE-5950.4.patch When a decimal or date column is used, the type field in PredicateLeafImpl will be set to null. This results in an NPE during predicate leaf generation because of null dereferencing in the hashcode computation. SARG creation should be extended to support/handle the decimal and date data types. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
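The NPE pattern described above, dereferencing a field that is null for unhandled types inside a hash computation, and the usual null-safe guard can be sketched as follows (illustrative, not the PredicateLeafImpl source):

```python
class PredicateLeaf:
    def __init__(self, operator, type_, column):
        self.operator = operator
        self.type = type_   # None for types SARG creation doesn't handle yet
        self.column = column

    def unsafe_hash(self):
        # Mirrors Java's type.hashCode(): fails when type is null/None.
        return hash((self.operator, self.type.upper(), self.column))

    def safe_hash(self):
        # Null-safe guard, as (type == null ? 0 : type.hashCode()) in Java.
        t = 0 if self.type is None else hash(self.type.upper())
        return hash((self.operator, t, self.column))

leaf = PredicateLeaf("EQUALS", None, "d")   # e.g. a decimal column pre-fix
try:
    leaf.unsafe_hash()
except AttributeError as e:
    print("NPE-equivalent:", e)   # None has no attribute 'upper'
print(isinstance(leaf.safe_hash(), int))    # True
```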
Re: Review Request 18185: Support Kerberos HTTP authentication for HiveServer2 running in http mode
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18185/#review35196 --- jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java https://reviews.apache.org/r/18185/#comment65649 I believe using the hadoop classes here will require hadoop-common*jar also to be copied to jdbc client machine. We should move the use of these classes to a different class, that would get used only when jdbc+kerberos is used. jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java https://reviews.apache.org/r/18185/#comment65650 the flow will be clearer with an else-if instead of if. jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java https://reviews.apache.org/r/18185/#comment65652 If useSsl == true, I think we should throw an exception (with http-kerberos) with appropriate error message. Otherwise people would have a false sense of security of ssl being used. jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java https://reviews.apache.org/r/18185/#comment65653 can you get rid of these trailing white spaces - Thejas Nair On Feb. 17, 2014, 9:24 a.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18185/ --- (Updated Feb. 17, 2014, 9:24 a.m.) Review request for hive and Thejas Nair. 
Bugs: HIVE-4764 https://issues.apache.org/jira/browse/HIVE-4764 Repository: hive-git Description --- Support Kerberos HTTP authentication for HiveServer2 running in http mode Diffs - jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 13fc19b jdbc/src/java/org/apache/hive/jdbc/HttpBasicAuthInterceptor.java 66eba1b jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java PRE-CREATION service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java PRE-CREATION service/src/java/org/apache/hive/service/auth/HttpAuthenticationException.java PRE-CREATION service/src/java/org/apache/hive/service/cli/CLIService.java 56b357a service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java 6fbc847 service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java a6ff6ce service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java e77f043 Diff: https://reviews.apache.org/r/18185/diff/ Testing --- Thanks, Vaibhav Gumashta
Review Request 18382: HIVE-3635: allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18382/ --- Review request for hive. Bugs: HIVE-3635 https://issues.apache.org/jira/browse/HIVE-3635 Repository: hive-git Description --- 1. Implemented the functionality, allowing LazyBoolean to accept these literals. 2. Added a configuration to control the functionality. Off by default. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 237b669 conf/hive-default.xml.template f7f50e3 ql/src/test/queries/clientpositive/bool_literal.q PRE-CREATION ql/src/test/results/clientpositive/bool_literal.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBoolean.java c741c3a serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java 66f79ed serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 606208c serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBooleanObjectInspector.java 2cf7362 serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java 5f64697 Diff: https://reviews.apache.org/r/18382/diff/ Testing --- Added a .q test which exercises the behavior with the functionality turned both on and off. Thanks, Xuefu Zhang
Re: Review Request 18382: HIVE-3635: allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18382/ --- (Updated Feb. 21, 2014, 11:09 p.m.) Review request for hive. Bugs: HIVE-3635 https://issues.apache.org/jira/browse/HIVE-3635 Repository: hive-git Description --- 1. Implemented the functionality, allowing LazyBoolean to accept these literals. 2. Added a configuration to control the functionality. Off by default. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 237b669 conf/hive-default.xml.template f7f50e3 data/files/bool_literal.txt PRE-CREATION ql/src/test/queries/clientpositive/bool_literal.q PRE-CREATION ql/src/test/results/clientpositive/bool_literal.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBoolean.java c741c3a serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java 66f79ed serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 606208c serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBooleanObjectInspector.java 2cf7362 serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java 5f64697 Diff: https://reviews.apache.org/r/18382/diff/ Testing --- Added a .q test which exercises the behavior with the functionality turned both on and off. Thanks, Xuefu Zhang
[jira] [Updated] (HIVE-3635) allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type
[ https://issues.apache.org/jira/browse/HIVE-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-3635: -- Attachment: HIVE-3635.2.patch Patch #2 removed trailing spaces and added missing data file. allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type --- Key: HIVE-3635 URL: https://issues.apache.org/jira/browse/HIVE-3635 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.9.0 Reporter: Alexander Alten-Lorenz Assignee: Xuefu Zhang Attachments: HIVE-3635.1.patch, HIVE-3635.2.patch, HIVE-3635.patch interpret t as true and f as false for boolean types. PostgreSQL exports represent it that way. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6380) Specify jars/files when creating permanent UDFs
[ https://issues.apache.org/jira/browse/HIVE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909018#comment-13909018 ] Hive QA commented on HIVE-6380: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12630225/HIVE-6380.4.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5175 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into5 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_insert_into6 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1437/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1437/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12630225 Specify jars/files when creating permanent UDFs --- Key: HIVE-6380 URL: https://issues.apache.org/jira/browse/HIVE-6380 Project: Hive Issue Type: Sub-task Components: UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6380.1.patch, HIVE-6380.2.patch, HIVE-6380.3.patch, HIVE-6380.4.patch Need a way for a permanent UDF to reference jars/files. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6380) Specify jars/files when creating permanent UDFs
[ https://issues.apache.org/jira/browse/HIVE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909026#comment-13909026 ] Jason Dere commented on HIVE-6380: -- It looks like those 2 failures are due to HIVE-6479 and had been failing for the last several precommit tests. Specify jars/files when creating permanent UDFs --- Key: HIVE-6380 URL: https://issues.apache.org/jira/browse/HIVE-6380 Project: Hive Issue Type: Sub-task Components: UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6380.1.patch, HIVE-6380.2.patch, HIVE-6380.3.patch, HIVE-6380.4.patch Need a way for a permanent UDF to reference jars/files. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6484) HiveServer2 doAs should be session aware both for secured and unsecured session implementation.
Vaibhav Gumashta created HIVE-6484: -- Summary: HiveServer2 doAs should be session aware both for secured and unsecured session implementation. Key: HIVE-6484 URL: https://issues.apache.org/jira/browse/HIVE-6484 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Currently, in the unsecured case, doAs is performed by decorating the TProcessor.process method. This has been causing cleanup issues, as we end up creating a new clientUgi for each request rather than once per session. This change also cleans up the code. [~thejas] Probably you can add more if you've seen other issues related to this. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
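The per-session reuse described above can be sketched with a toy cache. This is an assumption-laden stand-in — the plain Object below represents the expensive-to-create clientUgi, and the class is not Hive's actual SessionManager code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy sketch of the HIVE-6484 idea: create the user identity once per
// session, not once per request. Object stands in for a clientUgi.
public class SessionUgiCache {
    private final Map<String, Object> ugiBySession = new ConcurrentHashMap<>();
    private int creations = 0;

    public Object ugiFor(String sessionId) {
        return ugiBySession.computeIfAbsent(sessionId, id -> {
            creations++;          // counts expensive identity creations
            return new Object();  // stands in for creating a clientUgi
        });
    }

    public int creations() { return creations; }
}
```

Repeated requests on the same session then reuse one identity, which also makes cleanup at session close straightforward.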
[jira] [Updated] (HIVE-6380) Specify jars/files when creating permanent UDFs
[ https://issues.apache.org/jira/browse/HIVE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6380: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Jason! Specify jars/files when creating permanent UDFs --- Key: HIVE-6380 URL: https://issues.apache.org/jira/browse/HIVE-6380 Project: Hive Issue Type: Sub-task Components: UDF Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.13.0 Attachments: HIVE-6380.1.patch, HIVE-6380.2.patch, HIVE-6380.3.patch, HIVE-6380.4.patch Need a way for a permanent UDF to reference jars/files. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6484) HiveServer2 doAs should be session aware both for secured and unsecured session implementation.
[ https://issues.apache.org/jira/browse/HIVE-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909041#comment-13909041 ] Thejas M Nair commented on HIVE-6484: - The FS instance leak described in HIVE-4501 can be fixed with this change. HiveServer2 doAs should be session aware both for secured and unsecured session implementation. --- Key: HIVE-6484 URL: https://issues.apache.org/jira/browse/HIVE-6484 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Currently in unsecured case, the doAs is performed by decorating TProcessor.process method. This has been causing cleanup issues as we end up creating a new clientUgi for each request rather than for each session. This also cleans up the code. [~thejas] Probably you can add more if you've seen other issues related to this. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6439) Introduce CBO step in Semantic Analyzer
[ https://issues.apache.org/jira/browse/HIVE-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909040#comment-13909040 ] Ashutosh Chauhan commented on HIVE-6439: Is the plan to get the CBO optimizer working in the 0.13 timeframe? If not, I think we may want to delay this until after branching 0.13. Otherwise, this will go into the 0.13 release with a config which doesn't do anything and thus will be confusing for end users. Introduce CBO step in Semantic Analyzer --- Key: HIVE-6439 URL: https://issues.apache.org/jira/browse/HIVE-6439 Project: Hive Issue Type: Sub-task Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6439.1.patch, HIVE-6439.2.patch, HIVE-6439.4.patch, HIVE-6439.5.patch This patch introduces a CBO step in SemanticAnalyzer. For now the CostBasedOptimizer is an empty shell. The contract between SemAly and CBO is: - The CBO step is controlled by the 'hive.enable.cbo.flag'. - When true, Hive SemAly will hand CBO a Hive Operator tree (with operators annotated with stats). If it can, CBO will return a better plan in Hive AST form. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5176) Wincompat : Changes for allowing various path compatibilities with Windows
[ https://issues.apache.org/jira/browse/HIVE-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5176: - Attachment: HIVE-5176.3.patch patch v3. Pulled out many changes which look like they do the equivalent of HIVE-6343, will add a patch to that Jira. Patch should now be pretty equivalent to HIVE-4448. Wincompat : Changes for allowing various path compatibilities with Windows -- Key: HIVE-5176 URL: https://issues.apache.org/jira/browse/HIVE-5176 Project: Hive Issue Type: Sub-task Components: Windows Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5176.2.patch, HIVE-5176.3.patch, HIVE-5176.patch We need to make certain changes across the board to allow us to read/parse windows paths. Some are escaping changes, some are being strict about how we read paths (through URL.encode/decode, etc) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6434) Restrict function create/drop to admin roles
[ https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6434: - Attachment: HIVE-6434.3.patch rebased with trunk - patch v3 Restrict function create/drop to admin roles Key: HIVE-6434 URL: https://issues.apache.org/jira/browse/HIVE-6434 Project: Hive Issue Type: Sub-task Components: Authorization, UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6434) Restrict function create/drop to admin roles
[ https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6434: - Release Note: Restrict function create/drop to admin roles, if sql std auth is enabled. This would include temp/permanent functions, as well as macros. was: Restrict function create/drop to admin roles, if sql std auth is enabled. This would include temp/permanent functions, as well as macros. NO PRECOMMIT TESTS - dependent on HIVE-6330. Restrict function create/drop to admin roles Key: HIVE-6434 URL: https://issues.apache.org/jira/browse/HIVE-6434 Project: Hive Issue Type: Sub-task Components: Authorization, UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18162: HIVE-6434: Restrict function create/drop to admin roles
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18162/ --- (Updated Feb. 22, 2014, 12:54 a.m.) Review request for hive and Thejas Nair. Changes --- Only restrict create/drop of metastore functions. temp functions/macros not affected. Bugs: HIVE-6434 https://issues.apache.org/jira/browse/HIVE-6434 Repository: hive-git Description --- Add output entity of DB object to make sure only admin roles can add/drop functions/macros. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 68a25e0 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java 7dfd574 ql/src/test/queries/clientnegative/authorization_create_func1.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_create_func1.q PRE-CREATION ql/src/test/results/clientnegative/authorization_create_func1.q.out PRE-CREATION ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 393a3e8 ql/src/test/results/clientnegative/create_function_nonexistent_db.q.out ebb069e ql/src/test/results/clientnegative/create_function_nonudf_class.q.out dd66afc ql/src/test/results/clientnegative/udf_local_resource.q.out b6ea77d ql/src/test/results/clientnegative/udf_nonexistent_resource.q.out ad70d54 ql/src/test/results/clientpositive/authorization_create_func1.q.out PRE-CREATION ql/src/test/results/clientpositive/create_func1.q.out 5a249c3 ql/src/test/results/clientpositive/udf_using.q.out 69e5f3b Diff: https://reviews.apache.org/r/18162/diff/ Testing --- positive/negative q files added Thanks, Jason Dere
[jira] [Commented] (HIVE-6434) Restrict function create/drop to admin roles
[ https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909068#comment-13909068 ] Jason Dere commented on HIVE-6434: -- patch v3 also only restricts create/drop of metastore functions Restrict function create/drop to admin roles Key: HIVE-6434 URL: https://issues.apache.org/jira/browse/HIVE-6434 Project: Hive Issue Type: Sub-task Components: Authorization, UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6434) Restrict function create/drop to admin roles
[ https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6434: - Status: Patch Available (was: Open) Restrict function create/drop to admin roles Key: HIVE-6434 URL: https://issues.apache.org/jira/browse/HIVE-6434 Project: Hive Issue Type: Sub-task Components: Authorization, UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6458) Add schema upgrade scripts for metastore changes related to permanent functions
[ https://issues.apache.org/jira/browse/HIVE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6458: - Attachment: HIVE-6458.1.patch Add schema upgrade scripts for metastore changes related to permanent functions --- Key: HIVE-6458 URL: https://issues.apache.org/jira/browse/HIVE-6458 Project: Hive Issue Type: Sub-task Components: UDF Reporter: Jason Dere Attachments: HIVE-6458.1.patch Since HIVE-6330 has metastore changes, there need to be schema upgrade scripts. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6439) Introduce CBO step in Semantic Analyzer
[ https://issues.apache.org/jira/browse/HIVE-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909075#comment-13909075 ] Harish Butani commented on HIVE-6439: - yes agreed, let's add this post hive 0.13 branching. Introduce CBO step in Semantic Analyzer --- Key: HIVE-6439 URL: https://issues.apache.org/jira/browse/HIVE-6439 Project: Hive Issue Type: Sub-task Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6439.1.patch, HIVE-6439.2.patch, HIVE-6439.4.patch, HIVE-6439.5.patch This patch introduces CBO step in SemanticAnalyzer. For now the CostBasedOptimizer is an empty shell. The contract between SemAly and CBO is: - CBO step is controlled by the 'hive.enable.cbo.flag'. - When true Hive SemAly will hand CBO a Hive Operator tree (with operators annotated with stats). If it can CBO will return a better plan in Hive AST form. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6458) Add schema upgrade scripts for metastore changes related to permanent functions
[ https://issues.apache.org/jira/browse/HIVE-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6458: - Assignee: Jason Dere Status: Patch Available (was: Open) Add schema upgrade scripts for metastore changes related to permanent functions --- Key: HIVE-6458 URL: https://issues.apache.org/jira/browse/HIVE-6458 Project: Hive Issue Type: Sub-task Components: UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6458.1.patch Since HIVE-6330 has metastore changes, there need to be schema upgrade scripts. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6485) Downgrade to httpclient-4.2.5 in JDBC from httpclient-4.3.2
Vaibhav Gumashta created HIVE-6485: -- Summary: Downgrade to httpclient-4.2.5 in JDBC from httpclient-4.3.2 Key: HIVE-6485 URL: https://issues.apache.org/jira/browse/HIVE-6485 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Had upgraded to the new version while adding SSL over Http mode support for HiveServer2. But that conflicts with httpclient-4.2.5 which is in hadoop classpath. I don't have a good reason to use httpclient-4.3.2, so it's better to match hadoop. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Timeline for the Hive 0.13 release?
That's appropriate -- let the Hive release march forth on March 4th. -- Lefty On Fri, Feb 21, 2014 at 4:04 PM, Harish Butani hbut...@hortonworks.comwrote: Ok,let’s set it for March 4th . regards, Harish. On Feb 21, 2014, at 12:14 PM, Brock Noland br...@cloudera.com wrote: Might as well make it March 4th or 5th. Otherwise folks will burn weekend time to get patches in. On Fri, Feb 21, 2014 at 2:10 PM, Harish Butani hbut...@hortonworks.com wrote: Yes makes sense. How about we postpone the branching until 10am PST March 3rd, which is the following Monday. Don’t see a point of setting the branch time to a Friday evening. Do people agree? regards, Harish. On Feb 21, 2014, at 11:04 AM, Brock Noland br...@cloudera.com wrote: +1 On Fri, Feb 21, 2014 at 1:02 PM, Thejas Nair the...@hortonworks.com wrote: Can we wait for some few more days for the branching ? I have a few more security fixes that I would like to get in, and we also have a long pre-commit queue ahead right now. How about branching around Friday next week ? By then hadoop 2.3 should also be out as that vote has been concluded, and we can get HIVE-6037 in as well. -Thejas On Sun, Feb 16, 2014 at 5:32 PM, Brock Noland br...@cloudera.com wrote: I'd love to see HIVE-6037 in the 0.13 release. I have +1'ed it pending tests. Brock On Sun, Feb 16, 2014 at 7:23 PM, Navis류승우 navis@nexr.com wrote: HIVE-6037 is for generating hive-default.template file from HiveConf. Could it be included in this release? If it's not, I'll suspend further rebasing of it till next release (conflicts too frequently). 2014-02-16 20:38 GMT+09:00 Lefty Leverenz leftylever...@gmail.com : I'll try to catch up on the wikidocs backlog for 0.13.0 patches in time for the release. It's a long and growing list, though, so no promises. Feel free to do your own documentation, or hand it off to a friendly in-house writer. -- Lefty, self-appointed Hive docs maven On Sat, Feb 15, 2014 at 1:28 PM, Thejas Nair the...@hortonworks.com wrote: Sounds good to me. 
On Fri, Feb 14, 2014 at 7:29 PM, Harish Butani hbut...@hortonworks.com wrote: Hi, Its mid feb. Wanted to check if the community is ready to cut a branch. Could we cut the branch in a week , say 5pm PST 2/21/14? The goal is to keep the release cycle short: couple of weeks; so after the branch we go into stabilizing mode for hive 0.13, checking in only blocker/critical bug fixes. regards, Harish. On Jan 20, 2014, at 9:25 AM, Brock Noland br...@cloudera.com wrote: Hi, I agree that picking a date to branch and then restricting commits to that branch would be a less time intensive plan for the RM. Brock On Sat, Jan 18, 2014 at 4:21 PM, Harish Butani hbut...@hortonworks.com wrote: Yes agree it is time to start planning for the next release. I would like to volunteer to do the release management duties for this release(will be a great experience for me) Will be happy to do it, if the community is fine with this. regards, Harish. On Jan 17, 2014, at 7:05 PM, Thejas Nair the...@hortonworks.com wrote: Yes, I think it is time to start planning for the next release. For 0.12 release I created a branch and then accepted patches that people asked to be included for sometime, before moving a phase of accepting only critical bug fixes. This turned out to be laborious. I think we should instead give everyone a few weeks to get any patches they are working on to be ready, cut the branch, and take in only critical bug fixes to the branch after that. How about cutting the branch around mid-February and targeting to release in a week or two after that. Thanks, Thejas On Fri, Jan 17, 2014 at 4:39 PM, Carl Steinbach c...@apache.org wrote: I was wondering what people think about setting a tentative date for the Hive 0.13 release? At an old Hive Contrib meeting we agreed that Hive should follow a time-based release model with new releases every four months. If we follow that schedule we're due for the next release in mid-February. Thoughts? Thanks. 
Carl -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: Review Request 18208: Support LDAP authentication for HiveServer2 in http mode
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18208/#review35224 --- service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java https://reviews.apache.org/r/18208/#comment65680 I assume these classes are not actually required for the LDAP changes. Let's include only the LDAP-relevant changes in this jira. service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java https://reviews.apache.org/r/18208/#comment65676 It will be better to use AuthenticationProviderFactory.getAuthenticationProvider here. That way custom auth (and PAM) support also gets automatically added. service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java https://reviews.apache.org/r/18208/#comment65678 doLdapAuth returning the username is not intuitive. I think it is better to pass the username and password to the function. service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java https://reviews.apache.org/r/18208/#comment65677 This is not really an error in terms of the server's operation, i.e., it does not affect the uptime of the server or cause other server problems. I think we should just use info-level logs if the client failed to authorize correctly. The client is the one that should get error messages for this. service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java https://reviews.apache.org/r/18208/#comment65679 I think we should just call this function with the username and password as arguments. - Thejas Nair On Feb. 18, 2014, 10:29 a.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18208/ --- (Updated Feb. 18, 2014, 10:29 a.m.) Review request for hive and Thejas Nair.
Bugs: HIVE-6350 https://issues.apache.org/jira/browse/HIVE-6350 Repository: hive-git Description --- Support LDAP authentication for HiveServer2 in http mode Diffs - service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java d8ba3aa service/src/java/org/apache/hive/service/auth/HttpAuthHelper.java PRE-CREATION service/src/java/org/apache/hive/service/auth/HttpAuthenticationException.java PRE-CREATION service/src/java/org/apache/hive/service/auth/HttpCLIServiceProcessor.java PRE-CREATION service/src/java/org/apache/hive/service/auth/HttpCLIServiceUGIProcessor.java PRE-CREATION service/src/java/org/apache/hive/service/auth/LdapAuthenticationProviderImpl.java 5342214 service/src/java/org/apache/hive/service/cli/session/SessionManager.java bfe0e7b service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java a6ff6ce service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java e77f043 shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 9e9a60d Diff: https://reviews.apache.org/r/18208/diff/ Testing --- Thanks, Vaibhav Gumashta
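The reviewer's suggestion — pass username and password to a pluggable provider rather than have doLdapAuth return a username — can be sketched as follows. The interface and class names are illustrative, not Hive's actual API; only the Basic-auth header decoding uses standard behavior:

```java
import java.util.Base64;

// Illustrative sketch: the servlet decodes the HTTP Basic Authorization
// header into username/password and hands both to a pluggable provider.
// PasswdAuthProvider is a hypothetical name, not Hive's real interface.
public class BasicAuthExtractor {
    interface PasswdAuthProvider {
        void authenticate(String user, String password) throws Exception;
    }

    // Decodes "Basic base64(user:pass)" into {user, pass}.
    public static String[] decode(String authHeader) {
        String b64 = authHeader.substring("Basic ".length());
        String creds = new String(Base64.getDecoder().decode(b64));
        int colon = creds.indexOf(':');
        return new String[]{creds.substring(0, colon), creds.substring(colon + 1)};
    }
}
```

The servlet would then call provider.authenticate(user, pass) and log failures at info level, per the review comments.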
Review Request 18390: HS2 should return describe table results without space padding
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18390/ --- Review request for hive and Thejas Nair. Bugs: HIVE-4545 https://issues.apache.org/jira/browse/HIVE-4545 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-4545 Diffs - itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 4df4dd5 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 29f1e57 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java 7fceb65 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java de788f7 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatter.java b9be932 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java 8173200 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 99b6d77 service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 445c858 Diff: https://reviews.apache.org/r/18390/diff/ Testing --- TestJdbcDriver2 Thanks, Vaibhav Gumashta
[jira] [Updated] (HIVE-4545) HS2 should return describe table results without space padding
[ https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-4545: --- Attachment: HIVE-4545.5.patch New Rb link: https://reviews.apache.org/r/18390/ (couldn't update the previous one as Thejas was the creator). This patch gets rid of the new config that was introduced in the previous patch (per [~hagleitn]'s feedback) by adding a way to detect whether the query is being served from HiveServer2. HS2 should return describe table results without space padding -- Key: HIVE-4545 URL: https://issues.apache.org/jira/browse/HIVE-4545 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Vaibhav Gumashta Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch, HIVE-4545.4.patch, HIVE-4545.5.patch HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
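To illustrate the padding problem (widths and column names made up for illustration), a JDBC client currently has to strip fixed-width padding itself:

```java
// Illustrates the HIVE-4545 complaint: DESCRIBE fields come back
// right-padded to a fixed width over JDBC/ODBC, and clients must trim
// them. The width and column name below are invented.
public class PaddingDemo {
    public static String pad(String s, int width) {
        return String.format("%-" + width + "s", s);  // left-justify, pad right
    }

    public static String trimTrailing(String s) {
        return s.replaceAll("\\s+$", "");
    }
}
```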
[jira] [Updated] (HIVE-4545) HS2 should return describe table results without space padding
[ https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-4545: --- Status: Patch Available (was: Open) HS2 should return describe table results without space padding -- Key: HIVE-4545 URL: https://issues.apache.org/jira/browse/HIVE-4545 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Vaibhav Gumashta Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch, HIVE-4545.4.patch, HIVE-4545.5.patch HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18390: HS2 should return describe table results without space padding
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18390/ --- (Updated Feb. 22, 2014, 2:22 a.m.) Review request for hive, Gunther Hagleitner and Thejas Nair. Bugs: HIVE-4545 https://issues.apache.org/jira/browse/HIVE-4545 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-4545 Diffs (updated) - itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 4df4dd5 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 29f1e57 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/JsonMetaDataFormatter.java 7fceb65 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java de788f7 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatter.java b9be932 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java 8173200 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 99b6d77 service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 445c858 Diff: https://reviews.apache.org/r/18390/diff/ Testing --- TestJdbcDriver2 Thanks, Vaibhav Gumashta
[jira] [Commented] (HIVE-4545) HS2 should return describe table results without space padding
[ https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909164#comment-13909164 ] Vaibhav Gumashta commented on HIVE-4545: [~hagleitn] Whenever you get time, the jira is up for review. Thanks in advance! HS2 should return describe table results without space padding -- Key: HIVE-4545 URL: https://issues.apache.org/jira/browse/HIVE-4545 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Vaibhav Gumashta Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch, HIVE-4545.4.patch, HIVE-4545.5.patch HIVE-3140 changed behavior of 'DESCRIBE table;' to be like 'DESCRIBE FORMATTED table;'. HIVE-3140 introduced changes to not print header in 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with space for the 'DESCRIBE table;' query. As the jdbc/odbc results are not for direct human consumption the space padding should not be done for hive server2. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6343) Allow Hive client to load hdfs paths from hive aux jars
[ https://issues.apache.org/jira/browse/HIVE-6343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6343: - Attachment: HIVE-6343.1.patch Attaching patch v1. One thing I'm unsure about is how many times the remote file will get downloaded. I would assume that any local processes spun off by the client would attempt to load the jars in hive.aux.jars.path (and thus download remote jars again); Hadoop tasks should not try to load hive.aux.jars.path, since they use -libjars, right? Allow Hive client to load hdfs paths from hive aux jars --- Key: HIVE-6343 URL: https://issues.apache.org/jira/browse/HIVE-6343 Project: Hive Issue Type: Bug Components: Configuration Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6343.1.patch The Hive client will add local aux jars to the class loader but will ignore HDFS paths. We could have the client download HDFS files, similar to how ADD JAR does. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6084) WebHCat TestStreaming_2 e2e test should return FAILURE after HIVE-5511
[ https://issues.apache.org/jira/browse/HIVE-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909177#comment-13909177 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-6084: - Patch attached. The patch is based on the tests run over Hadoop 2. WebHCat TestStreaming_2 e2e test should return FAILURE after HIVE-5511 -- Key: HIVE-6084 URL: https://issues.apache.org/jira/browse/HIVE-6084 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Hari Sankar Sivarama Subramaniyan -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6429) MapJoinKey has large memory overhead in typical cases
[ https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909178#comment-13909178 ] Sergey Shelukhin commented on HIVE-6429: Grafting this onto binarysortableserde will take a little bit of effort... will attach a patch late this evening, or on Sunday evening. MapJoinKey has large memory overhead in typical cases - Key: HIVE-6429 URL: https://issues.apache.org/jira/browse/HIVE-6429 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, HIVE-6429.03.patch, HIVE-6429.WIP.patch, HIVE-6429.patch The only thing that MJK really needs is hashCode and equals (well, and construction), so there's no need to have an array of writables in there. Assuming all the keys for a table have the same structure, for the common case where keys are primitive types, we can store something like a byte-array combination of keys to reduce the memory usage. Will probably speed up compares too. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
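The key shape the description asks for — hashCode/equals over serialized bytes instead of an array of Writables — might look like the following toy class. It is illustrative only; the actual patch grafts onto BinarySortableSerDe for the serialization:

```java
import java.util.Arrays;

// Toy version of the HIVE-6429 idea: store the serialized key columns
// as a single byte[] and implement only hashCode/equals, avoiding the
// per-key array of Writable objects and its memory overhead.
public final class BytesKey {
    private final byte[] bytes;  // serialized key columns

    public BytesKey(byte[] bytes) { this.bytes = bytes; }

    @Override public int hashCode() { return Arrays.hashCode(bytes); }

    @Override public boolean equals(Object o) {
        return o instanceof BytesKey && Arrays.equals(bytes, ((BytesKey) o).bytes);
    }
}
```

Byte-wise equals also makes compares a single array scan, which is the "probably speed up compares" point above.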
[jira] [Updated] (HIVE-6084) WebHCat TestStreaming_2 e2e test should return FAILURE after HIVE-5511
[ https://issues.apache.org/jira/browse/HIVE-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-6084: Attachment: HIVE-6084.1.patch cc [~sushanth] and [~ekoifman] for review WebHCat TestStreaming_2 e2e test should return FAILURE after HIVE-5511 -- Key: HIVE-6084 URL: https://issues.apache.org/jira/browse/HIVE-6084 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-6084.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization
[ https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6455: - Attachment: HIVE-6455.7.patch Added a fix that solved an issue with stats aggregation when the stats aggregation key exceeds the max key prefix length. Fixed other failing tests. Scalable dynamic partitioning and bucketing optimization Key: HIVE-6455 URL: https://issues.apache.org/jira/browse/HIVE-6455 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: optimization Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.2.patch, HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, HIVE-6455.6.patch, HIVE-6455.7.patch The current implementation of dynamic partitioning works by keeping at least one record writer open per dynamic partition directory. In the case of bucketing there can be multispray file writers, which further add to the number of open record writers. The record writers of column-oriented file formats (like ORC, RCFile, etc.) keep some sort of in-memory buffers (value buffers or compression buffers) open all the time to buffer up the rows and compress them before flushing to disk. Since these buffers are maintained on a per-column basis, the amount of constant memory required at runtime increases as the number of partitions and the number of columns per partition increase. This often leads to OutOfMemory (OOM) exceptions in mappers or reducers, depending on the number of open record writers. Users often tune the JVM heap size (runtime memory) to get over such OOM issues. With this optimization, the dynamic partition columns and bucketing columns (in the case of bucketed tables) are sorted before being fed to the reducers. Since the partitioning and bucketing columns are sorted, each reducer can keep only one record writer open at any time, thereby reducing the memory pressure on the reducers.
This optimization is highly scalable as the number of partition and number of columns per partition increases at the cost of sorting the columns. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
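To make the writer-count argument in the HIVE-6455 description above concrete, here is a small standalone sketch — not Hive code; the partition keys and the writer accounting are invented for illustration. It counts how many record writers must be open at once when partition keys arrive unsorted versus sorted: with sorted keys a writer can be closed as soon as the key changes, so at most one is ever open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WriterCountSketch {
    // Unsorted stream: a writer stays open for every distinct partition seen so far.
    public static int maxOpenWritersUnsorted(List<String> partitionKeys) {
        Set<String> open = new HashSet<>();
        int max = 0;
        for (String key : partitionKeys) {
            open.add(key);
            max = Math.max(max, open.size());
        }
        return max;
    }

    // Sorted stream: close the previous writer whenever the key changes,
    // so at most one writer is ever open at a time.
    public static int maxOpenWritersSorted(List<String> partitionKeys) {
        List<String> sorted = new ArrayList<>(partitionKeys);
        sorted.sort(Comparator.naturalOrder());
        String current = null;
        int open = 0, max = 0;
        for (String key : sorted) {
            if (!key.equals(current)) {
                open = 1;          // previous writer closed, new one opened
                current = key;
            }
            max = Math.max(max, open);
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("2014-01", "2014-03", "2014-01", "2014-02", "2014-03");
        System.out.println(maxOpenWritersUnsorted(keys)); // one writer per distinct partition: 3
        System.out.println(maxOpenWritersSorted(keys));   // always 1
    }
}
```

The sort is exactly the cost the JIRA mentions: the unsorted path's memory grows with the number of distinct partitions, while the sorted path's stays constant.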
[jira] [Created] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
Shivaraju Gowda created HIVE-6486: - Summary: Support secure Subject.doAs() in HiveServer2 JDBC client. Key: HIVE-6486 URL: https://issues.apache.org/jira/browse/HIVE-6486 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.12.0, 0.11.0 Reporter: Shivaraju Gowda

HIVE-5155 addresses the problem of Kerberos authentication in a multi-user middleware server using a proxy user. In this mode the principal used by the middleware server has privileges to impersonate selected users in Hive/Hadoop. This enhancement is to support Subject.doAs() authentication in the Hive JDBC layer so that the end user's Kerberos Subject is passed through by the middleware server. With this improvement there won't be any additional setup in the server to grant proxy privileges to some users, and there won't be a need to specify a proxy user in the JDBC client. This approach should also be more secure since it won't require principals with the privileges to impersonate other users in the Hive/Hadoop setup. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaraju Gowda updated HIVE-6486: -- Status: Patch Available (was: Open) Usage: add identityContext=fromKerberosSubject to the URL to enable it. Ex: jdbc:hive2://hive.example.com:1/default;principal=hive/localhost.localdom...@example.com;identityContext=fromKerberosSubject; -- This message was sent by Atlassian JIRA (v6.1.5#6160)
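For readers unfamiliar with the hive2 JDBC URL format used in the example above: the flags after the database name are semicolon-separated key=value session variables, which is where identityContext=fromKerberosSubject rides. The following standalone sketch — a simplified illustration, not the driver's actual parsing code, with a made-up hostname and principal — shows how such a flag can be pulled out of the URL:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class Hive2UrlSketch {
    // Extracts the semicolon-separated key=value session variables that follow
    // the database name in a jdbc:hive2:// URL (e.g. principal=...;identityContext=...).
    public static Map<String, String> sessionVars(String url) {
        Map<String, String> vars = new LinkedHashMap<>();
        int slash = url.indexOf('/', "jdbc:hive2://".length());
        String tail = slash < 0 ? "" : url.substring(slash + 1);
        for (String part : tail.split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0) {
                vars.put(part.substring(0, eq), part.substring(eq + 1));
            }
        }
        return vars;
    }

    public static void main(String[] args) {
        String url = "jdbc:hive2://hive.example.com:10000/default;"
                + "principal=hive/host.example.com@EXAMPLE.COM;"
                + "identityContext=fromKerberosSubject";
        Map<String, String> vars = sessionVars(url);
        // A driver implementing HIVE-6486 would switch to Subject-based
        // authentication when it sees this flag instead of a proxy user.
        System.out.println("fromKerberosSubject".equals(vars.get("identityContext")));
    }
}
```

Running main prints true; the driver would then wrap its connection setup in the caller's Subject.doAs() rather than requiring impersonation privileges on the server side.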
[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaraju Gowda updated HIVE-6486: -- Status: Open (was: Patch Available) Attachments: Hive_011_Support-Subject_doAS.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaraju Gowda updated HIVE-6486: -- Status: Patch Available (was: Open) The attached patch Hive_011_Support-Subject_doAS.patch contains a fix on top of the Hive 0.11 head. To enable the feature, add identityContext=fromKerberosSubject to the JDBC URL. Ex: jdbc:hive2://hive.example.com:1/default;principal=hive/localhost.localdom...@example.com;identityContext=fromKerberosSubject; The patch affects only two jars: hive-jdbc-0.11.0.jar and hive-service-0.11.0.jar. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6486) Support secure Subject.doAs() in HiveServer2 JDBC client.
[ https://issues.apache.org/jira/browse/HIVE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909225#comment-13909225 ] Shivaraju Gowda commented on HIVE-6486: --- Other than the use case in the description of this issue, the attached patch also enhances Kerberos support in the Hive JDBC driver by allowing the user to log in to Kerberos programmatically (i.e., without a keytab, ticket cache, etc.). Furthermore, this is done without a dependency on other components' jars (hadoop-core*.jar). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Precommit queue
Hi Brock, Do you know why the tests are taking almost twice as long in recent runs? Is it related to the EC2 spot price spikes? Thanks, Thejas On Fri, Feb 21, 2014 at 7:11 AM, Brock Noland br...@cloudera.com wrote: There was an EC2 spot price spike overnight which, combined with everyone trying to get patches in before the branching, has resulted in a massive queue: http://bigtop01.cloudera.org:8080/view/Hive/job/PreCommit-HIVE-Build/ ~25 builds in the queue Brock
[jira] [Commented] (HIVE-6393) Support unqualified column references in Joining conditions
[ https://issues.apache.org/jira/browse/HIVE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909267#comment-13909267 ] Hive QA commented on HIVE-6393: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629966/HIVE-6393.1.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5180 tests executed *Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
{noformat}
Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1438/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1438/console Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12629966

Support unqualified column references in Joining conditions --- Key: HIVE-6393 URL: https://issues.apache.org/jira/browse/HIVE-6393 Project: Hive Issue Type: Improvement Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6393.1.patch

Support queries of the form:
{noformat}
create table r1(a int);
create table r2(b int);
select a, b from r1 join r2 on a = b
{noformat}
This becomes more useful in the old-style syntax:
{noformat}
select a, b from r1, r2 where a = b
{noformat}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6429) MapJoinKey has large memory overhead in typical cases
[ https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909268#comment-13909268 ] Lefty Leverenz commented on HIVE-6429: -- *hive.mapjoin.optimized.keys* needs a definition ... but I'm not sure where, because that depends on the state of HIVE-6037, which will put the config param definitions into HiveConf.java and then generate hive-default.xml.template from HiveConf.java. See the comment on HIVE-6455 for details (but note that HIVE-6037 has been reopened): [17 Feb 2014 22:26 comment |https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13903744#comment-13903744]. So if this commits before HIVE-6037, *hive.mapjoin.optimized.keys* should be documented in hive-default.xml.template as usual, but if it commits after HIVE-6037 a definition should be added to the patched version of HiveConf.java. In any case, I'll add it to the wiki with a release note.

MapJoinKey has large memory overhead in typical cases - Key: HIVE-6429 URL: https://issues.apache.org/jira/browse/HIVE-6429 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, HIVE-6429.03.patch, HIVE-6429.WIP.patch, HIVE-6429.patch

The only thing that MJK really needs is hashCode and equals (well, and construction), so there's no need to have an array of writables in there. Assuming all the keys for a table have the same structure, for the common case where keys are primitive types, we can store something like a byte-array combination of the keys to reduce the memory usage. This will probably speed up compares too. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
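The byte-array key idea in the HIVE-6429 description above can be sketched outside Hive as follows. This is a simplified illustration, not the actual MapJoinKeyBytes implementation, and the packing format (a fixed int+long pair) is invented: the primitive key columns are serialized into a single byte[], and hashCode/equals operate on that array instead of on an Object[] of writable objects.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BytesKeySketch {
    private final byte[] bytes;

    private BytesKeySketch(byte[] bytes) {
        this.bytes = bytes;
    }

    // Pack an (int, long) key pair into one byte array. A real implementation
    // would handle arbitrary column types, variable lengths, and nulls.
    public static BytesKeySketch of(int a, long b) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + Long.BYTES);
        buf.putInt(a).putLong(b);
        return new BytesKeySketch(buf.array());
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BytesKeySketch && Arrays.equals(bytes, ((BytesKeySketch) o).bytes);
    }

    public static void main(String[] args) {
        // Equal column values produce equal keys, so the key works in a HashMap
        // while holding one small byte[] instead of an array of boxed writables.
        System.out.println(BytesKeySketch.of(7, 42L).equals(BytesKeySketch.of(7, 42L)));
        System.out.println(BytesKeySketch.of(7, 42L).equals(BytesKeySketch.of(8, 42L)));
    }
}
```

Running main prints true then false. The memory saving comes from replacing per-column object headers and references with one contiguous array, which is also cheaper to hash and compare, as the JIRA notes.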
[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf
[ https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909274#comment-13909274 ] Lefty Leverenz commented on HIVE-6037: -- I'm keeping a list of new configuration parameters that were committed after the 2/17 patch or might be committed in time for 0.13.0. Here's the current list, not guaranteed to be complete: * HIVE-860 : hive.cache.runtime.jars * HIVE-6325: hive.server2.tez.default.queues, hive.server2.tez.sessions.per.default.queue, hive.server2.tez.initialize.default.sessions * HIVE-6382: hive.exec.orc.skip.corrupt.data (committed 2/20) * HIVE-6429: hive.mapjoin.optimized.keys * HIVE-6455: hive.optimize.sort.dynamic.partition Synchronize HiveConf with hive-default.xml.template and support show conf - Key: HIVE-6037 URL: https://issues.apache.org/jira/browse/HIVE-6037 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Minor Fix For: 0.13.0 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037.1.patch.txt, HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, HIVE-6037.12.patch.txt, HIVE-6037.14.patch.txt, HIVE-6037.15.patch.txt, HIVE-6037.2.patch.txt, HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, HIVE-6037.6.patch.txt, HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, HIVE-6037.9.patch.txt, HIVE-6037.patch see HIVE-5879 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18230: HIVE-6429 MapJoinKey has large memory overhead in typical cases
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18230/ --- (Updated Feb. 22, 2014, 7:34 a.m.) Review request for hive, Gunther Hagleitner and Jitendra Pandey. Repository: hive-git Description --- See JIRA Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 237b669 ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java 988cc57 ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 24f1229 ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 46e37c2 ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java c0f4cd7 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java 5cf347b ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 61545b5 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 2ac0928 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyBytes.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKeyObject.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 9ce0ae6 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 83ba0f0 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 47f9d21 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java 581046e ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 2466a3b ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorSMBMapJoinOperator.java 997202f ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java c541ad2 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java a103a51 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 61c5741 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java 2cb1ac3 Diff: https://reviews.apache.org/r/18230/diff/ Testing --- Thanks, Sergey Shelukhin
[jira] [Updated] (HIVE-6429) MapJoinKey has large memory overhead in typical cases
[ https://issues.apache.org/jira/browse/HIVE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6429: --- Attachment: HIVE-6429.04.patch For now, addressed the other feedback... I will have a separate patch to use BinarySortableSerDe; it just needs to hack around the vectorized path, but I don't think it's worth it: it's convoluted and still has to keep the type array and a separate path for vectorization. There are also additional changes because, for example, hasAnyNulls would be complicated and expensive with the BSSD format, so it has to be additionally retrieved at key creation time for the big-table key in MJO. MapJoinKey has large memory overhead in typical cases - Key: HIVE-6429 Attachments: HIVE-6429.01.patch, HIVE-6429.02.patch, HIVE-6429.03.patch, HIVE-6429.04.patch, HIVE-6429.WIP.patch, HIVE-6429.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6380) Specify jars/files when creating permanent UDFs
[ https://issues.apache.org/jira/browse/HIVE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909295#comment-13909295 ] Lefty Leverenz commented on HIVE-6380: -- This needs documentation. If you add a release note, I can put the information in the wiki. Or you can edit the wiki yourself, of course. In the example syntax (2nd comment) do the commas mean exclusive or? {code} CREATE FUNCTION udfname AS 'my.udf.class' USING JAR '/path/to/myjar.jar', FILE '/path/to/file', ARCHIVE '/path/to/archive.tgz'; {code} Doc locations in the wiki: * [CREATE FUNCTION syntax |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateFunction] * [Creating Custom UDFs |https://cwiki.apache.org/confluence/display/Hive/HivePlugins] Specify jars/files when creating permanent UDFs --- Key: HIVE-6380 URL: https://issues.apache.org/jira/browse/HIVE-6380 Project: Hive Issue Type: Sub-task Components: UDF Reporter: Jason Dere Assignee: Jason Dere Fix For: 0.13.0 Attachments: HIVE-6380.1.patch, HIVE-6380.2.patch, HIVE-6380.3.patch, HIVE-6380.4.patch Need a way for a permanent UDF to reference jars/files. -- This message was sent by Atlassian JIRA (v6.1.5#6160)