Re: Plan: permanently move hive builds from bigtop
It looks great, thanks Lefty!

On Sun, Apr 20, 2014 at 2:22 PM, Lefty Leverenz leftylever...@gmail.com wrote:
Nice doc, Szehon. I did some minor editing, so you might want to make sure I didn't introduce any errors. https://cwiki.apache.org/confluence/display/Hive/Hive+PTest2+Infrastructure -- Lefty

On Sat, Apr 19, 2014 at 9:45 PM, Szehon Ho sze...@cloudera.com wrote:
The migration is done. I updated the wiki to add all the details of the new setup: https://cwiki.apache.org/confluence/display/Hive/Hive+PTest2+Infrastructure
New Jenkins URL to submit pre-commit jobs: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/precommit-hive/
Again, this has to be done manually for the time being, by clicking on 'Build with Parameters' and entering the issue number as a parameter. I've submitted some already. I'll reach out to some committers to get the auto-trigger working.
As I mentioned, there is some work left to fix the test reporting, due to the framework using the old URL scheme. I am tracking it at HIVE-6937 (https://issues.apache.org/jira/browse/HIVE-6937). For now I am hosting the log directory separately; if you want to see test logs, you have to manually go to the URL corresponding to your build, like http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/precommit-hive-11/ for run #11. Sorry about that. Let me know if you see other issues, thanks!
Szehon

On Fri, Apr 18, 2014 at 2:11 PM, Thejas Nair the...@hortonworks.com wrote:
Sounds good. Thanks Szehon!

On Fri, Apr 18, 2014 at 10:17 AM, Ashutosh Chauhan hashut...@apache.org wrote:
+1 Thanks Szehon!

On Fri, Apr 18, 2014 at 6:29 AM, Xuefu Zhang xzh...@cloudera.com wrote:
+1. Thanks for taking care of this.

On Thu, Apr 17, 2014 at 11:00 PM, Szehon Ho sze...@cloudera.com wrote:
Hi,
This week the machine running Hive builds at http://bigtop01.cloudera.org:8080/view/Hive/? ran out of space, so new jobs like the pre-commit tests stopped.
It's still not resolved there; there was another email today on the Bigtop list, but very few people have root access to that host, and they still haven't responded. I chatted with Brock; he has also seen various issues with the Bigtop Jenkins in the past, so I am thinking of moving the Jenkins jobs to the PTest master itself, where some PMC members already have access and can administer it if needed. Currently I am hosting the pre-commit Jenkins job on my own EC2 instance as a stop-gap.

Other advantages of hosting our own Jenkins:
1. No need to wait for other Bigtop jobs to run.
2. Bigtop is using a version of Jenkins that doesn't show parameters like the JIRA number for queued jobs, so it's impossible to tell whether a patch got picked up and where it is in the queue.
3. It eliminates the network hop from the Bigtop box to our PTest master.

The disadvantage is:
1. We don't have much experience doing Jenkins administration, but it doesn't look too bad. Mostly, restart it if there's an issue and clean up if it runs out of space.

I wonder what people think, and whether there are any objections to this? If not, I'll try setting it up this weekend. Then there is some follow-up work, like changing the Jenkins URLs displayed in the test report.
Thanks!
Szehon
Pre commit tests on Hadoop-2
As per discussion with Ashutosh and Brock, it's also been decided to make the new pre-commit build use the Hadoop-2 profile, as future development will be focused there, and it's better to do it quickly now rather than let the hadoop-2 q.out files drift again. Thanks to all who helped fix the hundreds of hadoop-2 test failures, but when I ran last night there were still 43 failed tests. So you may see these failures in your pre-commit results; let's see if we can tackle these quickly. Thanks!
Szehon

2014-04-20 06:31:40,768 WARN PTest.run:202 43 failed tests
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_numeric
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_list_bucket
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
2014-04-20 06:31:40,768 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup3
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_createas1
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database
2014-04-20 06:31:40,769 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
2014-04-20 06:31:40,770 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_10
2014-04-20 06:31:40,770 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_12
2014-04-20 06:31:40,770 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_13
2014-04-20 06:31:40,770 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_14
2014-04-20 06:31:40,770 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_17
2014-04-20 06:31:40,770 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_19
2014-04-20 06:31:40,770 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_2
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_20
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_21
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_22
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_23
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_24
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_4
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_5
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_7
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_8
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_9
2014-04-20 06:31:40,771 WARN PTest.run:205 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
2014-04-20
[jira] [Reopened] (HIVE-6936) Provide table properties to InputFormats
[ https://issues.apache.org/jira/browse/HIVE-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley reopened HIVE-6936: - Provide table properties to InputFormats Key: HIVE-6936 URL: https://issues.apache.org/jira/browse/HIVE-6936 Project: Hive Issue Type: Bug Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.14.0 Attachments: HIVE-6936.patch Some advanced file formats need the table properties made available to them. Additionally, it would be convenient to provide a unique id for fetch operators and the complete list of directories. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Issue Comment Deleted] (HIVE-6936) Provide table properties to InputFormats
[ https://issues.apache.org/jira/browse/HIVE-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-6936: Comment: was deleted (was: Pushed compilation fix to condorM30-0.11.0 as 41089f18.) Provide table properties to InputFormats Key: HIVE-6936 URL: https://issues.apache.org/jira/browse/HIVE-6936 Project: Hive Issue Type: Bug Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.14.0 Attachments: HIVE-6936.patch Some advanced file formats need the table properties made available to them. Additionally, it would be convenient to provide a unique id for fetch operators and the complete list of directories. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6936) Provide table properties to InputFormats
[ https://issues.apache.org/jira/browse/HIVE-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975629#comment-13975629 ] Owen O'Malley commented on HIVE-6936: - Sorry, I closed the wrong bug! Provide table properties to InputFormats Key: HIVE-6936 URL: https://issues.apache.org/jira/browse/HIVE-6936 Project: Hive Issue Type: Bug Components: File Formats Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 0.14.0 Attachments: HIVE-6936.patch Some advanced file formats need the table properties made available to them. Additionally, it would be convenient to provide a unique id for fetch operators and the complete list of directories. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6938) Add Support for Parquet Column Rename
Daniel Weeks created HIVE-6938: -- Summary: Add Support for Parquet Column Rename Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Weeks updated HIVE-6938: --- Status: Patch Available (was: Open) The patch contains a small change to DDLTask to add support for replace columns as well as a change to the Serde to allow switching between column index based access and name based access of columns. Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Attachments: HIVE-6938.1.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Weeks updated HIVE-6938: --- Attachment: HIVE-6938.1.patch Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Attachments: HIVE-6938.1.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6807) add HCatStorer ORC test to test missing columns
[ https://issues.apache.org/jira/browse/HIVE-6807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-6807: - Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Thanks Eugene for correcting my misread of the patch. Patch checked in. add HCatStorer ORC test to test missing columns --- Key: HIVE-6807 URL: https://issues.apache.org/jira/browse/HIVE-6807 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-6807.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975734#comment-13975734 ] Julien Le Dem commented on HIVE-6938: - I find the terminology columnar.access confusing but otherwise, this looks good to me. Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Attachments: HIVE-6938.1.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6927) Add support for MSSQL in schematool
[ https://issues.apache.org/jira/browse/HIVE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975747#comment-13975747 ] Ashutosh Chauhan commented on HIVE-6927: +1 Add support for MSSQL in schematool --- Key: HIVE-6927 URL: https://issues.apache.org/jira/browse/HIVE-6927 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Attachments: HIVE-6927.patch Schematool is the preferred way of initializing schema for Hive. Since HIVE-6862 provided the script for MSSQL, it would be nice to add the support for it in schematool. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975746#comment-13975746 ] Daniel Weeks commented on HIVE-6938: Confusion is understandable considering parquet is columnar. How about column.index.access? I'll update the patch. Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Attachments: HIVE-6938.1.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
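To make the discussion concrete, here is a hedged HiveQL sketch of how the proposed toggle might be used. The property name 'parquet.column.index.access' follows the suggestion in the comment above but is an assumption (the final name was still under discussion), and the table and column names are hypothetical:

```sql
-- Hypothetical: assumes the serde property is named 'parquet.column.index.access'
-- as proposed above. Enabling index-based access resolves columns by position,
-- so a rename no longer breaks reads of existing Parquet files.
ALTER TABLE parquet_table SET SERDEPROPERTIES ('parquet.column.index.access' = 'true');

-- With index-based access, replacing/renaming a column is safe:
ALTER TABLE parquet_table REPLACE COLUMNS (renamed_col STRING);
```

The design choice mirrors the JIRA description: Parquet can serve reads either by column name or by column index, so exposing the choice as a serde property lets users opt into rename-tolerant (index-based) behavior per table.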
[jira] [Commented] (HIVE-538) make hive_jdbc.jar self-containing
[ https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975767#comment-13975767 ] Nick White commented on HIVE-538: - [~ashutoshc] not really, it manually lists some dependencies (not the transitive ones) instead of using maven to work them out, and creates a tar.gz of many jars, not a single jar with all the dependencies in. A tar.gz can't easily integrate with maven, whereas it's easy to add this complete jar as a dependency to a third-party maven project, as it's published with a distinct classifier. make hive_jdbc.jar self-containing -- Key: HIVE-538 URL: https://issues.apache.org/jira/browse/HIVE-538 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.3.0, 0.4.0, 0.6.0, 0.13.0 Reporter: Raghotham Murthy Assignee: Nick White Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.2.patch, HIVE-538.patch Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are required in the classpath to run jdbc applications on hive. We need to do at least the following to get rid of most unnecessary dependencies: 1. get rid of dynamic serde and use a standard serialization format, maybe tab separated, json or avro 2. don't use hadoop configuration parameters 3. repackage thrift and fb303 classes into hive_jdbc.jar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-538) make hive_jdbc.jar self-containing
[ https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975771#comment-13975771 ] Nick White commented on HIVE-538: - also, duplicating hive-jdbc's dependencies in an xml file in a different project will increase maintenance costs, as these two lists will have to be manually kept in sync. make hive_jdbc.jar self-containing -- Key: HIVE-538 URL: https://issues.apache.org/jira/browse/HIVE-538 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.3.0, 0.4.0, 0.6.0, 0.13.0 Reporter: Raghotham Murthy Assignee: Nick White Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.2.patch, HIVE-538.patch Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are required in the classpath to run jdbc applications on hive. We need to do at least the following to get rid of most unnecessary dependencies: 1. get rid of dynamic serde and use a standard serialization format, maybe tab separated, json or avro 2. don't use hadoop configuration parameters 3. repackage thrift and fb303 classes into hive_jdbc.jar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6916) Export/import inherit permissions from parent directory
[ https://issues.apache.org/jira/browse/HIVE-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6916: -- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks to Szehon for the patch. Export/import inherit permissions from parent directory --- Key: HIVE-6916 URL: https://issues.apache.org/jira/browse/HIVE-6916 Project: Hive Issue Type: Bug Components: Security Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 0.14.0 Attachments: HIVE-6916.2.patch, HIVE-6916.patch Export table into an external location and importing into hive, should set the table to have the permission of the parent directory, if the flag hive.warehouse.subdir.inherit.perms is set. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler
[ https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975809#comment-13975809 ] Xuefu Zhang commented on HIVE-6411: --- I believe there are some minor items to be resolved on review board. Support more generic way of using composite key for HBaseHandler Key: HIVE-6411 URL: https://issues.apache.org/jira/browse/HIVE-6411 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6411.1.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt, HIVE-6411.4.patch.txt, HIVE-6411.5.patch.txt, HIVE-6411.6.patch.txt, HIVE-6411.7.patch.txt, HIVE-6411.8.patch.txt, HIVE-6411.9.patch.txt HIVE-2599 introduced using custom object for the row key. But it forces key objects to extend HBaseCompositeKey, which is again extension of LazyStruct. If user provides proper Object and OI, we can replace internal key and keyOI with those. Initial implementation is based on factory interface. {code} public interface HBaseKeyFactory { void init(SerDeParameters parameters, Properties properties) throws SerDeException; ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException; LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException; } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975824#comment-13975824 ] Xuefu Zhang commented on HIVE-6835: --- I think that's pretty much what you need to do. While #2 may touch many files, it's fairly safe as #1 guarantees that the same code will be exercised. There isn't much API change. You add one with default implementation and deprecate the old one. In #3, you have both property sets and do whatever you need for Avro. Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch To reproduce: {code}
create table testarray (a array<string>);
load data local inpath '/home/ahsu/test/array.txt' into table testarray;
# create partitioned Avro table with one array column
create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"a","type":{"type":"array","items":"string"}}]}') STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
insert into table avroarray partition(y=1) select * from testarray;
# add an int column with a default value of 0
alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"intfield","type":"int","default":0},{"name":"a","type":{"type":"array","items":"string"}}]}');
# fails with ClassCastException
select * from avroarray;
{code} The select * fails with: {code} Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975851#comment-13975851 ] Julien Le Dem commented on HIVE-6938: - Sounds good to me! Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Attachments: HIVE-6938.1.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton
[ https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975889#comment-13975889 ] Eugene Koifman commented on HIVE-5072: -- [~shuainie] please file the 2 follow-up tickets for the doc and version/sqoop issues. [WebHCat]Enable directly invoke Sqoop job through Templeton --- Key: HIVE-5072 URL: https://issues.apache.org/jira/browse/HIVE-5072 Project: Hive Issue Type: Improvement Components: WebHCat Affects Versions: 0.12.0 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, HIVE-5072.3.patch, HIVE-5072.4.patch, HIVE-5072.5.patch, Templeton-Sqoop-Action.pdf Now it is hard to invoke a Sqoop job through Templeton. The only way is to use the classpath jar generated by a sqoop job and use the jar delegator in Templeton. We should implement a Sqoop delegator to enable directly invoking Sqoop jobs through Templeton. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2
Jason Dere created HIVE-6939: Summary: TestExecDriver.testMapRedPlan3 fails on hadoop-2 Key: HIVE-6939 URL: https://issues.apache.org/jira/browse/HIVE-6939 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Jason Dere Passes on hadoop-1, but fails on hadoop-2. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action
Shuaishuai Nie created HIVE-6940: Summary: [WebHCat]Update documentation for Templeton-Sqoop action Key: HIVE-6940 URL: https://issues.apache.org/jira/browse/HIVE-6940 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Shuaishuai Nie WebHCat documentation needs to be updated based on the new feature introduced in HIVE-5072 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6941) [WebHCat] Implement webhcat endpoint version/sqoop
Shuaishuai Nie created HIVE-6941: Summary: [WebHCat] Implement webhcat endpoint version/sqoop Key: HIVE-6941 URL: https://issues.apache.org/jira/browse/HIVE-6941 Project: Hive Issue Type: Improvement Components: WebHCat Reporter: Shuaishuai Nie Since WebHCat supports invoking Sqoop jobs (introduced in HIVE-5072), it should also expose the endpoint version/sqoop to return the version of Sqoop used. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6901: -- Attachment: HIVE-6901.2.patch Explain plan doesn't show operator tree for the fetch operator -- Key: HIVE-6901 URL: https://issues.apache.org/jira/browse/HIVE-6901 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, HIVE-6901.patch Explaining a simple select query that involves a MR phase doesn't show processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975941#comment-13975941 ] Daniel Weeks commented on HIVE-6938: Patch #2 has the disambiguated property name. Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Weeks updated HIVE-6938: --- Attachment: HIVE-6938.2.patch Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975950#comment-13975950 ] Hive QA commented on HIVE-6901: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12641106/HIVE-6901.2.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/precommit-hive/17/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/precommit-hive/17/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: NullPointerException: driver {noformat} This message is automatically generated. ATTACHMENT ID: 12641106 Explain plan doesn't show operator tree for the fetch operator -- Key: HIVE-6901 URL: https://issues.apache.org/jira/browse/HIVE-6901 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, HIVE-6901.patch Explaining a simple select query that involves a MR phase doesn't show processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. 
Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6942) Explanation of GROUPING__ID is confusing
chris schrader created HIVE-6942: Summary: Explanation of GROUPING__ID is confusing Key: HIVE-6942 URL: https://issues.apache.org/jira/browse/HIVE-6942 Project: Hive Issue Type: Improvement Components: Documentation Reporter: chris schrader Priority: Minor The explanation given for GROUPING__ID in enhanced aggregations is very incomplete and confusing given the example. Documentation here: https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup#EnhancedAggregation,Cube,GroupingandRollup-Grouping__IDfunction It would be far easier to understand if the bit vector were explained alongside the examples given, i.e., by identifying each column in terms of the binary number it contributes and then showing that number converted to decimal. In the examples provided, the binary equivalents of the grouping IDs for the first example would be 1, 11, 11, representing the columns included in aggregation. The documentation is very confusing without this clear connection between building a binary number and converting it to decimal (just referring to it as a bit vector isn't sufficient for the average user). -- This message was sent by Atlassian JIRA (v6.2#6252)
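The bit-vector idea the reporter wants documented can be sketched in a few lines. This is only an illustration of building a binary number per grouping set and converting it to decimal — not a statement of Hive's exact bit ordering, which has differed across versions — and the rollup grouping sets shown are hypothetical:

```java
public class GroupingId {
    // Build a GROUPING__ID-style bit vector for one result row.
    // Bit i is set when group-by column i participates in this
    // grouping set (LSB-first here; Hive's actual bit order is
    // version-dependent, so treat this as a sketch).
    static long groupingId(boolean[] grouped) {
        long id = 0;
        for (int i = 0; i < grouped.length; i++) {
            if (grouped[i]) {
                id |= 1L << i;
            }
        }
        return id;
    }

    public static void main(String[] args) {
        // Hypothetical GROUP BY a, b WITH ROLLUP grouping sets:
        // (a, b), (a), () — print each id as binary and decimal.
        boolean[][] sets = { {true, true}, {true, false}, {false, false} };
        for (boolean[] s : sets) {
            long id = groupingId(s);
            System.out.println(Long.toBinaryString(id) + " -> " + id);
        }
    }
}
```

Spelling out both the binary string and its decimal value, as above, is exactly the connection the report says the wiki examples are missing.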
[jira] [Commented] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975990#comment-13975990 ] Jason Dere commented on HIVE-6939: -- It appears that on hadoop-2 there are multiple output files for this test, whereas on hadoop-1 there is just a single file, and the test assumes there is only a single file to compare against. The test for some reason specifies 5 reducers. I've been told that hadoop-1 Hive did not obey the number of reducers, while hadoop-2 does. This could explain why the test works on hadoop-1: it only ever used 1 reducer and so generated a single output file. Changing the test to use just a single reducer allows it to pass on hadoop-2. Does anyone know the history of this test and why it was set to use 5 reducers? TestExecDriver.testMapRedPlan3 fails on hadoop-2 Key: HIVE-6939 URL: https://issues.apache.org/jira/browse/HIVE-6939 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Jason Dere Passes on hadoop-1, but fails on hadoop-2. -- This message was sent by Atlassian JIRA (v6.2#6252)
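On the mechanics behind that comment: each reducer writes its own output file under the standard Hadoop part-NNNNN naming convention, so a job that actually honors 5 reducers leaves five files where the test expects one. A minimal sketch of the filenames involved (illustrative only):

```java
public class PartFiles {
    public static void main(String[] args) {
        // With 5 reducers honored (hadoop-2), the job directory holds
        // one part file per reducer; with 1 reducer (effectively the
        // hadoop-1 behavior described above), only part-00000 exists.
        int reducers = 5;
        for (int i = 0; i < reducers; i++) {
            System.out.println(String.format("part-%05d", i));
        }
    }
}
```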
[jira] [Updated] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6901: Attachment: HIVE-6901.2.patch Explain plan doesn't show operator tree for the fetch operator -- Key: HIVE-6901 URL: https://issues.apache.org/jira/browse/HIVE-6901 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, HIVE-6901.patch Explaining a simple select query that involves a MR phase doesn't show processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6939: - Status: Patch Available (was: Open) TestExecDriver.testMapRedPlan3 fails on hadoop-2 Key: HIVE-6939 URL: https://issues.apache.org/jira/browse/HIVE-6939 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6939.1.patch Passes on hadoop-1, but fails on hadoop-2. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6939: - Attachment: HIVE-6939.1.patch Patch changes test to use single reducer so that there is just a single output file. TestExecDriver.testMapRedPlan3 fails on hadoop-2 Key: HIVE-6939 URL: https://issues.apache.org/jira/browse/HIVE-6939 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6939.1.patch Passes on hadoop-1, but fails on hadoop-2. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13975991#comment-13975991 ] Szehon Ho commented on HIVE-6901: - Sorry about that, there was a problem with the hadoop-2 test-property file on the build machine, I'll re-submit this. Explain plan doesn't show operator tree for the fetch operator -- Key: HIVE-6901 URL: https://issues.apache.org/jira/browse/HIVE-6901 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, HIVE-6901.2.patch, HIVE-6901.patch Explaining a simple select query that involves a MR phase doesn't show processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6541) Need to write documentation for ACID work
[ https://issues.apache.org/jira/browse/HIVE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976014#comment-13976014 ] Alan Gates commented on HIVE-6541: -- I've added page links for [Hive Transactions|https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions] and [Streaming Ingest|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] and linked the Hive Transactions page in the User docs section of the wiki home page. Feel free to edit or move these docs around if you think they would be better placed somewhere else. Need to write documentation for ACID work - Key: HIVE-6541 URL: https://issues.apache.org/jira/browse/HIVE-6541 Project: Hive Issue Type: Sub-task Components: Documentation Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: hive-6541-changesAfterFirstEdit.rtf, hive-6541-firstEdit.rtf, hive-6541.txt ACID introduces a number of new config file options, tables in the metastore, keywords in the grammar, and a new interface for use of tools like storm and flume. These need to be documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6541) Need to write documentation for ACID work
[ https://issues.apache.org/jira/browse/HIVE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates resolved HIVE-6541. -- Resolution: Fixed Need to write documentation for ACID work - Key: HIVE-6541 URL: https://issues.apache.org/jira/browse/HIVE-6541 Project: Hive Issue Type: Sub-task Components: Documentation Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: hive-6541-changesAfterFirstEdit.rtf, hive-6541-firstEdit.rtf, hive-6541.txt ACID introduces a number of new config file options, tables in the metastore, keywords in the grammar, and a new interface for use of tools like storm and flume. These need to be documented. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6943: --- Summary: TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2 (was: TestMinimrCliDriver.testCliDriver_root_dir_external_table) TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2 --- Key: HIVE-6943 URL: https://issues.apache.org/jira/browse/HIVE-6943 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Ashutosh Chauhan Seems like this test passes for hadoop-1 but is flaky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table
Ashutosh Chauhan created HIVE-6943: -- Summary: TestMinimrCliDriver.testCliDriver_root_dir_external_table Key: HIVE-6943 URL: https://issues.apache.org/jira/browse/HIVE-6943 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Ashutosh Chauhan Seems like this test passes for hadoop-1 but is flaky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976064#comment-13976064 ] Ashutosh Chauhan commented on HIVE-6943: Stack trace I got is:
{code}
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:302)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.init(HadoopShimsSecure.java:249)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:363)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:591)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:168)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:394)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:288)
    ... 11 more
Caused by: java.io.FileNotFoundException: Path is not a file: /Users
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:51)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1627)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1570)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1550)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1524)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:476)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:289)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:394)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1133)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1121)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:272)
{code}
[jira] [Commented] (HIVE-6924) MapJoinKeyBytes::hashCode() should use Murmur hash
[ https://issues.apache.org/jira/browse/HIVE-6924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976081#comment-13976081 ] Jitendra Nath Pandey commented on HIVE-6924: VectorHashKeyWrapper also uses Arrays.hashCode(). VectorHashKeyWrapper is used in VectorGroupByOperator for keys to the aggregates. Murmur hash should help there as well. MapJoinKeyBytes::hashCode() should use Murmur hash -- Key: HIVE-6924 URL: https://issues.apache.org/jira/browse/HIVE-6924 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6924.patch Existing hashCode is bad, causes HashMap to cluster -- This message was sent by Atlassian JIRA (v6.2#6252)
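To illustrate the clustering problem this issue and comment describe: Arrays.hashCode() over keys that differ in only one element yields consecutive hash values, which land in adjacent buckets of a power-of-two-sized HashMap. Below is a hedged sketch using the Murmur3 32-bit finalizer ("fmix32") as the mixing step — an illustration of the technique, not Hive's actual implementation:

```java
import java.util.Arrays;

public class HashSpread {
    // Murmur3 32-bit finalizer: an invertible mix that spreads
    // nearby inputs across the whole int range.
    static int murmurMix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    public static void main(String[] args) {
        // Keys differing only in the last element get consecutive
        // Arrays.hashCode values (992, 993, 994, 995 here), i.e. they
        // cluster; the murmur-mixed values do not stay consecutive.
        for (int k = 0; k < 4; k++) {
            int[] key = {1, k};
            int plain = Arrays.hashCode(key);
            int mixed = murmurMix(plain);
            System.out.println(plain + " " + mixed);
        }
    }
}
```

Because the finalizer is a bijection on 32-bit ints, distinct inputs stay distinct; it only destroys the arithmetic pattern that causes the HashMap clustering.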
[jira] [Created] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432
Eugene Koifman created HIVE-6944: Summary: WebHCat e2e tests broken by HIVE-6432 Key: HIVE-6944 URL: https://issues.apache.org/jira/browse/HIVE-6944 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432
[ https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6944: - Description: HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests NO PRECOMMIT TESTS was:HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests WebHCat e2e tests broken by HIVE-6432 - Key: HIVE-6944 URL: https://issues.apache.org/jira/browse/HIVE-6944 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432
[ https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6944: - Attachment: HIVE-6944.patch WebHCat e2e tests broken by HIVE-6432 - Key: HIVE-6944 URL: https://issues.apache.org/jira/browse/HIVE-6944 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6944.patch HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432
[ https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6944: - Status: Patch Available (was: Open) WebHCat e2e tests broken by HIVE-6432 - Key: HIVE-6944 URL: https://issues.apache.org/jira/browse/HIVE-6944 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6944.patch HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4576) templeton.hive.properties does not allow values with commas
[ https://issues.apache.org/jira/browse/HIVE-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4576: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Eugene! templeton.hive.properties does not allow values with commas --- Key: HIVE-4576 URL: https://issues.apache.org/jira/browse/HIVE-4576 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.5.0 Reporter: Vitaliy Fuks Assignee: Eugene Koifman Priority: Minor Fix For: 0.14.0 Attachments: HIVE-4576.2.patch, HIVE-4576.patch templeton.hive.properties accepts a comma-separated list of key=value property pairs that will be passed to Hive. However, this makes it impossible to use any value that itself has a comma in it. For example: {code:xml}property nametempleton.hive.properties/name valuehive.metastore.sasl.enabled=false,hive.metastore.uris=thrift://foo1.example.com:9083,foo2.example.com:9083/value /property{code} {noformat}templeton: starting [/usr/bin/hive, --service, cli, --hiveconf, hive.metastore.sasl.enabled=false, --hiveconf, hive.metastore.uris=thrift://foo1.example.com:9083, --hiveconf, foo2.example.com:9083 etc..{noformat} because the value is parsed using standard org.apache.hadoop.conf.Configuration.getStrings() call which simply splits on commas from here: {code:java}for (String prop : appConf.getStrings(AppConfig.HIVE_PROPS_NAME)){code} This is problematic for any hive property that itself has multiple values, such as hive.metastore.uris above or hive.aux.jars.path. There should be some way to escape commas or a different delimiter should be used. -- This message was sent by Atlassian JIRA (v6.2#6252)
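The parsing problem described in HIVE-4576 can be reproduced without Hadoop at all, since Configuration.getStrings() effectively splits the raw property value on commas. A minimal sketch using String.split, with the property value taken from the report:

```java
public class CommaSplit {
    public static void main(String[] args) {
        // The value from the bug report: the second metastore URI
        // itself contains a comma-separated host list.
        String props = "hive.metastore.sasl.enabled=false,"
            + "hive.metastore.uris=thrift://foo1.example.com:9083,foo2.example.com:9083";

        // A plain comma split (what getStrings() effectively does)
        // tears the URI list apart into a bogus third "property".
        for (String p : props.split(",")) {
            System.out.println(p);
        }
    }
}
```

The third line of output is the orphaned `foo2.example.com:9083` fragment that WebHCat then passes to Hive as a standalone --hiveconf argument — hence the request for an escape mechanism or a different delimiter.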
[jira] [Commented] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976101#comment-13976101 ] Jason Dere commented on HIVE-6943: -- Had already opened HIVE-6401 for this one - looks like hadoop-2 is returning directories during getSplits() TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2 --- Key: HIVE-6943 URL: https://issues.apache.org/jira/browse/HIVE-6943 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Ashutosh Chauhan Seems like this test passes for hadoop-1 but is flaky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6943) TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-6943. Resolution: Duplicate Oh, missed that one. Resolving this one as dupe of HIVE-6401 TestMinimrCliDriver.testCliDriver_root_dir_external_table fails on hadoop-2 --- Key: HIVE-6943 URL: https://issues.apache.org/jira/browse/HIVE-6943 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Ashutosh Chauhan Seems like this test passes for hadoop-1 but is flaky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6945) issues with dropping partitions on Oracle
Sergey Shelukhin created HIVE-6945: -- Summary: issues with dropping partitions on Oracle Key: HIVE-6945 URL: https://issues.apache.org/jira/browse/HIVE-6945 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message was sent by Atlassian JIRA (v6.2#6252)
[ANNOUNCE] Apache Hive 0.13.0 Released
The Apache Hive team is proud to announce the release of Apache Hive version 0.13.0.

The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop (TM), it provides:

* Tools to enable easy data extract/transform/load (ETL)
* A mechanism to impose structure on a variety of data formats
* Access to files stored either directly in Apache HDFS (TM) or in other data storage systems such as Apache HBase (TM)
* Query execution via MapReduce

For Hive release details and downloads, please visit: http://www.apache.org/dyn/closer.cgi/hive/

Hive 0.13.0 Release Notes are available here: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12324312&styleName=Text&projectId=12310843

We would like to thank the many contributors who made this release possible.

Regards,
The Apache Hive Team

PS: we are having technical difficulty updating the website. Will resolve this shortly.
[jira] [Commented] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976144#comment-13976144 ] Anthony Hsu commented on HIVE-6835: --- Great, sounds like we're on the same page. I'll implement this new approach and upload a new patch soon. Reading of partitioned Avro data fails if partition schema does not match table schema -- Key: HIVE-6835 URL: https://issues.apache.org/jira/browse/HIVE-6835 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6835.1.patch, HIVE-6835.2.patch, HIVE-6835.3.patch To reproduce:
{code}
create table testarray (a array<string>);

load data local inpath '/home/ahsu/test/array.txt' into table testarray;

# create partitioned Avro table with one array column
create table avroarray partitioned by (y string) row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"a","type":{"type":"array","items":"string"}}]}') STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';

insert into table avroarray partition(y=1) select * from testarray;

# add an int column with a default value of 0
alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"intfield","type":"int","default":0},{"name":"a","type":{"type":"array","items":"string"}}]}');

# fails with ClassCastException
select * from avroarray;
{code}
The select * fails with:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action
[ https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6940: - Component/s: Documentation [WebHCat]Update documentation for Templeton-Sqoop action Key: HIVE-6940 URL: https://issues.apache.org/jira/browse/HIVE-6940 Project: Hive Issue Type: Bug Components: Documentation, WebHCat Affects Versions: 0.14.0 Reporter: Shuaishuai Nie WebHCat documentation needs to be updated based on the new feature introduced in HIVE-5072 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action
[ https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6940: - Affects Version/s: 0.14.0 [WebHCat]Update documentation for Templeton-Sqoop action Key: HIVE-6940 URL: https://issues.apache.org/jira/browse/HIVE-6940 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Shuaishuai Nie WebHCat documentation needs to be updated based on the new feature introduced in HIVE-5072 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton
[ https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976163#comment-13976163 ] Eugene Koifman commented on HIVE-5072: -- Verified that the tests run. [~shuainie] Thank you for filing the tickets but I think they need to have more than 1 line description. HIVE-6541 has an example of what to provide to Documentation writers. [WebHCat]Enable directly invoke Sqoop job through Templeton --- Key: HIVE-5072 URL: https://issues.apache.org/jira/browse/HIVE-5072 Project: Hive Issue Type: Improvement Components: WebHCat Affects Versions: 0.12.0 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, HIVE-5072.3.patch, HIVE-5072.4.patch, HIVE-5072.5.patch, Templeton-Sqoop-Action.pdf Now it is hard to invoke a Sqoop job through templeton. The only way is to use the classpath jar generated by a sqoop job and use the jar delegator in Templeton. We should implement Sqoop Delegator to enable directly invoke Sqoop job through Templeton. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6945) issues with dropping partitions on Oracle
[ https://issues.apache.org/jira/browse/HIVE-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976162#comment-13976162 ] Xuefu Zhang commented on HIVE-6945: --- Could we have some description of what issues are in focus here? The title alone doesn't provide any essential information to help readers. issues with dropping partitions on Oracle - Key: HIVE-6945 URL: https://issues.apache.org/jira/browse/HIVE-6945 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: [ANNOUNCE] Apache Hive 0.13.0 Released
Thanks to Harish for all the hard work managing and getting the release out! This is great news! This is a significant release in hive! It has more than twice the number of jiras included (see release note link) compared to 0.12 and earlier releases, which also came out after a similar gap of 5-6 months. It shows tremendous growth in hive community activity!

hive 0.13 - 1081
hive 0.12 - 439
hive 0.11 - 374

-Thejas

On Mon, Apr 21, 2014 at 3:17 PM, Harish Butani rhbut...@apache.org wrote: The Apache Hive team is proud to announce the release of Apache Hive version 0.13.0. The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop (TM), it provides: * Tools to enable easy data extract/transform/load (ETL) * A mechanism to impose structure on a variety of data formats * Access to files stored either directly in Apache HDFS (TM) or in other data storage systems such as Apache HBase (TM) * Query execution via MapReduce For Hive release details and downloads, please visit: http://www.apache.org/dyn/closer.cgi/hive/ Hive 0.13.0 Release Notes are available here: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12324312&styleName=Text&projectId=12310843 We would like to thank the many contributors who made this release possible. Regards, The Apache Hive Team PS: we are having technical difficulty updating the website. Will resolve this shortly.
[jira] [Commented] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432
[ https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976176#comment-13976176 ] Sushanth Sowmyan commented on HIVE-6944: +1 , will commit after the 24h period. :) WebHCat e2e tests broken by HIVE-6432 - Key: HIVE-6944 URL: https://issues.apache.org/jira/browse/HIVE-6944 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6944.patch HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6946) Make it easier to run WebHCat tesst
Eugene Koifman created HIVE-6946: Summary: Make it easier to run WebHCat tesst Key: HIVE-6946 URL: https://issues.apache.org/jira/browse/HIVE-6946 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to set up WebHCat e2e tests but it's cumbersome and error prone. Need to make some improvements here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6946) Make it easier to run WebHCat e2e tests
[ https://issues.apache.org/jira/browse/HIVE-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6946: - Summary: Make it easier to run WebHCat e2e tests (was: Make it easier to run WebHCat tesst) Make it easier to run WebHCat e2e tests --- Key: HIVE-6946 URL: https://issues.apache.org/jira/browse/HIVE-6946 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to set up WebHCat e2e tests but it's cumbersome and error prone. Need to make some improvements here. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-4824) make TestWebHCatE2e run w/o requiring installing external hadoop
[ https://issues.apache.org/jira/browse/HIVE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman resolved HIVE-4824. -- Resolution: Won't Fix. Real unit tests for WebHCat would take too much effort, so we will instead simplify running the existing e2e tests in HIVE-6946. make TestWebHCatE2e run w/o requiring installing external hadoop Key: HIVE-4824 URL: https://issues.apache.org/jira/browse/HIVE-4824 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Currently WebHCat will use hive/build/dist/hcatalog/bin/hcat to execute DDL commands, which in turn uses the Hadoop jar command. This in turn requires that the HADOOP_HOME env var be defined and point to an existing Hadoop install. Need to see if we can apply the hive/testutils/hadoop idea here to make WebHCat not depend on an external hadoop. This will make unit tests better/easier to write and make the dev/test cycle simpler. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: [ANNOUNCE] Apache Hive 0.13.0 Released
The link to the Release Notes is wrong. Thanks Szehon Ho for pointing this out. The correct link is: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12324986&styleName=Text&projectId=12310843
[jira] [Updated] (HIVE-5538) Turn on vectorization by default.
[ https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5538: --- Status: Open (was: Patch Available) Turn on vectorization by default. - Key: HIVE-5538 URL: https://issues.apache.org/jira/browse/HIVE-5538 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch Vectorization should be turned on by default, so that users don't have to specifically enable vectorization. Vectorization code validates and ensures that a query falls back to row mode if it is not supported on vectorized code path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5538) Turn on vectorization by default.
[ https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5538: --- Attachment: HIVE-5538.3.patch Many tests failed due to difference in explain output, which were due to data size stats. The attached patch fixes it. Turn on vectorization by default. - Key: HIVE-5538 URL: https://issues.apache.org/jira/browse/HIVE-5538 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch, HIVE-5538.3.patch Vectorization should be turned on by default, so that users don't have to specifically enable vectorization. Vectorization code validates and ensures that a query falls back to row mode if it is not supported on vectorized code path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5538) Turn on vectorization by default.
[ https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5538: --- Status: Patch Available (was: Open) Turn on vectorization by default. - Key: HIVE-5538 URL: https://issues.apache.org/jira/browse/HIVE-5538 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch, HIVE-5538.3.patch Vectorization should be turned on by default, so that users don't have to specifically enable vectorization. Vectorization code validates and ensures that a query falls back to row mode if it is not supported on vectorized code path. -- This message was sent by Atlassian JIRA (v6.2#6252)
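[Editor's sketch] For readers who want to exercise the behavior under discussion before the patch lands, vectorization is controlled by a single session property; a minimal sketch, where the table name and query are hypothetical and only the property name comes from Hive's configuration:

```shell
# Before HIVE-5538, vectorized execution must be enabled explicitly per
# session; the patch makes it the default. Table and query are illustrative.
hive -e "set hive.vectorized.execution.enabled=true;
         select count(*) from my_orc_table where amount > 100;"
```

As the issue notes, queries that are not supported on the vectorized code path fall back to row mode automatically, which is what makes an on-by-default setting safe.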
[jira] [Commented] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976233#comment-13976233 ] Hive QA commented on HIVE-6901: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12641113/HIVE-6901.2.patch {color:red}ERROR:{color} -1 due to 122 failed/errored test(s), 5416 tests executed *Failed tests:* {noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binarysortable_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input39
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters_overlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullformatCTAS
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_outer_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4
[jira] [Updated] (HIVE-6947) More fixes for tests on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6947: --- Summary: More fixes for tests on hadoop-2 (was: More fixes for hadoop-2) More fixes for tests on hadoop-2 - Key: HIVE-6947 URL: https://issues.apache.org/jira/browse/HIVE-6947 Project: Hive Issue Type: Bug Components: Tests Reporter: Ashutosh Chauhan Few more fixes for test cases on hadoop-2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6947) More fixes for hadoop-2
Ashutosh Chauhan created HIVE-6947: -- Summary: More fixes for hadoop-2 Key: HIVE-6947 URL: https://issues.apache.org/jira/browse/HIVE-6947 Project: Hive Issue Type: Bug Components: Tests Reporter: Ashutosh Chauhan Few more fixes for test cases on hadoop-2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6947) More fixes for tests on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6947: --- Attachment: HIVE-6947.patch Doesn't include file size change diffs. More fixes for tests on hadoop-2 - Key: HIVE-6947 URL: https://issues.apache.org/jira/browse/HIVE-6947 Project: Hive Issue Type: Bug Components: Tests Reporter: Ashutosh Chauhan Attachments: HIVE-6947.patch Few more fixes for test cases on hadoop-2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Work started] (HIVE-6947) More fixes for tests on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6947 started by Ashutosh Chauhan. More fixes for tests on hadoop-2 - Key: HIVE-6947 URL: https://issues.apache.org/jira/browse/HIVE-6947 Project: Hive Issue Type: Bug Components: Tests Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6947.patch Few more fixes for test cases on hadoop-2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6947) More fixes for tests on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6947: --- Status: Patch Available (was: In Progress) More fixes for tests on hadoop-2 - Key: HIVE-6947 URL: https://issues.apache.org/jira/browse/HIVE-6947 Project: Hive Issue Type: Bug Components: Tests Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6947.patch Few more fixes for test cases on hadoop-2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-6947) More fixes for tests on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan reassigned HIVE-6947: -- Assignee: Ashutosh Chauhan More fixes for tests on hadoop-2 - Key: HIVE-6947 URL: https://issues.apache.org/jira/browse/HIVE-6947 Project: Hive Issue Type: Bug Components: Tests Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6947.patch Few more fixes for test cases on hadoop-2 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6946) Make it easier to run WebHCat e2e tests
[ https://issues.apache.org/jira/browse/HIVE-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6946: - Description: Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to set up WebHCat e2e tests but it's cumbersome and error prone. Need to make some improvements here. NO PRECOMMIT TESTS was:Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to set up WebHCat e2e tests but it's cumbersome and error prone. Need to make some improvements here. Make it easier to run WebHCat e2e tests --- Key: HIVE-6946 URL: https://issues.apache.org/jira/browse/HIVE-6946 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to set up WebHCat e2e tests but it's cumbersome and error prone. Need to make some improvements here. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6927) Add support for MSSQL in schematool
[ https://issues.apache.org/jira/browse/HIVE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6927: --- Status: Patch Available (was: Open) Add support for MSSQL in schematool --- Key: HIVE-6927 URL: https://issues.apache.org/jira/browse/HIVE-6927 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Attachments: HIVE-6927.patch Schematool is the preferred way of initializing schema for Hive. Since HIVE-6862 provided the script for MSSQL it would be nice to add the support for it in schematool. -- This message was sent by Atlassian JIRA (v6.2#6252)
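[Editor's sketch] Assuming the patch follows the pattern of the existing database types, usage would look roughly like the following; the mssql token, host, database name, and credentials are assumptions for illustration, not confirmed by the patch:

```shell
# Hypothetical invocation: initialize the Hive metastore schema on MSSQL
# via schematool, analogous to the existing mysql/postgres/oracle dbTypes.
schematool -dbType mssql -initSchema \
  -url "jdbc:sqlserver://dbhost:1433;databaseName=hivemetastore" \
  -userName hive -passWord hivepass
```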
[jira] [Updated] (HIVE-6932) hive README needs update
[ https://issues.apache.org/jira/browse/HIVE-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6932: - Assignee: Thejas M Nair hive README needs update Key: HIVE-6932 URL: https://issues.apache.org/jira/browse/HIVE-6932 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair It needs to be updated to include Tez as a runtime. Also, it talks about average latency being in minutes, which is very misleading. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6932) hive README needs update
[ https://issues.apache.org/jira/browse/HIVE-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6932: Description: It needs to be updated to include Tez as a runtime. Also, it talks about average latency being in minutes, which is very misleading. NO PRECOMMIT TESTS was: It needs to be updated to include Tez as a runtime. Also, it talks about average latency being in minutes, which is very misleading. hive README needs update Key: HIVE-6932 URL: https://issues.apache.org/jira/browse/HIVE-6932 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6932.1.patch It needs to be updated to include Tez as a runtime. Also, it talks about average latency being in minutes, which is very misleading. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6932) hive README needs update
[ https://issues.apache.org/jira/browse/HIVE-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6932: Attachment: HIVE-6932.1.patch hive README needs update Key: HIVE-6932 URL: https://issues.apache.org/jira/browse/HIVE-6932 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6932.1.patch It needs to be updated to include Tez as a runtime. Also, it talks about average latency being in minutes, which is very misleading. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6932) hive README needs update
[ https://issues.apache.org/jira/browse/HIVE-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6932: Status: Patch Available (was: Open) hive README needs update Key: HIVE-6932 URL: https://issues.apache.org/jira/browse/HIVE-6932 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6932.1.patch It needs to be updated to include Tez as a runtime. Also, it talks about average latency being in minutes, which is very misleading. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6939) TestExecDriver.testMapRedPlan3 fails on hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976271#comment-13976271 ] Ashutosh Chauhan commented on HIVE-6939: The description of the test says "test reduce with multiple tagged inputs", so I don't think it has any specific intention for # of reducers = 5. So # of reducers = 1 sounds good to me. +1 TestExecDriver.testMapRedPlan3 fails on hadoop-2 Key: HIVE-6939 URL: https://issues.apache.org/jira/browse/HIVE-6939 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6939.1.patch Passes on hadoop-1, but fails on hadoop-2. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6946) Make it easier to run WebHCat e2e tests
[ https://issues.apache.org/jira/browse/HIVE-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6946: - Status: Patch Available (was: Open) Make it easier to run WebHCat e2e tests --- Key: HIVE-6946 URL: https://issues.apache.org/jira/browse/HIVE-6946 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6946.patch Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to set up WebHCat e2e tests but it's cumbersome and error prone. Need to make some improvements here. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6946) Make it easier to run WebHCat e2e tests
[ https://issues.apache.org/jira/browse/HIVE-6946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6946: - Attachment: HIVE-6946.patch Make it easier to run WebHCat e2e tests --- Key: HIVE-6946 URL: https://issues.apache.org/jira/browse/HIVE-6946 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6946.patch Right now hcatalog/src/test/e2e/templeton/README.txt explains the steps to set up WebHCat e2e tests but it's cumbersome and error prone. Need to make some improvements here. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6469) skipTrash option in hive command line
[ https://issues.apache.org/jira/browse/HIVE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976297#comment-13976297 ] Jayesh commented on HIVE-6469: -- Xuefu, This is really a minor convenience feature which definitely has a use-case for our enterprise customers. Are you suggesting providing this feature via a hive configuration that works in the following way?
set hive.warehouse.data.skipTrash = true -- explicitly set
drop table large10TBTable -- this will skip trash
drop table anyOtherTable -- this will skip trash
set hive.warehouse.data.skipTrash = false -- if you forget this, it will skip trash forever, until corrected
drop table regularTable -- this will start placing data in trash
I believe that approach is not very intuitive and will lead to human error that creates disaster if the necessary steps are not done, which ultimately violates hive's feature of providing trash as a backup. Also, different environments with different HS2 instances may not be the scenario here. This has proven to be very helpful in the same environment with different users. Also, I don't think this pollutes the SQL syntax; think of this as the PURGE option in Oracle DB, and hence I totally see it being used by enterprise customers. http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9003.htm Did you get a chance to look at the links I put earlier, where people seem to be searching for this little convenience feature? Also, did you get a chance to talk to any customers who would like such a feature? Please let us know. Thanks Jayesh skipTrash option in hive command line - Key: HIVE-6469 URL: https://issues.apache.org/jira/browse/HIVE-6469 Project: Hive Issue Type: New Feature Components: CLI Affects Versions: 0.12.0 Reporter: Jayesh Fix For: 0.12.1 Attachments: HIVE-6469.patch The hive drop table command deletes the data from the HDFS warehouse and puts it into Trash. Currently there is no way to provide a flag to tell the warehouse to skip trash while deleting table data.
This ticket is to add a skipTrash feature to the hive command line, that looks as follows: hive -e "drop table skipTrash testTable" This would be a good feature to add, so that users can specify when not to put data into the trash directory and thus not fill HDFS space, instead of relying on trash interval and policy configuration to take care of the disk-filling issue. -- This message was sent by Atlassian JIRA (v6.2#6252)
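[Editor's sketch] For context, the HDFS shell already offers a per-invocation skipTrash flag, so the manual workaround today is a two-step drop; a sketch assuming the default warehouse location (paths and table name are illustrative):

```shell
# Today's workaround: drop the table metadata, then remove the data
# directly, bypassing the trash.
hive -e "drop table testTable"
hadoop fs -rm -r -skipTrash /user/hive/warehouse/testtable
```

The proposed flag would collapse these two steps into one drop table invocation.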
Hive Contributor
Dear Hive PMC, I would like to contribute to the HIVE community. Could you please grant me the contributor role? My apache username is ngangam. Thank you in advance and I am looking forward to becoming a part of the Hive community. -- Thanks, Naveen :)
[jira] [Created] (HIVE-6948) HiveServer2 doesn't respect HIVE_AUX_JARS_PATH
Peng Zhang created HIVE-6948: Summary: HiveServer2 doesn't respect HIVE_AUX_JARS_PATH Key: HIVE-6948 URL: https://issues.apache.org/jira/browse/HIVE-6948 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Peng Zhang Fix For: 0.13.0 HiveServer2 ignores HIVE_AUX_JARS_PATH. This will cause aux jars not distributed to Yarn cluster, and job will fail without dependent jars. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6948) HiveServer2 doesn't respect HIVE_AUX_JARS_PATH
[ https://issues.apache.org/jira/browse/HIVE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peng Zhang updated HIVE-6948: - Attachment: HIVE-6948.patch HiveServer2 doesn't respect HIVE_AUX_JARS_PATH -- Key: HIVE-6948 URL: https://issues.apache.org/jira/browse/HIVE-6948 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Peng Zhang Fix For: 0.13.0 Attachments: HIVE-6948.patch HiveServer2 ignores HIVE_AUX_JARS_PATH. This will cause aux jars not distributed to Yarn cluster, and job will fail without dependent jars. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6948) HiveServer2 doesn't respect HIVE_AUX_JARS_PATH
[ https://issues.apache.org/jira/browse/HIVE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peng Zhang updated HIVE-6948: - Status: Patch Available (was: Open) HiveServer2 doesn't respect HIVE_AUX_JARS_PATH -- Key: HIVE-6948 URL: https://issues.apache.org/jira/browse/HIVE-6948 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Peng Zhang Fix For: 0.13.0 Attachments: HIVE-6948.patch HiveServer2 ignores HIVE_AUX_JARS_PATH. This will cause aux jars not distributed to Yarn cluster, and job will fail without dependent jars. -- This message was sent by Atlassian JIRA (v6.2#6252)
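[Editor's sketch] For clarity, this is the usage the report expects to work; the jar path is a placeholder, and the point of the bug is that the second step currently ignores the first:

```shell
# Expected behavior: jars listed in HIVE_AUX_JARS_PATH should be shipped
# to the YARN cluster for jobs submitted through HiveServer2.
export HIVE_AUX_JARS_PATH=/opt/hive/auxlib/my-serde.jar
hive --service hiveserver2
```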
[jira] [Updated] (HIVE-6941) [WebHCat] Implement webhcat endpoint version/sqoop
[ https://issues.apache.org/jira/browse/HIVE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-6941: - Description: Since WebHCat support invoking Sqoop job (introduced in HIVE-5072), it should also expose endpoint version/sqoop to return the version of Sqoop used. In HIVE-5072, the endpoint version/sqoop is exposed return NOT_IMPLEMENTED_501. (was: Since WebHCat support invoking Sqoop job (introduced in HIVE-5072), it should also expose endpoint version/sqoop to return the version of Sqoop used.) [WebHCat] Implement webhcat endpoint version/sqoop Key: HIVE-6941 URL: https://issues.apache.org/jira/browse/HIVE-6941 Project: Hive Issue Type: Improvement Components: WebHCat Reporter: Shuaishuai Nie Since WebHCat support invoking Sqoop job (introduced in HIVE-5072), it should also expose endpoint version/sqoop to return the version of Sqoop used. In HIVE-5072, the endpoint version/sqoop is exposed return NOT_IMPLEMENTED_501. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6941) [WebHCat] Complete implementation of webhcat endpoint version/sqoop
[ https://issues.apache.org/jira/browse/HIVE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-6941: - Summary: [WebHCat] Complete implementation of webhcat endpoint version/sqoop (was: [WebHCat] Implement webhcat endpoint version/sqoop) [WebHCat] Complete implementation of webhcat endpoint version/sqoop - Key: HIVE-6941 URL: https://issues.apache.org/jira/browse/HIVE-6941 Project: Hive Issue Type: Improvement Components: WebHCat Reporter: Shuaishuai Nie Since WebHCat support invoking Sqoop job (introduced in HIVE-5072), it should also expose endpoint version/sqoop to return the version of Sqoop used. In HIVE-5072, the endpoint version/sqoop is exposed return NOT_IMPLEMENTED_501. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6941) [WebHCat] Complete implementation of webhcat endpoint version/sqoop
[ https://issues.apache.org/jira/browse/HIVE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-6941: - Description: Since WebHCat support invoking Sqoop job (introduced in HIVE-5072), it should also expose endpoint version/sqoop to return the version of Sqoop interactive. In HIVE-5072, the endpoint version/sqoop is exposed return NOT_IMPLEMENTED_501. The reason is we cannot simply do the same as the endpoint version/hive or version/hadoop since WebHCat does not have dependency with Sqoop. Currently Sqoop 1 support getting the version using command sqoop version. WebHCat can invoke this command using templeton/v1/sqoop endpoint but this is not interactive. (was: Since WebHCat support invoking Sqoop job (introduced in HIVE-5072), it should also expose endpoint version/sqoop to return the version of Sqoop used. In HIVE-5072, the endpoint version/sqoop is exposed return NOT_IMPLEMENTED_501.) [WebHCat] Complete implementation of webhcat endpoint version/sqoop - Key: HIVE-6941 URL: https://issues.apache.org/jira/browse/HIVE-6941 Project: Hive Issue Type: Improvement Components: WebHCat Reporter: Shuaishuai Nie Since WebHCat support invoking Sqoop job (introduced in HIVE-5072), it should also expose endpoint version/sqoop to return the version of Sqoop interactive. In HIVE-5072, the endpoint version/sqoop is exposed return NOT_IMPLEMENTED_501. The reason is we cannot simply do the same as the endpoint version/hive or version/hadoop since WebHCat does not have dependency with Sqoop. Currently Sqoop 1 support getting the version using command sqoop version. WebHCat can invoke this command using templeton/v1/sqoop endpoint but this is not interactive. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action
[ https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-6940: - Description: WebHCat documentation needs to be updated for the new feature introduced in HIVE-5072. Here are some examples using the endpoint templeton/v1/sqoop.

Example 1 (passing the Sqoop command directly):
curl -s -d command=import --connect jdbc:sqlserver://localhost:4033;databaseName=SqoopDB;user=hadoop;password=password --table mytable --target-dir user/hadoop/importtable -d statusdir=sqoop.output 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

Example 2 (passing a source file which contains the Sqoop command):
curl -s -d optionsfile=/sqoopcommand/command0.txt -d statusdir=sqoop.output 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

Example 3 (using --options-file in the middle of the Sqoop command to enable reuse of part of the command, such as the connection string):
curl -s -d files=/sqoopcommand/command1.txt,/sqoopcommand/command2.txt -d command=import --options-file command1.txt --options-file command2.txt -d statusdir=sqoop.output 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop'

Also, to pass their JDBC driver jar, users can use the -libjars generic option in the Sqoop command; this functionality is provided by Sqoop.

The following parameters can be passed to the endpoint:
command (the Sqoop command string to run)
optionsfile (an options file containing the Sqoop command to run; each space-separated section of the Sqoop command should be a single line in the options file)
files (comma-separated files to be copied to the MapReduce cluster)
statusdir (a directory where WebHCat will write the status of the Sqoop job; if provided, it is the caller's responsibility to remove this directory when done)
callback (defines a URL to be called upon job completion; you may embed a specific job ID into the URL using $jobId, which will be replaced in the callback URL with the job's job ID)

was: WebHCat documentation need to be updated based on the new feature introduced in HIVE-5072

[WebHCat]Update documentation for Templeton-Sqoop action Key: HIVE-6940 URL: https://issues.apache.org/jira/browse/HIVE-6940 Project: Hive Issue Type: Bug Components: Documentation, WebHCat Affects Versions: 0.14.0 Reporter: Shuaishuai Nie -- This message was sent by Atlassian JIRA (v6.2#6252)
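The curl examples above can also be expressed as a small Python helper that builds the same request. The endpoint URL and the command/statusdir/user.name parameter names come from the examples in this issue; actually submitting the job requires a running WebHCat server, so this sketch only assembles the URL and form body.

```python
from urllib.parse import urlencode

def build_sqoop_request(command, statusdir, user="hadoop",
                        host="localhost:50111"):
    """Build the URL and POST body for templeton/v1/sqoop (example 1 above)."""
    url = "http://%s/templeton/v1/sqoop?user.name=%s" % (host, user)
    body = urlencode({"command": command, "statusdir": statusdir})
    return url, body

url, body = build_sqoop_request(
    "import --connect jdbc:sqlserver://localhost:4033;databaseName=SqoopDB"
    " --table mytable --target-dir user/hadoop/importtable",
    "sqoop.output")
# To actually submit against a live server:
#   urllib.request.urlopen(url, data=body.encode())
```

The same helper covers example 2 by sending an optionsfile parameter instead of command in the dict passed to urlencode.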
[jira] [Updated] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action
[ https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-6940: - Description: WebHCat documentation needs to be updated for the new feature introduced in HIVE-5072 (examples and endpoint parameters as in the previous update), with the following additions:
enablelog (when set to true, WebHCat will upload the job log to statusdir; statusdir must be defined when this is enabled)
All of the above parameters are optional, but users must provide either command or optionsfile.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432
[ https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976362#comment-13976362 ] Hive QA commented on HIVE-6944: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12641125/HIVE-6944.patch {color:red}ERROR:{color} -1 due to 48 failed/errored test(s), 5417 tests executed *Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_21
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dynamic_partitions_with_whitelist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_partialscan_autogether
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testListPartitions
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testNameMethods
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testPartition
org.apache.hadoop.hive.ql.exec.TestExecDriver.testMapRedPlan3
org.apache.hive.hcatalog.mapreduce.TestHCatMultiOutputFormat.org.apache.hive.hcatalog.mapreduce.TestHCatMultiOutputFormat
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/1/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/1/console Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 48 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12641125 WebHCat e2e tests broken by HIVE-6432 - Key: HIVE-6944 URL: https://issues.apache.org/jira/browse/HIVE-6944 Project: Hive Issue Type: Bug Components: WebHCat Affects
[jira] [Commented] (HIVE-6469) skipTrash option in hive command line
[ https://issues.apache.org/jira/browse/HIVE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976374#comment-13976374 ] Xuefu Zhang commented on HIVE-6469: ---
{quote}
set hive.warehouse.data.skipTrash = true – explicitly set
drop table large10TBTable – this will skip trash
drop table anyOtherTable – this will skip trash
set hive.warehouse.data.skipTrash = false – if you forget this, it will skipTrash forever, until corrected.
drop table regularTable – this will start placing data in trash
{quote}
Actually I mean hive.warehouse.data.skipTrash to be an admin property that normal user will be able to set. Thus, the server will have this either on or off. I expect a prod server will have this on while a dev server will have this off. Isn't this good enough? Setting this on/off based on prod/dev seems more reasonable than basing it on table size. If you are in a dev environment, you just disable the feature, and why do you care whether the table is big or small? In a prod environment, on the other hand, every table is important, so the feature should always be on. Anything else I'm missing here? skipTrash option in hive command line - Key: HIVE-6469 URL: https://issues.apache.org/jira/browse/HIVE-6469 Project: Hive Issue Type: New Feature Components: CLI Affects Versions: 0.12.0 Reporter: Jayesh Fix For: 0.12.1 Attachments: HIVE-6469.patch The hive drop table command deletes the data from the HDFS warehouse and puts it into Trash. Currently there is no way to provide a flag to tell the warehouse to skip trash while deleting table data. This ticket is to add a skipTrash feature to the Hive command line that looks as follows: hive -e "drop table skipTrash testTable" This would be a good feature to add, so that users can specify when not to put data into the trash directory, and thus not fill HDFS space, instead of relying on trash interval and policy configuration to take care of disk-filling issues. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6469) skipTrash option in hive command line
[ https://issues.apache.org/jira/browse/HIVE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976376#comment-13976376 ] Xuefu Zhang commented on HIVE-6469: --- {quote} that normal user will be able to set. {quote} I meant to say that normal users will NOT be able to set it. skipTrash option in hive command line - Key: HIVE-6469 URL: https://issues.apache.org/jira/browse/HIVE-6469 -- This message was sent by Atlassian JIRA (v6.2#6252)
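The thread above debates two shapes for this feature: a per-statement skipTrash token and a server-level hive.warehouse.data.skipTrash property. As a minimal sketch of the per-statement variant, a wrapper script could assemble the proposed hive -e invocation like this; note the "drop table skipTrash" syntax is only a proposal in HIVE-6469, not an existing Hive feature.

```python
def drop_table_cmd(table, skip_trash=False):
    """Build the argv for the hive CLI using the skipTrash syntax
    *proposed* in HIVE-6469; it does not exist in released Hive."""
    stmt = "drop table %s%s" % ("skipTrash " if skip_trash else "", table)
    return ["hive", "-e", stmt]

# On a real cluster one would run, e.g.:
#   subprocess.run(drop_table_cmd("testTable", skip_trash=True))
```

Building the argv as a list (rather than a single shell string) avoids quoting problems when table names contain special characters.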
Re: Hive Contributor
Welcome aboard, Naveen! I have added you as a contributor to the project. Looking forward to your contributions to Hive. Ashutosh On Mon, Apr 21, 2014 at 7:18 PM, Naveen Gangam ngan...@cloudera.com wrote: Dear Hive PMC, I would like to contribute to the Hive community. Could you please grant me the contributor role? My Apache username is ngangam. Thank you in advance; I am looking forward to becoming a part of the Hive community. -- Thanks, Naveen :)
[jira] [Updated] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432
[ https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6944: - Description: HIVE-6432 removed templeton/v/queue REST endpoint and broke webhcat e2e tests NO PRECOMMIT TESTS was: HIVE-6432 removed templeton/vq/queue REST endpoint and broke webhcat e2e tests NO PRECOMMIT TESTS WebHCat e2e tests broken by HIVE-6432 - Key: HIVE-6944 URL: https://issues.apache.org/jira/browse/HIVE-6944 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6944.patch HIVE-6432 removed templeton/v/queue REST endpoint and broke webhcat e2e tests NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6944) WebHCat e2e tests broken by HIVE-6432
[ https://issues.apache.org/jira/browse/HIVE-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976384#comment-13976384 ] Eugene Koifman commented on HIVE-6944: -- In spite of NO PRECOMMIT TESTS, it still ran the tests. In any case, this is a WebHCat-only change, so these test failures are not related. WebHCat e2e tests broken by HIVE-6432 - Key: HIVE-6944 URL: https://issues.apache.org/jira/browse/HIVE-6944 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6944.patch HIVE-6432 removed the templeton/v/queue REST endpoint and broke the WebHCat e2e tests NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6941) [WebHCat] Complete implementation of webhcat endpoint version/sqoop
[ https://issues.apache.org/jira/browse/HIVE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6941: - Affects Version/s: 0.14.0 [WebHCat] Complete implementation of webhcat endpoint version/sqoop - Key: HIVE-6941 URL: https://issues.apache.org/jira/browse/HIVE-6941 Project: Hive Issue Type: Improvement Components: WebHCat Affects Versions: 0.14.0 Reporter: Shuaishuai Nie Since WebHCat supports invoking Sqoop jobs (introduced in HIVE-5072), it should also expose the endpoint version/sqoop to return the version of Sqoop interactively. In HIVE-5072, the endpoint version/sqoop is exposed but returns NOT_IMPLEMENTED_501. The reason is that we cannot simply do the same as for the endpoints version/hive or version/hadoop, since WebHCat does not have a dependency on Sqoop. Currently Sqoop 1 supports getting the version using the command "sqoop version". WebHCat can invoke this command using the templeton/v1/sqoop endpoint, but this is not interactive. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton
[ https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976390#comment-13976390 ] Eugene Koifman commented on HIVE-5072: -- +1 (non binding) [WebHCat]Enable directly invoke Sqoop job through Templeton --- Key: HIVE-5072 URL: https://issues.apache.org/jira/browse/HIVE-5072 Project: Hive Issue Type: Improvement Components: WebHCat Affects Versions: 0.12.0 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, HIVE-5072.3.patch, HIVE-5072.4.patch, HIVE-5072.5.patch, Templeton-Sqoop-Action.pdf Currently it is hard to invoke a Sqoop job through Templeton; the only way is to use the classpath jar generated by a Sqoop job and use the jar delegator in Templeton. We should implement a Sqoop delegator to enable directly invoking Sqoop jobs through Templeton. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6924) MapJoinKeyBytes::hashCode() should use Murmur hash
[ https://issues.apache.org/jira/browse/HIVE-6924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976406#comment-13976406 ] Remus Rusanu commented on HIVE-6924: [~jnp] I created HIVE-6949 for the vectorized hash. MapJoinKeyBytes::hashCode() should use Murmur hash -- Key: HIVE-6924 URL: https://issues.apache.org/jira/browse/HIVE-6924 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6924.patch The existing hashCode is bad; it causes HashMap entries to cluster. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6949) VectorHashKeyWrapper hashCode() should use Murmur hash
Remus Rusanu created HIVE-6949: -- Summary: VectorHashKeyWrapper hashCode() should use Murmur hash Key: HIVE-6949 URL: https://issues.apache.org/jira/browse/HIVE-6949 Project: Hive Issue Type: Improvement Reporter: Remus Rusanu Assignee: Remus Rusanu HIVE-6924 replaced the hash of MapJoinKeyBytes with MurmurHash algorithm. Vectorized hash should do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
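For context on why a Murmur-style hash helps in HIVE-6924 and HIVE-6949: a weak hashCode leaves nearby keys in nearby HashMap buckets, causing the clustering described above. The sketch below is the 32-bit MurmurHash3 finalizer (constants from the public MurmurHash3 reference code), shown in Python for illustration; it is not Hive's actual implementation.

```python
def fmix32(h):
    """32-bit MurmurHash3 finalizer: xorshift/multiply avalanche steps.
    Each step is invertible, so the whole function is a bijection on
    32-bit values while scrambling the bit pattern thoroughly."""
    h &= 0xFFFFFFFF
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h

# With a naive hash, keys 0..7 would land in buckets 0..7 of a 16-bucket
# HashMap; after mixing, they scatter across the bucket range.
buckets = [fmix32(i) % 16 for i in range(8)]
```

Because every step is invertible, distinct keys always map to distinct 32-bit hashes, and the multiply/xorshift sequence spreads low-order differences into the high bits that bucket masking uses.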
[jira] [Commented] (HIVE-5771) Constant propagation optimizer for Hive
[ https://issues.apache.org/jira/browse/HIVE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13976410#comment-13976410 ] Ted Xu commented on HIVE-5771: -- Hi [~ashutoshc], thanks for the patch, I will look into this. Constant propagation optimizer for Hive --- Key: HIVE-5771 URL: https://issues.apache.org/jira/browse/HIVE-5771 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Ted Xu Assignee: Ted Xu Attachments: HIVE-5771.1.patch, HIVE-5771.2.patch, HIVE-5771.3.patch, HIVE-5771.4.patch, HIVE-5771.5.patch, HIVE-5771.6.patch, HIVE-5771.7.patch, HIVE-5771.8.patch, HIVE-5771.patch Currently there is no constant folding/propagation optimizer; all expressions are evaluated at runtime. HIVE-2470 did a great job of evaluating constants in the UDF initialization phase; however, it is still a runtime evaluation, and it doesn't propagate constants from a subquery to the outside. Introducing such an optimizer may reduce I/O and accelerate processing. -- This message was sent by Atlassian JIRA (v6.2#6252)
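Constant folding of the kind proposed in HIVE-5771 evaluates constant subexpressions once at compile time instead of per row. A toy sketch over a tuple-based expression tree (purely illustrative, not Hive's optimizer or its expression classes) shows the core recursion:

```python
import operator

# Expressions are tuples: ("+", left, right), ("*", left, right),
# ("col", name) for a column reference, or a bare number for a literal.
OPS = {"+": operator.add, "*": operator.mul}

def fold(expr):
    """Recursively replace constant subtrees with their computed value."""
    if not isinstance(expr, tuple):
        return expr                 # literal constant: already folded
    if expr[0] == "col":
        return expr                 # column value unknown until runtime
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)  # both sides constant: evaluate now
    return (op, left, right)         # keep the node, with folded children
```

For example, fold(("+", ("*", 2, 3), ("col", "x"))) collapses the constant product to 6 while leaving the column reference untouched, which is exactly the work the proposed optimizer would shift from per-row runtime evaluation to planning time.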