[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770480#comment-13770480 ] Jitendra Nath Pandey commented on HIVE-4642: [~teddy.choi] Thanks for taking care of the serialization part. I think the patch doesn't apply because of the recent changes to the branch. Please rebase the patch again. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, HIVE-4642.6.patch.txt, Hive-Vectorized-Query-Execution-Design-rev10.docx See title. I will add more details next week. The goal is to (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4945) Make RLIKE/REGEXP run end-to-end by updating VectorizationContext
[ https://issues.apache.org/jira/browse/HIVE-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770482#comment-13770482 ] Jitendra Nath Pandey commented on HIVE-4945: [~teddy.choi] I have assigned the jira to you. This patch will need to be updated in light of the recent changes to the vectorization branch, particularly HIVE-4959. Please rebase the patch. Make RLIKE/REGEXP run end-to-end by updating VectorizationContext - Key: HIVE-4945 URL: https://issues.apache.org/jira/browse/HIVE-4945 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4945.1.patch.txt
[jira] [Created] (HIVE-5308) The code generation should be part of the build process.
Jitendra Nath Pandey created HIVE-5308: -- Summary: The code generation should be part of the build process. Key: HIVE-5308 URL: https://issues.apache.org/jira/browse/HIVE-5308 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey We have committed lots of generated source code. Instead, we should generate this code as part of the build.
[jira] [Updated] (HIVE-5308) The code generation should be part of the build process.
[ https://issues.apache.org/jira/browse/HIVE-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5308: --- Attachment: HIVE-5308.1.vectorization.patch The code generation should be part of the build process. Key: HIVE-5308 URL: https://issues.apache.org/jira/browse/HIVE-5308 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5308.1.vectorization.patch We have committed lots of generated source code. Instead, we should generate this code as part of the build.
[jira] [Updated] (HIVE-5308) The code generation should be part of the build process.
[ https://issues.apache.org/jira/browse/HIVE-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5308: --- Status: Patch Available (was: Open) The code generation should be part of the build process. Key: HIVE-5308 URL: https://issues.apache.org/jira/browse/HIVE-5308 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5308.1.vectorization.patch We have committed lots of generated source code. Instead, we should generate this code as part of the build.
[jira] [Commented] (HIVE-5285) Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors.
[ https://issues.apache.org/jira/browse/HIVE-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770493#comment-13770493 ] Hudson commented on HIVE-5285: -- FAILURE: Integrated in Hive-trunk-h0.21 #2338 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2338/]) HIVE-5285 : Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors. (Hari Sankar via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524067) * /hive/trunk/ql/src/test/org/apache/hadoop/hive/serde2/CustomSerDe3.java * /hive/trunk/ql/src/test/queries/clientpositive/partition_wise_fileformat17.q * /hive/trunk/ql/src/test/results/clientpositive/partition_wise_fileformat17.q.out * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java Custom SerDes throw cast exception when there are complex nested structures containing NonSettableObjectInspectors. --- Key: HIVE-5285 URL: https://issues.apache.org/jira/browse/HIVE-5285 Project: Hive Issue Type: Bug Affects Versions: 0.11.1 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical Fix For: 0.13.0 Attachments: HIVE-5285.1.patch.txt, HIVE-5285.2.patch.txt The approach for the HIVE-5199 fix is correct. However, the fix for HIVE-5199 is incomplete.
Consider a complex nested structure containing the following object inspector hierarchy: SettableStructObjectInspector { ListObjectInspector<NonSettableStructObjectInspector> } In the above case, the cast exception can happen via MapOperator/FetchOperator as below:
java.io.IOException: java.lang.ClassCastException: com.skype.data.hadoop.hive.proto.CustomObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.SettableMapObjectInspector
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:545)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:489)
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136)
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1412)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:271)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.ClassCastException: com.skype.data.whaleshark.hadoop.hive.proto.ProtoMapObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.SettableMapObjectInspector
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:144)
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.init(ObjectInspectorConverters.java:294)
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:138)
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$ListConverter.convert(ObjectInspectorConverters.java:251)
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:316)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:529)
	... 13 more
[jira] [Created] (HIVE-5309) Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java
Jitendra Nath Pandey created HIVE-5309: -- Summary: Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java Key: HIVE-5309 URL: https://issues.apache.org/jira/browse/HIVE-5309 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey This jira provides fixes for some of the review comments on HIVE-5283.
[jira] [Commented] (HIVE-5246) Local task for map join submitted via oozie job fails on a secure HDFS
[ https://issues.apache.org/jira/browse/HIVE-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770494#comment-13770494 ] Hudson commented on HIVE-5246: -- FAILURE: Integrated in Hive-trunk-h0.21 #2338 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2338/]) HIVE-5246 - Local task for map join submitted via oozie job fails on a secure HDFS (Prasad Mujumdar via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524074) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SecureCmdDoAs.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java Local task for map join submitted via oozie job fails on a secure HDFS --- Key: HIVE-5246 URL: https://issues.apache.org/jira/browse/HIVE-5246 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-5246.1.patch, HIVE-5246-test.tar For a Hive query started by an Oozie Hive action, the local task submitted for a map join fails. The HDFS delegation token is not shared properly with the child JVM created for the local task. Oozie creates a delegation token for the Hive action and sets the env variable HADOOP_TOKEN_FILE_LOCATION as well as the mapreduce.job.credentials.binary config property. However, this doesn't get passed down to the child JVM, which causes the problem. This is similar to the issue addressed by HIVE-4343, which addresses the problem for HiveServer2.
[jira] [Commented] (HIVE-5292) Join on decimal columns fails to return rows
[ https://issues.apache.org/jira/browse/HIVE-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770495#comment-13770495 ] Hudson commented on HIVE-5292: -- FAILURE: Integrated in Hive-trunk-h0.21 #2338 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2338/]) HIVE-5292 : Join on decimal columns fails to return rows (Navis via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524062) * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java * /hive/trunk/ql/src/test/queries/clientpositive/decimal_join.q * /hive/trunk/ql/src/test/results/clientpositive/decimal_join.q.out Join on decimal columns fails to return rows Key: HIVE-5292 URL: https://issues.apache.org/jira/browse/HIVE-5292 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.11.0 Environment: Linux lnxx64r5 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux Reporter: Sergio Lob Assignee: Navis Fix For: 0.12.0 Attachments: D12969.1.patch Join on matching decimal columns returns 0 rows. To reproduce (I used beeline): 1. Create 2 simple identical tables with 2 identical rows: CREATE TABLE SERGDEC(I INT, D DECIMAL) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'; CREATE TABLE SERGDEC2(I INT, D DECIMAL) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'; 2. Populate the tables with identical data: LOAD DATA LOCAL INPATH './decdata' OVERWRITE INTO TABLE SERGDEC ; LOAD DATA LOCAL INPATH './decdata' OVERWRITE INTO TABLE SERGDEC2 ; 3. Data file decdata contains: 10|.98 20|1234567890.1234 4. Perform the join (returns 0 rows instead of 2): SELECT T1.I, T1.D, T2.D FROM SERGDEC T1 JOIN SERGDEC2 T2 ON T1.D = T2.D ;
[jira] [Updated] (HIVE-5306) Use new GenericUDF instead of basic UDF for UDFAbs class
[ https://issues.apache.org/jira/browse/HIVE-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated HIVE-5306: Attachment: HIVE-5306.1.patch WIP Use new GenericUDF instead of basic UDF for UDFAbs class Key: HIVE-5306 URL: https://issues.apache.org/jira/browse/HIVE-5306 Project: Hive Issue Type: Improvement Components: UDF Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5306.1.patch GenericUDF is the latest and recommended base class for any UDF. This JIRA is to change the current UDFAbs class to extend GenericUDF. The general benefit of GenericUDF is described in the code comments as: "The GenericUDFs are superior to normal UDFs in the following ways: 1. It can accept arguments of complex types, and return complex types. 2. It can accept variable length of arguments. 3. It can accept an infinite number of function signatures - for example, it's easy to write a GenericUDF that accepts array<int>, array<array<int>> and so on (arbitrary levels of nesting). 4. It can do short-circuit evaluations using DeferredObject."
[jira] [Updated] (HIVE-5306) Use new GenericUDF instead of basic UDF for UDFAbs class
[ https://issues.apache.org/jira/browse/HIVE-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated HIVE-5306: Status: Patch Available (was: Open) Use new GenericUDF instead of basic UDF for UDFAbs class Key: HIVE-5306 URL: https://issues.apache.org/jira/browse/HIVE-5306 Project: Hive Issue Type: Improvement Components: UDF Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5306.1.patch GenericUDF is the latest and recommended base class for any UDF. This JIRA is to change the current UDFAbs class to extend GenericUDF. The general benefit of GenericUDF is described in the code comments as: "The GenericUDFs are superior to normal UDFs in the following ways: 1. It can accept arguments of complex types, and return complex types. 2. It can accept variable length of arguments. 3. It can accept an infinite number of function signatures - for example, it's easy to write a GenericUDF that accepts array<int>, array<array<int>> and so on (arbitrary levels of nesting). 4. It can do short-circuit evaluations using DeferredObject."
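The short-circuit-evaluation benefit quoted above can be illustrated with a small structural sketch. Note that the interfaces below are deliberately simplified stand-ins for Hive's real GenericUDF/DeferredObject API (which also involves ObjectInspectors and an initialize() phase); this is only a sketch of the deferred-argument pattern, not the HIVE-5306 patch itself.

```java
// Structural sketch of the GenericUDF evaluation pattern (simplified,
// NOT the real org.apache.hadoop.hive.ql.udf.generic API).
public class GenericAbsSketch {

    // Stand-in for Hive's DeferredObject: the argument is not evaluated
    // until get() is called, which is what enables short-circuiting.
    public interface DeferredObject {
        Object get();
    }

    // An abs()-style evaluate() that accepts more than one input type,
    // mirroring the "complex/variable argument types" benefit above.
    public static Object evaluate(DeferredObject arg) {
        Object v = arg.get();
        if (v == null) {
            return null; // SQL NULL propagates without further work
        }
        if (v instanceof Integer) {
            return Math.abs((Integer) v);
        }
        if (v instanceof Double) {
            return Math.abs((Double) v);
        }
        throw new IllegalArgumentException(
            "abs: unsupported argument type " + v.getClass().getName());
    }
}
```

A plain UDF, by contrast, fixes its argument and return types in the Java method signature and always receives fully materialized values.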
[jira] [Work started] (HIVE-5202) Support for SettableUnionObjectInspector and implement isSettable/hasAllFieldsSettable APIs for all data types.
[ https://issues.apache.org/jira/browse/HIVE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-5202 started by Hari Sankar Sivarama Subramaniyan. Support for SettableUnionObjectInspector and implement isSettable/hasAllFieldsSettable APIs for all data types. --- Key: HIVE-5202 URL: https://issues.apache.org/jira/browse/HIVE-5202 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan These 3 tasks should be accomplished as part of this jira: 1. The current implementation lacks a settable union object inspector. We can run into an exception inside ObjectInspectorConverters.getConvertedOI() if there is a union. 2. Implement the following public functions for all data types: isSettable() - perform a shallow check to see if an object inspector is inherited from a settable OI type, and hasAllFieldsSettable() - perform a deep check to see if this object inspector and all the underlying object inspectors are inherited from settable OI types. 3. ObjectInspectorConverters.getConvertedOI() is inefficient. Once (1) and (2) are implemented, add the following check: outputOI.hasAllFieldsSettable() should return outputOI immediately if the object is entirely settable, in order to prevent redundant object instantiation.
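The shallow-versus-deep distinction in item 2 can be sketched as a simple recursion over nested inspectors. The ObjectInspector interface here is a hypothetical simplification (the real Hive hierarchy has separate struct/list/map/union inspector types rather than a generic children() method); it only illustrates the check's shape.

```java
// Sketch of the proposed isSettable (shallow) vs. hasAllFieldsSettable
// (deep) checks, over a hypothetical simplified inspector interface.
public class SettableCheckSketch {

    public interface ObjectInspector {
        boolean isSettable();                          // shallow check
        java.util.List<ObjectInspector> children();    // nested inspectors
    }

    // Deep check: this inspector AND every nested inspector must be
    // settable; one non-settable leaf fails the whole structure.
    public static boolean hasAllFieldsSettable(ObjectInspector oi) {
        if (!oi.isSettable()) {
            return false;
        }
        for (ObjectInspector child : oi.children()) {
            if (!hasAllFieldsSettable(child)) {
                return false;
            }
        }
        return true;
    }

    // Minimal concrete node for illustration/testing.
    public static ObjectInspector node(final boolean settable,
                                       ObjectInspector... kids) {
        final java.util.List<ObjectInspector> c = java.util.Arrays.asList(kids);
        return new ObjectInspector() {
            public boolean isSettable() { return settable; }
            public java.util.List<ObjectInspector> children() { return c; }
        };
    }
}
```

This is exactly the situation in the HIVE-5285 stack trace above: a settable outer struct whose nested inspector is not settable passes a shallow check but must fail the deep one.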
[jira] [Commented] (HIVE-5207) Support data encryption for Hive tables
[ https://issues.apache.org/jira/browse/HIVE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770520#comment-13770520 ] Jerry Chen commented on HIVE-5207: -- HIVE-4227 is specifically for adding column-level encryption for ORC files. As we all know, Hive tables support various other formats such as text file, sequence file, RC file and Avro file. HIVE-5207 targets these file formats as a common problem for supporting encryption. As to key management, from the user's perspective one rational approach is one table key per encrypted table. In this concept, it is natural to associate the key with TblProperties. Support data encryption for Hive tables --- Key: HIVE-5207 URL: https://issues.apache.org/jira/browse/HIVE-5207 Project: Hive Issue Type: New Feature Affects Versions: 0.12.0 Reporter: Jerry Chen Labels: Rhino Original Estimate: 504h Remaining Estimate: 504h For sensitive and legally protected data such as personal information, it is a common practice that the data is stored encrypted in the file system. Enabling Hive to store and query encrypted data is crucial for Hive data analysis in the enterprise. When creating a table, the user can specify whether it is an encrypted table via a property in TBLPROPERTIES. Once an encrypted table is created, querying it is transparent as long as the corresponding key management facilities are set in the running environment of the query. We can use the hadoop crypto support provided by HADOOP-9331 for underlying data encryption and decryption. As to key management, we would support several common key management use cases. First, the table key (data key) can be stored in the Hive metastore associated with the table in properties. The table key can be explicitly specified or auto-generated, and will be encrypted with a master key.
There are cases where the data being processed is generated by other applications, so we need to support externally managed or imported table keys. Also, the data generated by Hive may be consumed by other applications in the system, so we need a tool or command for exporting the table key to a Java keystore for external use. To handle versions of Hadoop that do not have crypto support, we can avoid compilation problems by segregating crypto API usage into separate files (shims) to be included only if a flag is defined on the Ant command line (something like -Dcrypto=true).
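The proposed TBLPROPERTIES-based usage might look like the following sketch. The property names ('hive.encrypt', 'hive.encrypt.keyname') are purely illustrative assumptions; HIVE-5207 does not fix the property names, and no such interface was committed.

```sql
-- Hypothetical sketch of the proposal: mark a table as encrypted at
-- creation time via TBLPROPERTIES. Property names are illustrative only.
CREATE TABLE customer_pii (id INT, ssn STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
TBLPROPERTIES ('hive.encrypt' = 'true',
               'hive.encrypt.keyname' = 'customer_pii_key');

-- Per the description, subsequent queries would be transparent as long
-- as the key management facilities are available to the session:
SELECT COUNT(*) FROM customer_pii;
```

The design point here is that encryption is a per-table attribute stored in the metastore, so readers and writers can resolve the table key without query-level syntax changes.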
[jira] [Updated] (HIVE-5202) Support for SettableUnionObjectInspector and implement isSettable/hasAllFieldsSettable APIs for all data types.
[ https://issues.apache.org/jira/browse/HIVE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-5202: Attachment: HIVE-5202.1.patch.txt Waiting for the unit tests to complete. Done: 1. Add support for isSettable() in ObjectInspector 2. Rewrote ObjectInspectorConverters.getConvertedOI() to include caching and hence improve performance 3. Added support for SettableUnionObjectInspector Pending: 1. Unit tests 2. Test cases for a union embedded within non-primitive data types for partitioned/non-partitioned serdes 3. Will upload the RB link once unit tests pass. Support for SettableUnionObjectInspector and implement isSettable/hasAllFieldsSettable APIs for all data types. --- Key: HIVE-5202 URL: https://issues.apache.org/jira/browse/HIVE-5202 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-5202.1.patch.txt These 3 tasks should be accomplished as part of this jira: 1. The current implementation lacks a settable union object inspector. We can run into an exception inside ObjectInspectorConverters.getConvertedOI() if there is a union. 2. Implement the following public functions for all data types: isSettable() - perform a shallow check to see if an object inspector is inherited from a settable OI type, and hasAllFieldsSettable() - perform a deep check to see if this object inspector and all the underlying object inspectors are inherited from settable OI types. 3. ObjectInspectorConverters.getConvertedOI() is inefficient. Once (1) and (2) are implemented, add the following check: outputOI.hasAllFieldsSettable() should return outputOI immediately if the object is entirely settable, in order to prevent redundant object instantiation.
[jira] [Updated] (HIVE-5202) Support for SettableUnionObjectInspector and implement isSettable/hasAllFieldsSettable APIs for all data types.
[ https://issues.apache.org/jira/browse/HIVE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-5202: Status: Patch Available (was: In Progress) Support for SettableUnionObjectInspector and implement isSettable/hasAllFieldsSettable APIs for all data types. --- Key: HIVE-5202 URL: https://issues.apache.org/jira/browse/HIVE-5202 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-5202.1.patch.txt These 3 tasks should be accomplished as part of this jira: 1. The current implementation lacks a settable union object inspector. We can run into an exception inside ObjectInspectorConverters.getConvertedOI() if there is a union. 2. Implement the following public functions for all data types: isSettable() - perform a shallow check to see if an object inspector is inherited from a settable OI type, and hasAllFieldsSettable() - perform a deep check to see if this object inspector and all the underlying object inspectors are inherited from settable OI types. 3. ObjectInspectorConverters.getConvertedOI() is inefficient. Once (1) and (2) are implemented, add the following check: outputOI.hasAllFieldsSettable() should return outputOI immediately if the object is entirely settable, in order to prevent redundant object instantiation.
[jira] [Commented] (HIVE-5288) Perflogger should log under single class
[ https://issues.apache.org/jira/browse/HIVE-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770611#comment-13770611 ] Hudson commented on HIVE-5288: -- FAILURE: Integrated in Hive-trunk-h0.21 #2339 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2339/]) HIVE-5288 : Perflogger should log under single class (Sergey Shelukhin via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524278) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java Perflogger should log under single class Key: HIVE-5288 URL: https://issues.apache.org/jira/browse/HIVE-5288 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5288.01.patch, HIVE-5288.01.patch, HIVE-5288.patch Perflogger should log under a single class, so that it could be turned on without mass logging spew. Right now the log is passed to it externally; this could be preserved by passing in a string to be logged as part of the message. Anyway, most of the time it's called from Driver and Utilities, which is a pretty useless class name.
[jira] [Commented] (HIVE-5267) Use array instead of Collections if possible in DemuxOperator
[ https://issues.apache.org/jira/browse/HIVE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770606#comment-13770606 ] Hudson commented on HIVE-5267: -- FAILURE: Integrated in Hive-trunk-h0.21 #2339 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2339/]) HIVE-5267 : Use array instead of Collections if possible in DemuxOperator (Navis via Yin Huai) (yhuai: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524271) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java Use array instead of Collections if possible in DemuxOperator - Key: HIVE-5267 URL: https://issues.apache.org/jira/browse/HIVE-5267 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.13.0 Attachments: HIVE-5267.D12867.1.patch, HIVE-5267.patch DemuxOperator accesses Maps two or more times for each row, which can be replaced by array accesses.
[jira] [Commented] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770610#comment-13770610 ] Hudson commented on HIVE-4443: -- FAILURE: Integrated in Hive-trunk-h0.21 #2339 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2339/]) HIVE-4443: [HCatalog] Have an option for GET queue to return all job information in single call (Daniel Dai via Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524232) * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/JobItemBean.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Server.java [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4443-1.patch, HIVE-4443-2.patch, HIVE-4443-3.patch, HIVE-4443-4.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the job ids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark deprecated * GET queue/jobID - mark deprecated * DELETE queue/jobID - mark deprecated * GET jobs - return the list of JSON objects with job ids but no detailed info * GET jobs?fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID NO PRECOMMIT TESTS
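The old-versus-new call pattern can be sketched with curl against WebHCat's REST endpoint. The host, port, and job id below are illustrative assumptions (50111 is WebHCat's usual default port, and /templeton/v1 its usual URL prefix, but verify against your deployment); the commands are not taken from the patch.

```shell
# Hypothetical WebHCat base URL -- host and port are illustrative.
BASE='http://webhcat.example.com:50111/templeton/v1'

# Old (deprecated) style: one round trip to list job ids, then one
# round trip per job id to get its details.
# curl -s "$BASE/queue?user.name=hive"
# curl -s "$BASE/queue/job_0000001?user.name=hive"

# Proposed style: detailed info for all jobs in a single call.
# curl -s "$BASE/jobs?fields=*&user.name=hive"
echo "GET $BASE/jobs?fields=*"
```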
[jira] [Commented] (HIVE-5294) Create collect UDF and make evaluator reusable
[ https://issues.apache.org/jira/browse/HIVE-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770607#comment-13770607 ] Hudson commented on HIVE-5294: -- FAILURE: Integrated in Hive-trunk-h0.21 #2339 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2339/]) HIVE-5294 - Create collect UDF and make evaluator reusable (add missing files) (Edward Capriolo via Brock Noland) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524280) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectList.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMkCollectionEvaluator.java HIVE-5294 - Create collect UDF and make evaluator reusable (Edward Capriolo via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524254) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectSet.java * /hive/trunk/ql/src/test/queries/clientpositive/udaf_collect_set.q * /hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out * /hive/trunk/ql/src/test/results/clientpositive/udaf_collect_set.q.out Create collect UDF and make evaluator reusable -- Key: HIVE-5294 URL: https://issues.apache.org/jira/browse/HIVE-5294 Project: Hive Issue Type: New Feature Reporter: Edward Capriolo Assignee: Edward Capriolo Fix For: 0.13.0 Attachments: HIVE-5294.1.patch.txt, HIVE-5294.patch.txt
[jira] [Commented] (HIVE-5167) webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh
[ https://issues.apache.org/jira/browse/HIVE-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770609#comment-13770609 ] Hudson commented on HIVE-5167: -- FAILURE: Integrated in Hive-trunk-h0.21 #2339 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2339/]) HIVE-5167: webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh (Thejas M Nair via Daniel Dai, Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524257) * /hive/trunk/hcatalog/webhcat/svr/src/main/bin/webhcat_config.sh webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh --- Key: HIVE-5167 URL: https://issues.apache.org/jira/browse/HIVE-5167 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-5167.1.patch, HIVE-5167.2.patch HIVE-4820 introduced checks for env variables, but it does so before sourcing webhcat-env.sh. This order needs to be reversed.
[jira] [Updated] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HIVE-5302: -- Attachment: HIVE-5302.1-branch-0.12.patch.txt HIVE-5302.1.patch.txt Patches for trunk and for branch-0.12. This touches lots of .out files, so it will probably go stale quickly. PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition, e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HIVE-5302: -- Status: Patch Available (was: Open) PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. in this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition. e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770800#comment-13770800 ] Sean Busbey commented on HIVE-3585: --- and since I forgot to tell the precommit build bot not to try to test it, that's going to fail since it won't have the futurama_episodes.avro file. Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Mark Wagner Priority: Minor Attachments: futurama_episodes.avro, HIVE-3585.1.patch.txt, HIVE-3585.2.patch.txt Add the new Avro module Trevni as another columnar format. A new columnar format needs a columnar SerDe; fastutil seems like a good choice. The Shark project uses the fastutil library as its columnar serde library, but it seems too large (almost 15m) for just a few primitive array collections. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770817#comment-13770817 ] Ashutosh Chauhan commented on HIVE-3764: +1 Thanks, Prasad, for the quick turnaround on this one. Support metastore version consistency check --- Key: HIVE-3764 URL: https://issues.apache.org/jira/browse/HIVE-3764 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-3764.1.patch, HIVE-3764.2.patch Today there's no version/compatibility information stored in the hive metastore. Also, the datanucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer hive, or don't run the correct upgrade scripts during migration, the metastore would end up corrupted. The autoCreate schema is not always sufficient to upgrade the metastore when migrating to a newer release. It's not supported with all databases. Besides, the migration often involves altering existing tables, changing or moving data, etc. Hence it's very useful to have some consistency check to make sure that hive is using the correct metastore and that, for production systems, the schema is not automatically created by running hive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
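The consistency check the issue asks for boils down to comparing the schema version Hive expects against the version recorded in the metastore and refusing to proceed on a mismatch. The sketch below illustrates just that comparison logic; the function name, variable names, and how the recorded version would actually be fetched are assumptions for the example, not Hive's implementation.

```shell
# Illustrative version-consistency check (names are assumptions).
# In a real deployment the recorded version would come from a query
# against the metastore's version table, not a function argument.

HIVE_EXPECTED_SCHEMA_VERSION="0.12.0"

check_schema_version() {
  recorded="$1"
  if [ "$recorded" != "$HIVE_EXPECTED_SCHEMA_VERSION" ]; then
    echo "metastore schema version $recorded does not match expected $HIVE_EXPECTED_SCHEMA_VERSION" >&2
    return 1
  fi
  echo "schema version OK"
}
```

With a check like this in the startup path, an older or newer Hive pointed at the wrong metastore fails fast instead of letting autoCreate silently corrupt the schema.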
[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770826#comment-13770826 ] Edward Capriolo commented on HIVE-3585: --- [~busbey] I will commit the avro data (there is no harm in that); where is it supposed to go? Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Mark Wagner Priority: Minor Attachments: futurama_episodes.avro, HIVE-3585.1.patch.txt, HIVE-3585.2.patch.txt Add the new Avro module Trevni as another columnar format. A new columnar format needs a columnar SerDe; fastutil seems like a good choice. The Shark project uses the fastutil library as its columnar serde library, but it seems too large (almost 15m) for just a few primitive array collections. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770834#comment-13770834 ] Edward Capriolo commented on HIVE-5302: --- I am reviewing this now. [~navis] et al. Can I get a second set of eyes on this? PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition, e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5301) Add a schema tool for offline metastore schema upgrade
[ https://issues.apache.org/jira/browse/HIVE-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770835#comment-13770835 ] Ashutosh Chauhan commented on HIVE-5301: +1 [~prasadm] At minimum we should test this on MySQL before committing, upgrading the schema from 0.7 to 0.12. If you test this on any other db before or after, please leave a comment here, so we know with which ones testing has been done. Add a schema tool for offline metastore schema upgrade -- Key: HIVE-5301 URL: https://issues.apache.org/jira/browse/HIVE-5301 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-5301.1.patch, HIVE-5301-with-HIVE-3764.0.patch HIVE-3764 is addressing metastore version consistency. Besides, it would be helpful to add a tool that can leverage this version information to figure out the required set of upgrade scripts, and execute those against the configured metastore. Now that Hive includes the Beeline client, it can be used to execute the scripts. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
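The core of the tool described above is walking the ordered list of releases from the recorded version to the target version and emitting the chain of upgrade scripts, each of which could then be run through Beeline. The sketch below shows only that chaining logic; the version list and script-name pattern are assumptions for illustration (Hive's real upgrade scripts ship per-database under metastore/scripts/upgrade/ in the source tree, and exact names may differ).

```shell
# Emit the chain of upgrade scripts needed to go from one schema
# version to another. Version list and naming convention are
# illustrative assumptions, not the authoritative Hive script set.
upgrade_scripts() {
  from="$1"; to="$2"; dbtype="$3"
  versions="0.7.0 0.8.0 0.9.0 0.10.0 0.11.0 0.12.0"
  prev=""; emitting=""
  for v in $versions; do
    # Once past the starting version, each hop needs one script.
    if [ -n "$emitting" ] && [ -n "$prev" ]; then
      echo "upgrade-${prev}-to-${v}.${dbtype}.sql"
    fi
    if [ "$v" = "$from" ]; then emitting=yes; fi
    if [ "$v" = "$to" ]; then break; fi
    prev="$v"
  done
}
```

For example, `upgrade_scripts 0.10.0 0.12.0 mysql` prints the two scripts for the 0.10.0→0.11.0 and 0.11.0→0.12.0 hops, and each emitted file could be passed to Beeline with something along the lines of `beeline -u "$JDBC_URL" -f <script>`.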
[jira] [Resolved] (HIVE-5310) commit futuama_episodes
[ https://issues.apache.org/jira/browse/HIVE-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo resolved HIVE-5310. --- Resolution: Fixed commit futuama_episodes --- Key: HIVE-5310 URL: https://issues.apache.org/jira/browse/HIVE-5310 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Edward Capriolo Assignee: Edward Capriolo This is a small binary file that will be used for trevni. We can run the pre-commit build if this is committed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (HIVE-5310) commit futuama_episodes
[ https://issues.apache.org/jira/browse/HIVE-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-5310 started by Edward Capriolo. commit futuama_episodes --- Key: HIVE-5310 URL: https://issues.apache.org/jira/browse/HIVE-5310 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Edward Capriolo Assignee: Edward Capriolo This is a small binary file that will be used for trevni. We can run the pre-commit build if this is committed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request 14180: HIVE-4531: [WebHCat] Collecting task logs to hdfs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14180/ --- (Updated Sept. 18, 2013, 3:20 p.m.) Review request for hive. Changes --- HIVE-4531-9.patch Bugs: HIVE-4531 https://issues.apache.org/jira/browse/HIVE-4531 Repository: hive Description --- SEE HIVE-4531. Diffs (updated) - trunk/hcatalog/src/docs/src/documentation/content/xdocs/hive.xml 1524447 trunk/hcatalog/src/docs/src/documentation/content/xdocs/mapreducejar.xml 1524447 trunk/hcatalog/src/docs/src/documentation/content/xdocs/mapreducestreaming.xml 1524447 trunk/hcatalog/src/docs/src/documentation/content/xdocs/pig.xml 1524447 trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/HiveDelegator.java 1524447 trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/HiveJobIDParser.java PRE-CREATION trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/JarDelegator.java 1524447 trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/JarJobIDParser.java PRE-CREATION trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/JobIDParser.java PRE-CREATION trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/LauncherDelegator.java 1524447 trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/LogRetriever.java PRE-CREATION trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/PigDelegator.java 1524447 trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/PigJobIDParser.java PRE-CREATION trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Server.java 1524447 trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/StreamingDelegator.java 1524447 trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonControllerJob.java 1524447 
trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/TempletonUtils.java 1524447 trunk/hcatalog/webhcat/svr/src/test/data/status/hive/stderr PRE-CREATION trunk/hcatalog/webhcat/svr/src/test/data/status/jar/stderr PRE-CREATION trunk/hcatalog/webhcat/svr/src/test/data/status/pig/stderr PRE-CREATION trunk/hcatalog/webhcat/svr/src/test/data/status/streaming/stderr PRE-CREATION trunk/hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/TestJobIDParser.java PRE-CREATION trunk/hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/tool/TestTempletonUtils.java 1524447 Diff: https://reviews.apache.org/r/14180/diff/ Testing --- WebHCat unit tests e2e tests in HIVE-5078 under both Linux/Windows Thanks, Daniel Dai
[jira] [Updated] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-3585: -- Status: Patch Available (was: Open) Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Mark Wagner Priority: Minor Attachments: futurama_episodes.avro, HIVE-3585.1.patch.txt, HIVE-3585.2.patch.txt, HIVE-3585.3.patch.txt add new avro module trevni as another columnar format.New columnar format need a columnar SerDe,seems fastutil is a good choice.the shark project use fastutil library as columnar serde library but it seems too large (almost 15m) for just a few primitive array collection. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-3585: -- Attachment: HIVE-3585.3.patch.txt Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Mark Wagner Priority: Minor Attachments: futurama_episodes.avro, HIVE-3585.1.patch.txt, HIVE-3585.2.patch.txt, HIVE-3585.3.patch.txt add new avro module trevni as another columnar format.New columnar format need a columnar SerDe,seems fastutil is a good choice.the shark project use fastutil library as columnar serde library but it seems too large (almost 15m) for just a few primitive array collection. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HIVE-5309) Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java
[ https://issues.apache.org/jira/browse/HIVE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey reassigned HIVE-5309: -- Assignee: Jitendra Nath Pandey Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java --- Key: HIVE-5309 URL: https://issues.apache.org/jira/browse/HIVE-5309 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey This jira provides fixes for some of the review comments on HIVE-5283. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770877#comment-13770877 ] Edward Capriolo commented on HIVE-3585: --- Re-uploaded the patch and hit SUBMIT_PATCH; testing should begin soon. Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Mark Wagner Priority: Minor Attachments: futurama_episodes.avro, HIVE-3585.1.patch.txt, HIVE-3585.2.patch.txt, HIVE-3585.3.patch.txt Add the new Avro module Trevni as another columnar format. A new columnar format needs a columnar SerDe; fastutil seems like a good choice. The Shark project uses the fastutil library as its columnar serde library, but it seems too large (almost 15m) for just a few primitive array collections. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770850#comment-13770850 ] Sean Busbey commented on HIVE-3585: --- It goes in _data/files/_ Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Mark Wagner Priority: Minor Attachments: futurama_episodes.avro, HIVE-3585.1.patch.txt, HIVE-3585.2.patch.txt Add the new Avro module Trevni as another columnar format. A new columnar format needs a columnar SerDe; fastutil seems like a good choice. The Shark project uses the fastutil library as its columnar serde library, but it seems too large (almost 15m) for just a few primitive array collections. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5166) TestWebHCatE2e is failing intermittently on trunk
[ https://issues.apache.org/jira/browse/HIVE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5166: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed on trunk. Thanks, Eugene! TestWebHCatE2e is failing intermittently on trunk - Key: HIVE-5166 URL: https://issues.apache.org/jira/browse/HIVE-5166 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5166.patch I observed these while running full test suite last couple of times. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770932#comment-13770932 ] Brock Noland commented on HIVE-5302: The change looks reasonable to me. About this change, Ashutosh said [Your changes in MetaStoreUtils are indeed reasonable. I just wanted to make sure whether they are really needed. If you can come up with a testcase, which shows the failure without changes in MetaStoreUtils, that will make it easier to concretize why these changes are useful.|https://issues.apache.org/jira/browse/HIVE-4789?focusedCommentId=13732634&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13732634] in HIVE-4789. PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition, e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5302: --- Attachment: HIVE-5302.1.patch.txt Reuploading the trunk patch so it gets tested. The script just takes the latest patch. PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition, e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770938#comment-13770938 ] Sean Busbey commented on HIVE-5302: --- In case I didn't make this clear enough in the RB, the additional query added to avro_partitioned.q does fail without the changes to MetaStoreUtils. PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition, e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770947#comment-13770947 ] Brock Noland commented on HIVE-5302: Yep :) PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition, e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770983#comment-13770983 ] Ashutosh Chauhan commented on HIVE-5302: Secondly, if you look at the .xml file changes, it clearly shows the patch will bloat the plan with unnecessary info that is not required at execution time. I really think we should spend more time on getting your test case to work in a less intrusive fashion. PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition, e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770975#comment-13770975 ] Ashutosh Chauhan commented on HIVE-5302: And as an extension to that, all table-level properties will now also automagically appear as partition properties, which doesn't feel right. Normally, it should never be a requirement that a partition needs to know table properties. The problem arises because of a weirdness in how AvroSerde works, since it stores its schema in the properties object instead of in the metastore columns table. I think this problem is too specific to Avro, so this should be handled in Avro-specific code, AvroSerde perhaps. PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition, e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5166) TestWebHCatE2e is failing intermittently on trunk
[ https://issues.apache.org/jira/browse/HIVE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770979#comment-13770979 ] Hudson commented on HIVE-5166: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #105 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/105/]) HIVE-5166 : TestWebHCatE2e is failing intermittently on trunk (Eugene Koifman via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524441) * /hive/trunk/hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java TestWebHCatE2e is failing intermittently on trunk - Key: HIVE-5166 URL: https://issues.apache.org/jira/browse/HIVE-5166 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5166.patch I observed these while running full test suite last couple of times. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5310) commit futuama_episodes
[ https://issues.apache.org/jira/browse/HIVE-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770980#comment-13770980 ] Hudson commented on HIVE-5310: -- FAILURE: Integrated in Hive-trunk-hadoop2-ptest #105 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/105/]) HIVE-5310 futurama-episodes (ecapriolo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524448) * /hive/trunk/data/files/futurama_episodes.avro commit futuama_episodes --- Key: HIVE-5310 URL: https://issues.apache.org/jira/browse/HIVE-5310 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Edward Capriolo Assignee: Edward Capriolo This is a small binary file that will be used for trevni. We can run the pre-commit build if this is committed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5311) TestHCatPartitionPublish can fail randomly
Brock Noland created HIVE-5311: -- Summary: TestHCatPartitionPublish can fail randomly Key: HIVE-5311 URL: https://issues.apache.org/jira/browse/HIVE-5311 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor {noformat} org.apache.thrift.TApplicationException: create_table_with_environment_context failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:793) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:779) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.createTable(TestHCatPartitionPublish.java:241) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:133) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5311) TestHCatPartitionPublish can fail randomly
[ https://issues.apache.org/jira/browse/HIVE-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5311: --- Status: Patch Available (was: Open) TestHCatPartitionPublish can fail randomly -- Key: HIVE-5311 URL: https://issues.apache.org/jira/browse/HIVE-5311 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Attachments: HIVE-5311.patch {noformat} org.apache.thrift.TApplicationException: create_table_with_environment_context failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:793) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:779) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.createTable(TestHCatPartitionPublish.java:241) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:133) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770968#comment-13770968 ] Ashutosh Chauhan commented on HIVE-5302: Thanks, [~busbey], for coming up with a test case. However, the changes in the .q.out files indicate that this will make explain extended confusing for people, since partition properties will now list numPartitions, which should really be shown as a table property. PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition. e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5311) TestHCatPartitionPublish can fail randomly
[ https://issues.apache.org/jira/browse/HIVE-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5311: --- Attachment: HIVE-5311.patch TestHCatPartitionPublish can fail randomly -- Key: HIVE-5311 URL: https://issues.apache.org/jira/browse/HIVE-5311 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Attachments: HIVE-5311.patch {noformat} org.apache.thrift.TApplicationException: create_table_with_environment_context failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:793) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:779) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.createTable(TestHCatPartitionPublish.java:241) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:133) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5209) JDBC support for varchar
[ https://issues.apache.org/jira/browse/HIVE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771000#comment-13771000 ] Jason Dere commented on HIVE-5209: -- Will bump the version number. Also, Thejas mentioned offline that the HiveServer1 JDBC changes aren't necessary since HiveServer1 is deprecated, so I will remove those as well. JDBC support for varchar Key: HIVE-5209 URL: https://issues.apache.org/jira/browse/HIVE-5209 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC, Types Reporter: Jason Dere Assignee: Jason Dere Attachments: D12999.1.patch, HIVE-5209.1.patch, HIVE-5209.2.patch, HIVE-5209.D12705.1.patch Support returning varchar length in result set metadata -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
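Concretely, "returning varchar length in result set metadata" means `ResultSetMetaData.getPrecision()` should report the `n` from a `varchar(n)` column rather than a placeholder. A self-contained sketch of the kind of type-name parsing a driver can do to get there (the helper below is illustrative only, not the actual Hive JDBC code):

```java
public class VarcharPrecision {
    // Extract the declared length from a type name such as "varchar(50)".
    // Returns -1 for types that carry no length component. Illustrative.
    static int precisionOf(String typeName) {
        String t = typeName.trim().toLowerCase();
        if (t.startsWith("varchar(") && t.endsWith(")")) {
            String len = t.substring("varchar(".length(), t.length() - 1).trim();
            return Integer.parseInt(len);
        }
        return -1;  // e.g. string, int: no declared length
    }

    public static void main(String[] args) {
        System.out.println(precisionOf("varchar(50)"));   // 50
        System.out.println(precisionOf("VARCHAR( 25 )")); // 25
        System.out.println(precisionOf("string"));        // -1
    }
}
```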
[jira] [Commented] (HIVE-5311) TestHCatPartitionPublish can fail randomly
[ https://issues.apache.org/jira/browse/HIVE-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771004#comment-13771004 ] Ashutosh Chauhan commented on HIVE-5311: +1 TestHCatPartitionPublish can fail randomly -- Key: HIVE-5311 URL: https://issues.apache.org/jira/browse/HIVE-5311 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Attachments: HIVE-5311.patch {noformat} org.apache.thrift.TApplicationException: create_table_with_environment_context failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:793) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:779) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.createTable(TestHCatPartitionPublish.java:241) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:133) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4624) Integrate Vectorized Substr into Vectorized QE
[ https://issues.apache.org/jira/browse/HIVE-4624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771010#comment-13771010 ] Eric Hanson commented on HIVE-4624: --- After the patch for CONCAT goes in (HIVE-4512), this patch needs to be re-based and updated. Integrate Vectorized Substr into Vectorized QE -- Key: HIVE-4624 URL: https://issues.apache.org/jira/browse/HIVE-4624 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Timothy Chen Assignee: Eric Hanson Attachments: HIVE-4624.1-vectorization.patch Need to hook up the Vectorized Substr directly into Hive Vectorized QE so it can be leveraged. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771012#comment-13771012 ] Edward Capriolo commented on HIVE-5302: --- So is it the case that the Avro SerDe was working, but some other change in Hive 0.11 broke already-existing functionality? I do not see a huge problem with table properties showing in partition properties as long as the two do not collide with each other. However, if there is a cleaner way to do this without bloating the plan, that seems like a reasonable endeavor. Does anyone have a concrete suggestion as to how this could be written instead? PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition. e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5311) TestHCatPartitionPublish can fail randomly
[ https://issues.apache.org/jira/browse/HIVE-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5311: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Brock! TestHCatPartitionPublish can fail randomly -- Key: HIVE-5311 URL: https://issues.apache.org/jira/browse/HIVE-5311 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5311.patch {noformat} org.apache.thrift.TApplicationException: create_table_with_environment_context failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:793) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:779) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.createTable(TestHCatPartitionPublish.java:241) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:133) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5291) join32_lessSize.q has ordering problem under hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5291: --- Fix Version/s: (was: 0.13.0) Status: Open (was: Patch Available) join32_lessSize.q has ordering problem under hadoop-2 - Key: HIVE-5291 URL: https://issues.apache.org/jira/browse/HIVE-5291 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5291.patch Test just needs more ORDER BY and output updated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-5291) join32_lessSize.q has ordering problem under hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland resolved HIVE-5291. Resolution: Cannot Reproduce I cannot reproduce this with Java 7 on trunk. join32_lessSize.q has ordering problem under hadoop-2 - Key: HIVE-5291 URL: https://issues.apache.org/jira/browse/HIVE-5291 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5291.patch Test just needs more ORDER BY and output updated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir
[ https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771045#comment-13771045 ] Thejas M Nair commented on HIVE-4487: - Yes, I will commit it to 0.12 branch. Hive does not set explicit permissions on hive.exec.scratchdir -- Key: HIVE-4487 URL: https://issues.apache.org/jira/browse/HIVE-4487 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Joey Echeverria Assignee: Chaoyu Tang Fix For: 0.13.0 Attachments: HIVE-4487.patch The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive creates this directory it doesn't set any explicit permission on it. This means if you have the default HDFS umask setting of 022, then these directories end up being world readable. These permissions also get applied to the staging directories and their files, thus leaving inter-stage data world readable. This can cause a potential leak of data especially when operating on a Kerberos enabled cluster. Hive should probably default these directories to only be readable by the owner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
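The direction of the HIVE-4487 fix is to set permissions explicitly rather than inherit whatever the process umask produced. A minimal plain-Java illustration of that idea, on a POSIX filesystem (the owner-only 700 policy and the temp path are illustrative; Hive's actual change goes through the Hadoop FileSystem and HiveConf APIs):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class ScratchDirPerms {
    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("scratch-demo");
        // Under a 022 umask a plainly created directory ends up
        // world-readable; setting the bits explicitly afterwards
        // overrides the umask, which is the essence of the fix.
        Set<PosixFilePermission> ownerOnly =
                PosixFilePermissions.fromString("rwx------");
        Files.setPosixFilePermissions(dir, ownerOnly);
        System.out.println(
                PosixFilePermissions.toString(Files.getPosixFilePermissions(dir)));
        Files.delete(dir);
    }
}
```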
[jira] [Created] (HIVE-5312) Let HiveServer2 run simultaneously in HTTP (over thrift) and Binary (normal thrift transport) mode
Vaibhav Gumashta created HIVE-5312: -- Summary: Let HiveServer2 run simultaneously in HTTP (over thrift) and Binary (normal thrift transport) mode Key: HIVE-5312 URL: https://issues.apache.org/jira/browse/HIVE-5312 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 [HIVE-4763|https://issues.apache.org/jira/browse/HIVE-4763] adds support for HTTP transport over Thrift. With that, HS2 can be configured to run using either HTTP or the normal Thrift binary transport. Ideally HS2 should support both modes simultaneously, and the client should be able to specify the mode used to serve the request. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
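For context, HIVE-4763 exposes the choice as a single either/or setting, roughly as below (the exact property spelling is this editor's assumption from that patch's era, not verified against the committed code). Serving both modes at once would mean this stops being a one-of-two switch:

```xml
<!-- hive-site.xml: select how HiveServer2 accepts Thrift connections -->
<property>
  <name>hive.server2.transport.mode</name>
  <!-- "binary" = plain Thrift transport, "http" = Thrift over HTTP;
       today only one mode can be active per server instance -->
  <value>binary</value>
</property>
```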
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771080#comment-13771080 ] Hive QA commented on HIVE-5302: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12603850/HIVE-5302.1.patch.txt {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 3126 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1 {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/804/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/804/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. in this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition. e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5167) webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh
[ https://issues.apache.org/jira/browse/HIVE-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771087#comment-13771087 ] Hudson commented on HIVE-5167: -- FAILURE: Integrated in Hive-trunk-hadoop2 #437 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/437/]) HIVE-5167: webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh (Thejas M Nair via Daniel Dai, Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1524257) * /hive/trunk/hcatalog/webhcat/svr/src/main/bin/webhcat_config.sh webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh --- Key: HIVE-5167 URL: https://issues.apache.org/jira/browse/HIVE-5167 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-5167.1.patch, HIVE-5167.2.patch HIVE-4820 introduced checks for env variables, but it does so before sourcing webhcat-env.sh. This order needs to be reversed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
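The fix described above is purely an ordering change: source webhcat-env.sh first, and only then validate the environment variables. A small self-contained sketch of that pattern in POSIX shell (the file name and variable are illustrative, not the script's full contents):

```shell
# Create a stand-in env file, as webhcat-env.sh would exist on a real install.
ENV_FILE=./webhcat-env-demo.sh
printf 'HADOOP_PREFIX=/usr/lib/hadoop\n' > "$ENV_FILE"

# 1. Source the env file BEFORE any checks, so values it sets are seen.
[ -f "$ENV_FILE" ] && . "$ENV_FILE"

# 2. Only now verify that required variables are set.
if [ -z "$HADOOP_PREFIX" ]; then
  echo "HADOOP_PREFIX must be set" >&2
  exit 1
fi
echo "HADOOP_PREFIX=$HADOOP_PREFIX"
rm -f "$ENV_FILE"
```

Running the checks first, as the pre-fix script did, would reject a value that webhcat-env.sh was about to supply.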
[jira] [Commented] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771086#comment-13771086 ] Hudson commented on HIVE-4444: -- FAILURE: Integrated in Hive-trunk-hadoop2 #437 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/437/]) HIVE-4444: [HCatalog] WebHCat Hive should support equivalent parameters as Pig (Daniel Dai via Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524234) * /hive/trunk/hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/HiveDelegator.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Server.java [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4444-1.patch, HIVE-4444-2.patch, HIVE-4444-3.patch, HIVE-4444-4.patch, HIVE-4444-5.patch Currently there are no files and args parameters for Hive. We shall add them to make Hive similar to Pig. NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5300) MapredLocalTask logs success message twice
[ https://issues.apache.org/jira/browse/HIVE-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771090#comment-13771090 ] Hudson commented on HIVE-5300: -- FAILURE: Integrated in Hive-trunk-hadoop2 #437 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/437/]) HIVE-5300 : MapredLocalTask logs success message twice (Navis via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1524277) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java MapredLocalTask logs success message twice -- Key: HIVE-5300 URL: https://issues.apache.org/jira/browse/HIVE-5300 Project: Hive Issue Type: Improvement Components: Logging Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.13.0 Attachments: HIVE-5300.1.patch.txt Something like this, {noformat} Execution completed successfully Mapred Local Task Succeeded . Convert the Join into MapJoin Mapred Local Task Succeeded . Convert the Join into MapJoin Launching Job 1 out of 1 {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5294) Create collect UDF and make evaluator reusable
[ https://issues.apache.org/jira/browse/HIVE-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771085#comment-13771085 ] Hudson commented on HIVE-5294: -- FAILURE: Integrated in Hive-trunk-hadoop2 #437 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/437/]) HIVE-5294 - Create collect UDF and make evaluator reusable (add missing files) (Edward Capriolo via Brock Noland) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1524280) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectList.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMkCollectionEvaluator.java HIVE-5294 - Create collect UDF and make evaluator reusable (Edward Capriolo via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1524254) * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectSet.java * /hive/trunk/ql/src/test/queries/clientpositive/udaf_collect_set.q * /hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out * /hive/trunk/ql/src/test/results/clientpositive/udaf_collect_set.q.out Create collect UDF and make evaluator reusable -- Key: HIVE-5294 URL: https://issues.apache.org/jira/browse/HIVE-5294 Project: Hive Issue Type: New Feature Reporter: Edward Capriolo Assignee: Edward Capriolo Fix For: 0.13.0 Attachments: HIVE-5294.1.patch.txt, HIVE-5294.patch.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5311) TestHCatPartitionPublish can fail randomly
[ https://issues.apache.org/jira/browse/HIVE-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771093#comment-13771093 ] Hudson commented on HIVE-5311: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #172 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/172/]) HIVE-5311 : TestHCatPartitionPublish can fail randomly (Brock Noland via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1524515) * /hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/mapreduce/TestHCatPartitionPublish.java * /hive/trunk/hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitionPublish.java TestHCatPartitionPublish can fail randomly -- Key: HIVE-5311 URL: https://issues.apache.org/jira/browse/HIVE-5311 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5311.patch {noformat} org.apache.thrift.TApplicationException: create_table_with_environment_context failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table_with_environment_context(ThriftHiveMetastore.java:793) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table_with_environment_context(ThriftHiveMetastore.java:779) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.createTable(TestHCatPartitionPublish.java:241) at org.apache.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:133) {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir
[ https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771092#comment-13771092 ] Hudson commented on HIVE-4487: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #172 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/172/]) HIVE-4487 - Hive does not set explicit permissions on hive.exec.scratchdir (Chaoyu Tang via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1524509) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java Hive does not set explicit permissions on hive.exec.scratchdir -- Key: HIVE-4487 URL: https://issues.apache.org/jira/browse/HIVE-4487 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Joey Echeverria Assignee: Chaoyu Tang Fix For: 0.13.0 Attachments: HIVE-4487.patch The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive creates this directory it doesn't set any explicit permission on it. This means if you have the default HDFS umask setting of 022, then these directories end up being world readable. These permissions also get applied to the staging directories and their files, thus leaving inter-stage data world readable. This can cause a potential leak of data especially when operating on a Kerberos enabled cluster. Hive should probably default these directories to only be readable by the owner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5166) TestWebHCatE2e is failing intermittently on trunk
[ https://issues.apache.org/jira/browse/HIVE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771091#comment-13771091 ] Hudson commented on HIVE-5166: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #172 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/172/]) HIVE-5166 : TestWebHCatE2e is failing intermittently on trunk (Eugene Koifman via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1524441) * /hive/trunk/hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java TestWebHCatE2e is failing intermittently on trunk - Key: HIVE-5166 URL: https://issues.apache.org/jira/browse/HIVE-5166 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5166.patch I observed these while running full test suite last couple of times. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5310) commit futuama_episodes
[ https://issues.apache.org/jira/browse/HIVE-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771108#comment-13771108 ] Hudson commented on HIVE-5310: -- FAILURE: Integrated in Hive-trunk-hadoop2 #438 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/438/]) HIVE-5310 futurama-episodes (ecapriolo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1524448) * /hive/trunk/data/files/futurama_episodes.avro commit futuama_episodes --- Key: HIVE-5310 URL: https://issues.apache.org/jira/browse/HIVE-5310 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Edward Capriolo Assignee: Edward Capriolo This is a small binary file that will be used for trevni. We can run the pre-commit build if this is committed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir
[ https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771106#comment-13771106 ] Hudson commented on HIVE-4487: -- FAILURE: Integrated in Hive-trunk-hadoop2 #438 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/438/]) HIVE-4487 - Hive does not set explicit permissions on hive.exec.scratchdir (Chaoyu Tang via Brock Noland) (brock: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1524509) * /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Context.java Hive does not set explicit permissions on hive.exec.scratchdir -- Key: HIVE-4487 URL: https://issues.apache.org/jira/browse/HIVE-4487 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Joey Echeverria Assignee: Chaoyu Tang Fix For: 0.13.0 Attachments: HIVE-4487.patch The hive.exec.scratchdir defaults to /tmp/hive-$\{user.name\}, but when Hive creates this directory it doesn't set any explicit permission on it. This means if you have the default HDFS umask setting of 022, then these directories end up being world readable. These permissions also get applied to the staging directories and their files, thus leaving inter-stage data world readable. This can cause a potential leak of data especially when operating on a Kerberos enabled cluster. Hive should probably default these directories to only be readable by the owner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5166) TestWebHCatE2e is failing intermittently on trunk
[ https://issues.apache.org/jira/browse/HIVE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771105#comment-13771105 ] Hudson commented on HIVE-5166: -- FAILURE: Integrated in Hive-trunk-hadoop2 #438 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/438/]) HIVE-5166 : TestWebHCatE2e is failing intermittently on trunk (Eugene Koifman via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524441) * /hive/trunk/hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java TestWebHCatE2e is failing intermittently on trunk - Key: HIVE-5166 URL: https://issues.apache.org/jira/browse/HIVE-5166 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.12.0 Reporter: Ashutosh Chauhan Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5166.patch I observed these failures while running the full test suite the last couple of times.
[jira] [Commented] (HIVE-5310) commit futuama_episodes
[ https://issues.apache.org/jira/browse/HIVE-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771094#comment-13771094 ] Hudson commented on HIVE-5310: -- FAILURE: Integrated in Hive-trunk-hadoop1-ptest #172 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/172/]) HIVE-5310 futurama-episodes (ecapriolo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524448) * /hive/trunk/data/files/futurama_episodes.avro commit futuama_episodes --- Key: HIVE-5310 URL: https://issues.apache.org/jira/browse/HIVE-5310 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers Reporter: Edward Capriolo Assignee: Edward Capriolo This is a small binary file that will be used for trevni. We can run the pre-commit build if this is committed.
[jira] [Commented] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771088#comment-13771088 ] Hudson commented on HIVE-4443: -- FAILURE: Integrated in Hive-trunk-hadoop2 #437 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/437/]) HIVE-4443: [HCatalog] Have an option for GET queue to return all job information in single call (Daniel Dai via Thejas Nair) (thejas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1524232) * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/JobItemBean.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Main.java * /hive/trunk/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/Server.java [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4443-1.patch, HIVE-4443-2.patch, HIVE-4443-3.patch, HIVE-4443-4.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark deprecated * GET queue/jobID - mark deprecated * DELETE queue/jobID - mark deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs?fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID NO PRECOMMIT TESTS
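The endpoint scheme proposed in HIVE-4443 above can be sketched as a small URL builder. This is an illustrative sketch only: the host, port, and user values are placeholders, and the `/templeton/v1` prefix and `user.name` query parameter reflect the WebHCat conventions mentioned in these threads, not a guaranteed API contract.

```java
// Sketch of the proposed "jobs" endpoints from HIVE-4443.
// Base URL, job id, and user are hypothetical placeholders.
class WebHCatUrls {
    // GET jobs - list of jobids, no detailed info
    static String jobsList(String base, String user) {
        return base + "/templeton/v1/jobs?user.name=" + user;
    }

    // GET jobs?fields=* - detailed info for all jobs in one call
    static String jobsDetailed(String base, String user) {
        return base + "/templeton/v1/jobs?fields=*&user.name=" + user;
    }

    // GET jobs/jobID - detailed info for one job (equivalent to GET queue/jobID)
    static String job(String base, String jobId, String user) {
        return base + "/templeton/v1/jobs/" + jobId + "?user.name=" + user;
    }
}
```

The single `fields=*` call replaces the N+1 pattern of one `GET queue` plus one `GET queue/:jobid` per job.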
[jira] [Updated] (HIVE-4782) Templeton streaming bug fixes
[ https://issues.apache.org/jira/browse/HIVE-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-4782: - Assignee: Shuaishuai Nie Templeton streaming bug fixes - Key: HIVE-4782 URL: https://issues.apache.org/jira/browse/HIVE-4782 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4782.1.patch 1. There is no way to point to dist cache binaries for streaming. This blocks the mainstream scenario where a customer uploads binaries to ASV and wants to use them in a job. 2. Command line options passed to hadoop.cmd that can contain an equal sign or a comma must be quoted. 3. Fix the -file and -cmdenv streaming options, which do not seem to work properly through Templeton. 4. Also add a -combiner option to enable adding a combiner in streaming.
[jira] [Updated] (HIVE-4755) Fix Templeton map-only tasks are getting killed after 10 minutes by MapReduce
[ https://issues.apache.org/jira/browse/HIVE-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-4755: - Assignee: Shuaishuai Nie Fix Templeton map-only tasks are getting killed after 10 minutes by MapReduce - Key: HIVE-4755 URL: https://issues.apache.org/jira/browse/HIVE-4755 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4755.1.patch All MapReduce tasks are supposed to report progress back to the MR framework; otherwise, if there is no progress in 10 minutes (by default), the TaskTracker will kill the task. The Templeton map-only task has a KeepAlive thread that is supposed to periodically report progress and keep the launcher alive, but this does not seem to work as expected.
[jira] [Updated] (HIVE-4782) Templeton streaming bug fixes
[ https://issues.apache.org/jira/browse/HIVE-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-4782: - Status: Patch Available (was: Open) Templeton streaming bug fixes - Key: HIVE-4782 URL: https://issues.apache.org/jira/browse/HIVE-4782 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4782.1.patch 1. There is no way to point to dist cache binaries for streaming. This blocks the mainstream scenario where a customer uploads binaries to ASV and wants to use them in a job. 2. Command line options passed to hadoop.cmd that can contain an equal sign or a comma must be quoted. 3. Fix the -file and -cmdenv streaming options, which do not seem to work properly through Templeton. 4. Also add a -combiner option to enable adding a combiner in streaming.
[jira] [Updated] (HIVE-5224) When creating table with AVRO serde, the avro.schema.url should be able to load serde schema from file systems besides HDFS
[ https://issues.apache.org/jira/browse/HIVE-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-5224: - Status: Patch Available (was: Open) When creating table with AVRO serde, the avro.schema.url should be able to load serde schema from file systems besides HDFS Key: HIVE-5224 URL: https://issues.apache.org/jira/browse/HIVE-5224 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5224.1.patch, HIVE-5224.2.patch Now, when loading the schema for a table with the Avro serde, the file system is hard-coded to HDFS in AvroSerdeUtils.java. This should enable loading the schema from file systems besides HDFS.
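The idea behind the HIVE-5224 fix is to resolve the filesystem from the scheme of the avro.schema.url value itself instead of assuming HDFS. In Hive that would mean something like obtaining the FileSystem from the schema URI inside AvroSerdeUtils; the sketch below shows only the scheme-inspection step with the JDK's URI class so it stays self-contained, and the default-to-HDFS fallback is an assumption for illustration.

```java
import java.net.URI;

// Illustrative sketch: pick the filesystem scheme from avro.schema.url
// rather than hard-coding HDFS. The real fix would pass the parsed URI
// to Hadoop's FileSystem.get(uri, conf); only scheme parsing is shown here.
class SchemaUrlResolver {
    static String schemeOf(String schemaUrl) {
        String scheme = URI.create(schemaUrl).getScheme();
        // Assumption for this sketch: a bare path defaults to HDFS.
        return scheme == null ? "hdfs" : scheme;
    }
}
```

A `wasb://` (ASV) or `s3://` URL would then route to the right filesystem instead of failing against HDFS.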
[jira] [Updated] (HIVE-5032) Enable hive creating external table at the root directory of DFS
[ https://issues.apache.org/jira/browse/HIVE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-5032: - Assignee: Shuaishuai Nie Enable hive creating external table at the root directory of DFS Key: HIVE-5032 URL: https://issues.apache.org/jira/browse/HIVE-5032 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5032.1.patch Creating an external table in Hive with a location pointing to the root directory of DFS will fail because HiveFileFormatUtils#doGetPartitionDescFromPath treats the authority of the path the same as a folder and cannot find a match in the pathToPartitionInfo table when doing the prefix match.
[jira] [Updated] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton
[ https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-5072: - Status: Patch Available (was: Open) [WebHCat]Enable directly invoke Sqoop job through Templeton --- Key: HIVE-5072 URL: https://issues.apache.org/jira/browse/HIVE-5072 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, Templeton-Sqoop-Action.pdf Now it is hard to invoke a Sqoop job through Templeton. The only way is to use the classpath jar generated by a Sqoop job and use the jar delegator in Templeton. We should implement a Sqoop delegator to enable directly invoking Sqoop jobs through Templeton.
[jira] [Updated] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton
[ https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-5072: - Assignee: Shuaishuai Nie [WebHCat]Enable directly invoke Sqoop job through Templeton --- Key: HIVE-5072 URL: https://issues.apache.org/jira/browse/HIVE-5072 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 0.12.0 Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, Templeton-Sqoop-Action.pdf Now it is hard to invoke a Sqoop job through Templeton. The only way is to use the classpath jar generated by a Sqoop job and use the jar delegator in Templeton. We should implement a Sqoop delegator to enable directly invoking Sqoop jobs through Templeton.
[jira] [Updated] (HIVE-4755) Fix Templeton map-only tasks are getting killed after 10 minutes by MapReduce
[ https://issues.apache.org/jira/browse/HIVE-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-4755: - Status: Patch Available (was: Open) Fix Templeton map-only tasks are getting killed after 10 minutes by MapReduce - Key: HIVE-4755 URL: https://issues.apache.org/jira/browse/HIVE-4755 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4755.1.patch All MapReduce tasks are supposed to report progress back to the MR framework; otherwise, if there is no progress in 10 minutes (by default), the TaskTracker will kill the task. The Templeton map-only task has a KeepAlive thread that is supposed to periodically report progress and keep the launcher alive, but this does not seem to work as expected.
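The keep-alive pattern HIVE-4755 describes can be sketched as a scheduled task that fires well inside the framework's inactivity timeout. This is an illustrative sketch, not Templeton's actual implementation: in the real launcher the callback would be Hadoop's progress-reporting call, while here it is an arbitrary Runnable so the snippet stays self-contained.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of a KeepAlive helper: invoke a progress callback on a fixed
// interval so the MR framework's 10-minute inactivity timer never fires.
class KeepAlive {
    static ScheduledExecutorService start(Runnable reportProgress, long periodMs) {
        ScheduledExecutorService ses =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "templeton-keepalive");
                    t.setDaemon(true); // must not keep the JVM alive after the job ends
                    return t;
                });
        ses.scheduleAtFixedRate(reportProgress, 0, periodMs, TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

The bug report suggests the existing thread exists but never actually reaches the framework; a fix would verify the callback really runs on schedule, e.g. by logging each tick.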
[jira] [Updated] (HIVE-4760) Templeton occasionally raises bad request (HTTP error 400) for queue/:jobid requests
[ https://issues.apache.org/jira/browse/HIVE-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-4760: - Assignee: Shuaishuai Nie Templeton occasionally raises bad request (HTTP error 400) for queue/:jobid requests --- Key: HIVE-4760 URL: https://issues.apache.org/jira/browse/HIVE-4760 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-4760.1.patch This issue occurs randomly. The repro we have found is to issue a job and then write a loop that calls queue/:jobid over and over again until the job completes. At some point, depending on the timing, a bad request (jobid is not valid) is raised.
[jira] [Updated] (HIVE-5306) Use new GenericUDF instead of basic UDF for UDFAbs class
[ https://issues.apache.org/jira/browse/HIVE-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated HIVE-5306: Attachment: HIVE-5306.3.patch With more test cases. Use new GenericUDF instead of basic UDF for UDFAbs class Key: HIVE-5306 URL: https://issues.apache.org/jira/browse/HIVE-5306 Project: Hive Issue Type: Improvement Components: UDF Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5306.1.patch, HIVE-5306.2.patch, HIVE-5306.3.patch GenericUDF is the latest and recommended base class for any UDF. This JIRA is to change the current UDFAbs class to extend GenericUDF. The general benefit of GenericUDF is described in its comments: GenericUDFs are superior to normal UDFs in the following ways: 1. It can accept arguments of complex types, and return complex types. 2. It can accept a variable length of arguments. 3. It can accept an infinite number of function signatures - for example, it's easy to write a GenericUDF that accepts array<int>, array<array<int>> and so on (arbitrary levels of nesting). 4. It can do short-circuit evaluations using DeferredObject.
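Point 4 above, short-circuit evaluation via deferred arguments, is the least obvious benefit, so here is a self-contained sketch of the pattern. Hive's real classes are GenericUDF and its DeferredObject argument wrapper; the JDK's Supplier stands in for DeferredObject here, and the IF-like function is an invented example, not the UDFAbs change itself.

```java
import java.util.function.Supplier;

// Sketch of the "deferred object" idea behind GenericUDF: arguments arrive
// lazily, so a conditional function only ever evaluates the branch it needs.
// Supplier<T> stands in for Hive's DeferredObject in this illustration.
class ShortCircuitIf {
    // Like IF(cond, a, b): exactly one branch's supplier is evaluated.
    static <T> T evaluate(boolean cond, Supplier<T> whenTrue, Supplier<T> whenFalse) {
        return cond ? whenTrue.get() : whenFalse.get();
    }
}
```

A plain UDF receives already-evaluated Java objects, so both branches would have been computed before the call; the deferred style avoids that wasted (or even failing) work.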
[jira] [Commented] (HIVE-5302) PartitionPruner fails on Avro non-partitioned data
[ https://issues.apache.org/jira/browse/HIVE-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771115#comment-13771115 ] Mark Wagner commented on HIVE-5302: --- Yes, that's right [~appodictic]. This ultimately traces back to the changes made in HIVE-3833. [~busbey], I'm having difficulty reproducing the failure. I've added your changes to avro_partitioned, as well as a select from the table. It goes through cleanly and the result looks correct: {noformat} An Unearthly Child 23 November 1963 1 The Power of the Daleks 5 November 1966 2 Horror of Fang Rock 3 September 1977 4 Castrolava 4 January 1982 5 The Mysterious Planet 6 September 1986 6 {noformat} I ran this on trunk and pulled right before running. Any idea what might be different between us? How did the test case fail for you without the MetaStoreUtils changes? PartitionPruner fails on Avro non-partitioned data -- Key: HIVE-5302 URL: https://issues.apache.org/jira/browse/HIVE-5302 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Labels: avro Attachments: HIVE-5302.1-branch-0.12.patch.txt, HIVE-5302.1.patch.txt, HIVE-5302.1.patch.txt While updating HIVE-3585 I found a test case that causes the failure in the MetaStoreUtils partition retrieval from back in HIVE-4789. In this case, the failure is triggered when the partition pruner is handed a non-partitioned table and has to construct a pseudo-partition. e.g. {code} INSERT OVERWRITE TABLE partitioned_table PARTITION(col) SELECT id, foo, col FROM non_partitioned_table WHERE col = 9; {code}
[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir
[ https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771160#comment-13771160 ] Yin Huai commented on HIVE-4487: FsPermission(String mode) was not in Hadoop 0.20.2. Hive does not set explicit permissions on hive.exec.scratchdir -- Key: HIVE-4487 URL: https://issues.apache.org/jira/browse/HIVE-4487 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Joey Echeverria Assignee: Chaoyu Tang Fix For: 0.13.0 Attachments: HIVE-4487.patch The hive.exec.scratchdir defaults to /tmp/hive-${user.name}, but when Hive creates this directory it doesn't set any explicit permission on it. This means that if you have the default HDFS umask setting of 022, these directories end up being world readable. These permissions also get applied to the staging directories and their files, leaving inter-stage data world readable. This can cause a potential leak of data, especially when operating on a Kerberos-enabled cluster. Hive should probably default these directories to be readable only by the owner.
[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir
[ https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771162#comment-13771162 ] Yin Huai commented on HIVE-4487: If we use ant clean package eclipse-files, we cannot build Hive inside Eclipse. Hive does not set explicit permissions on hive.exec.scratchdir -- Key: HIVE-4487 URL: https://issues.apache.org/jira/browse/HIVE-4487 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Joey Echeverria Assignee: Chaoyu Tang Fix For: 0.13.0 Attachments: HIVE-4487.patch The hive.exec.scratchdir defaults to /tmp/hive-${user.name}, but when Hive creates this directory it doesn't set any explicit permission on it. This means that if you have the default HDFS umask setting of 022, these directories end up being world readable. These permissions also get applied to the staging directories and their files, leaving inter-stage data world readable. This can cause a potential leak of data, especially when operating on a Kerberos-enabled cluster. Hive should probably default these directories to be readable only by the owner.
[jira] [Commented] (HIVE-4340) ORC should provide raw data size
[ https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771191#comment-13771191 ] Ashutosh Chauhan commented on HIVE-4340: Thanks [~prasanth_j] for picking this one up. I would suggest breaking the patch into two: one which proposes the new stats-gathering interfaces on RecordWriter and RecordReader, and another JIRA for the ORC implementation of these two. ORC should provide raw data size Key: HIVE-4340 URL: https://issues.apache.org/jira/browse/HIVE-4340 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4340.1.patch.txt, HIVE-4340.2.patch.txt, HIVE-4340.3.patch.txt, HIVE-4340.4.patch.txt, HIVE-4340-java-only.4.patch.txt ORC's SerDe currently does nothing, and hence does not calculate a raw data size. WriterImpl, however, has enough information to provide one. WriterImpl should compute a raw data size for each row, aggregate them per stripe and record it in the stripe information, as RC currently does in its key header, and allow the FileSinkOperator access to the size per row. FileSinkOperator should be able to get the raw data size from either the SerDe or the RecordWriter when the RecordWriter can provide it.
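The per-stripe aggregation the description asks of WriterImpl can be sketched as a tiny accumulator: add each row's raw size, roll the running total into the stripe record when the stripe closes, and keep a file-level total. All names below are illustrative, not ORC's actual API.

```java
// Illustrative sketch of the raw-data-size bookkeeping HIVE-4340 describes:
// per-row sizes accumulate into a stripe total, stripe totals into a file total.
// Class and method names are invented for this sketch.
class RawSizeAccumulator {
    private long stripeRawSize;
    private long fileRawSize;

    void addRow(long rawRowSize) {
        stripeRawSize += rawRowSize;
    }

    // Called when a stripe is flushed: returns the value to record in the
    // stripe information, then resets for the next stripe.
    long closeStripe() {
        long s = stripeRawSize;
        fileRawSize += s;
        stripeRawSize = 0;
        return s;
    }

    long fileTotal() {
        return fileRawSize;
    }
}
```

FileSinkOperator would then read the accumulated size from the RecordWriter instead of the no-op SerDe.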
[jira] [Updated] (HIVE-4340) ORC should provide raw data size
[ https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4340: --- Assignee: Prasanth J (was: Kevin Wilfong) ORC should provide raw data size Key: HIVE-4340 URL: https://issues.apache.org/jira/browse/HIVE-4340 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Prasanth J Attachments: HIVE-4340.1.patch.txt, HIVE-4340.2.patch.txt, HIVE-4340.3.patch.txt, HIVE-4340.4.patch.txt, HIVE-4340-java-only.4.patch.txt ORC's SerDe currently does nothing, and hence does not calculate a raw data size. WriterImpl, however, has enough information to provide one. WriterImpl should compute a raw data size for each row, aggregate them per stripe and record it in the stripe information, as RC currently does in its key header, and allow the FileSinkOperator access to the size per row. FileSinkOperator should be able to get the raw data size from either the SerDe or the RecordWriter when the RecordWriter can provide it.
[jira] [Commented] (HIVE-3764) Support metastore version consistency check
[ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771195#comment-13771195 ] Prasad Mujumdar commented on HIVE-3764: --- Thanks Ashutosh! I will rebase the patch on trunk and upload patches for trunk and 0.12 later today. Support metastore version consistency check --- Key: HIVE-3764 URL: https://issues.apache.org/jira/browse/HIVE-3764 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-3764.1.patch, HIVE-3764.2.patch Today there is no version/compatibility information stored in the Hive metastore. Also, the DataNucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer Hive, or don't run the correct upgrade scripts during migration, the metastore would end up corrupted. The autoCreate schema is not always sufficient to upgrade the metastore when migrating to a newer release. It's not supported with all databases. Besides, the migration often involves altering existing tables, changing or moving data, etc. Hence it's very useful to have a consistency check to make sure that Hive is using the correct metastore and, for production systems, that the schema is not automatically changed by running Hive.
Re: How long will we support Hadoop 0.20.2?
I am fine with dropping support for 0.20 line. 4 years is a long time. We cannot keep accumulating tech debt forever. Ashutosh On Wed, Sep 18, 2013 at 1:04 PM, Brock Noland br...@cloudera.com wrote: Hi, At present we require compatibility with Hadoop 0.20.2. See: https://issues.apache.org/jira/browse/HIVE-5313 Considering 0.20.2 was released 4 years ago, how long are we going to continue to support it? Brock
[jira] [Updated] (HIVE-5309) Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java
[ https://issues.apache.org/jira/browse/HIVE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5309: --- Attachment: HIVE-5309.1.vectorization.patch Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java --- Key: HIVE-5309 URL: https://issues.apache.org/jira/browse/HIVE-5309 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5309.1.vectorization.patch This jira provides fixes for some of the review comments on HIVE-5283.
[jira] [Updated] (HIVE-5309) Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java
[ https://issues.apache.org/jira/browse/HIVE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5309: --- Description: This jira provides fixes for some of the review comments on HIVE-5283. 1) Update hive-default.xml.template for vectorization flag. 2) remove unused imports from MetaStoreUtils. 3) Add a test to run vectorization with non-orc format. The test must still pass because vectorization optimization should fall back to non-vector mode. 4) Hardcode the table name in QTestUtil.java. was: This jira provides fixes for some of the review comments on HIVE-5283. Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java --- Key: HIVE-5309 URL: https://issues.apache.org/jira/browse/HIVE-5309 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5309.1.vectorization.patch This jira provides fixes for some of the review comments on HIVE-5283. 1) Update hive-default.xml.template for vectorization flag. 2) remove unused imports from MetaStoreUtils. 3) Add a test to run vectorization with non-orc format. The test must still pass because vectorization optimization should fall back to non-vector mode. 4) Hardcode the table name in QTestUtil.java.
[jira] [Updated] (HIVE-5313) HIVE-4487 breaks build because 0.20.2 is missing FSPermission(string)
[ https://issues.apache.org/jira/browse/HIVE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5313: --- Summary: HIVE-4487 breaks build because 0.20.2 is missing FSPermission(string) (was: Shim out FSPermission change in HIVE-4487 due to 0.20.2) HIVE-4487 breaks build because 0.20.2 is missing FSPermission(string) - Key: HIVE-5313 URL: https://issues.apache.org/jira/browse/HIVE-5313 Project: Hive Issue Type: Task Reporter: Brock Noland As per HIVE-4487, 0.20.2 does not contain FSPermission(string) so we'll have to shim it out.
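The shim HIVE-5313 proposes has to express a mode string like "700" using constructors that do exist in Hadoop 0.20.2, i.e. FsPermission(short). The core of that is parsing the octal string, sketched below in plain Java; the shim class name is invented, and only the stdlib parsing step is shown so the snippet stays self-contained.

```java
// Sketch of the string-to-short step a FsPermission shim would need on
// Hadoop 0.20.2, which lacks the FsPermission(String) constructor.
// A real shim would pass the result to new FsPermission(short).
class PermissionShim {
    static short parseOctalMode(String mode) {
        // "700" in octal == 0700 == 448 decimal
        return Short.parseShort(mode, 8);
    }
}
```

Newer Hadoop versions accept the string directly, which is why the unshimmed call compiled on trunk but broke the 0.20.2 build.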
[jira] [Updated] (HIVE-5313) HIVE-4487 breaks build because 0.20.2 is missing FSPermission(string)
[ https://issues.apache.org/jira/browse/HIVE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5313: --- Assignee: Brock Noland Status: Patch Available (was: Open) HIVE-4487 breaks build because 0.20.2 is missing FSPermission(string) - Key: HIVE-5313 URL: https://issues.apache.org/jira/browse/HIVE-5313 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5313.patch As per HIVE-4487, 0.20.2 does not contain FSPermission(string) so we'll have to shim it out.
[jira] [Updated] (HIVE-5313) HIVE-4487 breaks build because 0.20.2 is missing FSPermission(string)
[ https://issues.apache.org/jira/browse/HIVE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5313: --- Attachment: HIVE-5313.patch Trivial patch should fix this. HIVE-4487 breaks build because 0.20.2 is missing FSPermission(string) - Key: HIVE-5313 URL: https://issues.apache.org/jira/browse/HIVE-5313 Project: Hive Issue Type: Task Reporter: Brock Noland Attachments: HIVE-5313.patch As per HIVE-4487, 0.20.2 does not contain FSPermission(string) so we'll have to shim it out.
[jira] [Created] (HIVE-5313) Shim out FSPermission change in HIVE-4487 due to 0.20.2
Brock Noland created HIVE-5313: -- Summary: Shim out FSPermission change in HIVE-4487 due to 0.20.2 Key: HIVE-5313 URL: https://issues.apache.org/jira/browse/HIVE-5313 Project: Hive Issue Type: Task Reporter: Brock Noland As per HIVE-4487, 0.20.2 does not contain FSPermission(string) so we'll have to shim it out.
[jira] [Commented] (HIVE-4487) Hive does not set explicit permissions on hive.exec.scratchdir
[ https://issues.apache.org/jira/browse/HIVE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13771165#comment-13771165 ] Brock Noland commented on HIVE-4487: I created HIVE-5313 to fix this. Hive does not set explicit permissions on hive.exec.scratchdir -- Key: HIVE-4487 URL: https://issues.apache.org/jira/browse/HIVE-4487 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Joey Echeverria Assignee: Chaoyu Tang Fix For: 0.13.0 Attachments: HIVE-4487.patch The hive.exec.scratchdir defaults to /tmp/hive-${user.name}, but when Hive creates this directory it doesn't set any explicit permission on it. This means that if you have the default HDFS umask setting of 022, these directories end up being world readable. These permissions also get applied to the staging directories and their files, leaving inter-stage data world readable. This can cause a potential leak of data, especially when operating on a Kerberos-enabled cluster. Hive should probably default these directories to be readable only by the owner.
How long will we support Hadoop 0.20.2?
Hi, At present we require compatibility with Hadoop 0.20.2. See: https://issues.apache.org/jira/browse/HIVE-5313 Considering 0.20.2 was released 4 years ago, how long are we going to continue to support it? Brock
[jira] [Commented] (HIVE-5298) AvroSerde performance problem caused by HIVE-3833
[ https://issues.apache.org/jira/browse/HIVE-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771208#comment-13771208 ] Hive QA commented on HIVE-5298: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12603614/HIVE-5298.1.patch {color:green}SUCCESS:{color} +1 3126 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/806/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/806/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. AvroSerde performance problem caused by HIVE-3833 - Key: HIVE-5298 URL: https://issues.apache.org/jira/browse/HIVE-5298 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.13.0 Attachments: HIVE-5298.1.patch, HIVE-5298.patch HIVE-3833 fixed the targeted problem and made Hive use partition-level metadata to initialize the object inspector. In doing so, however, it goes through every file under the table to access the partition metadata, which is very inefficient, especially when a partition has multiple files. This is a bigger problem for AvroSerde because AvroSerde initialization accesses the schema, which is located on the file system. As a result, before Hive can process any data, it needs to access every file of a table, which can take long enough that the job fails for lack of progress. The improvement is to access partition metadata only once per partition. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
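The fix described in HIVE-5298 amounts to memoizing the partition-metadata lookup so it runs once per partition rather than once per file. The Python sketch below is purely illustrative: the function name and the returned dictionary are hypothetical stand-ins, not Hive's actual API. Three files spread over two partitions trigger only two metadata reads:

```python
from functools import lru_cache

calls = []  # records each expensive metadata read

@lru_cache(maxsize=None)
def partition_metadata(partition_path):
    """Hypothetical per-partition metadata fetch; the append stands in
    for an expensive filesystem access (e.g. reading an Avro schema)."""
    calls.append(partition_path)
    return {"serde": "AvroSerDe", "schema.url": partition_path + "/schema.avsc"}

# Three files, but only two distinct partitions.
files = [("/warehouse/t/p=1", "f0"),
         ("/warehouse/t/p=1", "f1"),
         ("/warehouse/t/p=2", "f0")]

metas = [partition_metadata(part) for part, _ in files]
print(len(calls))  # 2 -- one read per partition, not per file
```

With many files per partition the saving grows linearly, which is why the per-file behavior caused enough delay to starve job-progress reporting.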
[jira] [Created] (HIVE-5314) Commit vectorization test data, comment/rename vectorization tests.
Jitendra Nath Pandey created HIVE-5314: -- Summary: Commit vectorization test data, comment/rename vectorization tests. Key: HIVE-5314 URL: https://issues.apache.org/jira/browse/HIVE-5314 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Tony Murphy Based on the comments on HIVE-5823, we should commit 'alltypesorc' and provide some comments on the vectorization tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5309) Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java
[ https://issues.apache.org/jira/browse/HIVE-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-5309: --- Status: Patch Available (was: Open) Update hive-default.xml.template for vectorization flag; remove unused imports from MetaStoreUtils.java --- Key: HIVE-5309 URL: https://issues.apache.org/jira/browse/HIVE-5309 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-5309.1.vectorization.patch This jira provides fixes for some of the review comments on HIVE-5283. 1) Update hive-default.xml.template for vectorization flag. 2) remove unused imports from MetaStoreUtils. 3) Add a test to run vectorization with non-orc format. The test must still pass because vectorization optimization should fall back to non-vector mode. 4) Hardcode the table name in QTestUtil.java. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5315) bin/hive should retrieve HADOOP_VERSION in a better way.
Kousuke Saruta created HIVE-5315: Summary: bin/hive should retrieve HADOOP_VERSION in a better way. Key: HIVE-5315 URL: https://issues.apache.org/jira/browse/HIVE-5315 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Kousuke Saruta Fix For: 0.11.1 In the current implementation, bin/hive retrieves HADOOP_VERSION as follows {code} HADOOP_VERSION=$($HADOOP version | awk '{if (NR == 1) {print $2;}}'); {code} However, hadoop version does not always show the version information on the first line. If HADOOP_VERSION is not retrieved correctly, Hive or related processes will not start. I faced this situation when I tried to debug HiveServer2 with a debug option such as {code} -Xdebug -Xrunjdwp:transport=dt_socket,suspend=n,server=y,address=9876 {code} Then hadoop version shows -Xdebug... on the first line. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5315) bin/hive should retrieve HADOOP_VERSION in a better way.
[ https://issues.apache.org/jira/browse/HIVE-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated HIVE-5315: - Description: In the current implementation, bin/hive retrieves HADOOP_VERSION as follows {code} HADOOP_VERSION=$($HADOOP version | awk '{if (NR == 1) {print $2;}}'); {code} However, hadoop version does not always show the version information on the first line. If HADOOP_VERSION is not retrieved correctly, Hive or related processes will not start. I faced this situation when I tried to debug HiveServer2 with a debug option such as {code} -Xdebug -Xrunjdwp:transport=dt_socket,suspend=n,server=y,address=9876 {code} Then hadoop version shows -Xdebug... on the first line. was: In current implementation, bin/hive retrieve HADOOP_VERSION like as follows {code} HADOOP_VERSION=$($HADOOP version | awk '{if (NR == 1) {print $2;}}'); {code} But, sometimes, hadoop version doesn't show version information at first line. If HADOOP_VERSION is not retrieve collectly, Hive or related processes will not be up. I faced this situation when I try to debug Hiveserver2 with debug option like as follows {code} -Xdebug -Xrunjdwp:trunsport=dt_socket,suspend=n,server=y,address=9876 {code} Then, hadoop version shows -Xdebug... at the first line. bin/hive should retrieve HADOOP_VERSION in a better way. --- Key: HIVE-5315 URL: https://issues.apache.org/jira/browse/HIVE-5315 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Kousuke Saruta Fix For: 0.11.1 In the current implementation, bin/hive retrieves HADOOP_VERSION as follows {code} HADOOP_VERSION=$($HADOOP version | awk '{if (NR == 1) {print $2;}}'); {code} However, hadoop version does not always show the version information on the first line. If HADOOP_VERSION is not retrieved correctly, Hive or related processes will not start. 
I faced this situation when I tried to debug HiveServer2 with a debug option such as {code} -Xdebug -Xrunjdwp:transport=dt_socket,suspend=n,server=y,address=9876 {code} Then hadoop version shows -Xdebug... on the first line. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
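A more robust alternative to taking field 2 of line 1 is to scan the whole output for the first line that actually starts with "Hadoop". The Python sketch below illustrates the idea (the noisy sample output is fabricated for the demo); the equivalent shell fix would match the line in awk, e.g. awk '/^Hadoop/ {print $2; exit}', instead of testing NR == 1:

```python
import re

def parse_hadoop_version(version_output):
    """Scan all lines of `hadoop version` output for 'Hadoop <version>'
    instead of assuming the version is on the first line."""
    for line in version_output.splitlines():
        match = re.match(r"Hadoop\s+(\S+)", line)
        if match:
            return match.group(1)
    return None  # no version line found at all

# Output polluted by a JDWP debug banner, as described in HIVE-5315.
noisy = ("Listening for transport dt_socket at address: 9876\n"
         "Hadoop 1.0.4\n"
         "Subversion https://svn.apache.org/... -r 1393290\n")
print(parse_hadoop_version(noisy))  # 1.0.4
```

This way extra banners printed before the version line (debug agents, JVM warnings) no longer break the version detection.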