[jira] [Created] (HIVE-11021) ObjectStore should call closeAll() on JDO query object to release the resources
Aihua Xu created HIVE-11021: --- Summary: ObjectStore should call closeAll() on JDO query object to release the resources Key: HIVE-11021 URL: https://issues.apache.org/jira/browse/HIVE-11021 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu In the ObjectStore class, in getMDatabase() and getMTable(), after retrieving the database and table info from the database, we should call closeAll() on the JDO query to release the resources. Otherwise it causes cursor leaks on the database. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
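The fix pattern the JIRA describes can be sketched as follows. This is an illustrative stand-in (FakeQuery is a hypothetical class, not the JDO API) showing why closeAll() belongs in a finally block and why results should be copied before the query is closed:

```java
import java.util.ArrayList;
import java.util.List;

public class CursorLeakSketch {
    // Minimal stand-in for javax.jdo.Query (hypothetical): each execute()
    // opens a database cursor that must be released with closeAll().
    static class FakeQuery {
        static int openCursors = 0;
        List<String> execute() { openCursors++; return new ArrayList<>(); }
        void closeAll() { openCursors--; }
    }

    // The fixed pattern: copy the results while the cursor is open, and
    // release it in finally so it is closed even if processing throws.
    static List<String> getMTableSketch() {
        FakeQuery query = new FakeQuery();
        try {
            return new ArrayList<>(query.execute());
        } finally {
            query.closeAll();
        }
    }
}
```

In the real ObjectStore code the query object is a javax.jdo.Query and the leaked resource is the database cursor held by the underlying result set.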
[jira] [Created] (HIVE-11023) Disable directSQL if datanucleus.identifierFactory = datanucleus2
Sushanth Sowmyan created HIVE-11023: --- Summary: Disable directSQL if datanucleus.identifierFactory = datanucleus2 Key: HIVE-11023 URL: https://issues.apache.org/jira/browse/HIVE-11023 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.3.0, 1.2.1, 2.0.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan We hit an interesting bug in a case where datanucleus.identifierFactory = datanucleus2. The problem is that directSql hand-generates SQL strings assuming the datanucleus1 naming scheme. If a user has their metastore JDO managed by datanucleus.identifierFactory = datanucleus2, the SQL strings we generate are incorrect. One simple example of what this results in is the following: whenever DN persists a field which is held as a List<T>, it winds up storing each T as a separate line in the appropriate mapping table, with a column called INTEGER_IDX that holds the position in the list. Then, upon reading, it automatically reads all relevant lines with an ORDER BY INTEGER_IDX, which results in the list retaining its order. In the DN2 naming scheme, the column is called IDX instead of INTEGER_IDX. If the user has run appropriate metatool upgrade scripts, it is highly likely that they have both columns, INTEGER_IDX and IDX. Whenever they use JDO, such as with all writes, it will then use the IDX field, and when they do any sort of optimized reads, such as through directSQL, it will ORDER BY INTEGER_IDX. An immediate danger is seen when we consider that the schema of a table is stored as a List<FieldSchema>, and while IDX contains 0,1,2,3,..., INTEGER_IDX will contain 0,0,0,0,... and thus, any attempt to describe the table or fetch its schema can come back in the table's native hashing order rather than sorted by the index. This can then result in the schema ordering being different from the actual table.
For example, if a user has a table (a:int, b:string, c:string), a describe on it may return (c:string, a:int, b:string), and thus, queries which insert after selecting from another table can hit ClassCastExceptions when trying to insert data in the wrong order - this is how we discovered this bug. This problem, however, can be far worse if there are no type mismatches - it is possible, for example, that if a, b and c were all strings, the insert query would succeed but mix up the order, which then results in user table data being mixed up. This has the potential to be very bad. We should write a tool to help convert metastores that use datanucleus2 to datanucleus1 (more difficult, needs more one-time testing) or change directSql to support both (easier to code, but increases the test-coverage matrix significantly, and we should really then be testing against both schemes). But in the short term, we should disable directSql if we see that the identifierFactory is datanucleus2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
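The proposed short-term mitigation could look roughly like this (the method and its name are illustrative, not the actual HIVE-11023 patch):

```java
// Hedged sketch of the short-term guard: if the configured identifier
// factory is "datanucleus2", report directSQL as unsafe so the metastore
// can fall back to plain JDO instead.
public class DirectSqlGuardSketch {
    static boolean isDirectSqlSafe(String identifierFactory) {
        // directSQL hand-generates SQL against the datanucleus1 naming
        // scheme (e.g. INTEGER_IDX); under datanucleus2 (IDX) the ORDER BY
        // targets the wrong column and list-valued fields come back unordered.
        return !"datanucleus2".equalsIgnoreCase(identifierFactory);
    }
}
```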
Re: Review Request 34961: HIVE-10895: ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34961/#review88065 ---

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 3513) https://reviews.apache.org/r/34961/#comment140456
The closeAll() call will close the result set. So if we return a List or Collection object (which has an iterator interface) directly from what query.execute() returned, we can't iterate through the list later anymore. So we need to iterate through the list here.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 3543) https://reviews.apache.org/r/34961/#comment140457
Same comments as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4431) https://reviews.apache.org/r/34961/#comment140458
Same comments as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4460) https://reviews.apache.org/r/34961/#comment140459
Same comments as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4483) https://reviews.apache.org/r/34961/#comment140460
Same comments as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4528) https://reviews.apache.org/r/34961/#comment140461
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4586) https://reviews.apache.org/r/34961/#comment140462
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4621) https://reviews.apache.org/r/34961/#comment140463
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4650) https://reviews.apache.org/r/34961/#comment140464
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4679) https://reviews.apache.org/r/34961/#comment140465
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4710) https://reviews.apache.org/r/34961/#comment140466
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4771) https://reviews.apache.org/r/34961/#comment140467
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4867) https://reviews.apache.org/r/34961/#comment140468
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4903) https://reviews.apache.org/r/34961/#comment140469
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4940) https://reviews.apache.org/r/34961/#comment140470
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 4981) https://reviews.apache.org/r/34961/#comment140471
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5016) https://reviews.apache.org/r/34961/#comment140472
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5050) https://reviews.apache.org/r/34961/#comment140473
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5126) https://reviews.apache.org/r/34961/#comment140474
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5158) https://reviews.apache.org/r/34961/#comment140475
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5355) https://reviews.apache.org/r/34961/#comment140478
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5386) https://reviews.apache.org/r/34961/#comment140477
Same comment as above.

metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 5436) https://reviews.apache.org/r/34961/#comment140476
Same comment as above.

- Aihua Xu

On June 2, 2015, 11:07 p.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34961/ --- (Updated June 2, 2015, 11:07 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-10895 https://issues.apache.org/jira/browse/HIVE-10895 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-10895 Diffs - metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java fd61333 Diff: https://reviews.apache.org/r/34961/diff/ Testing --- Thanks, Vaibhav Gumashta
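The recurring review comment can be sketched as follows: materialize the lazily backed collection into a plain ArrayList while the cursor is still open, then call closeAll(). The helper below is hypothetical, not part of ObjectStore:

```java
import java.util.ArrayList;
import java.util.List;

public class MaterializeBeforeCloseSketch {
    // Copy a (possibly lazily backed) JDO result collection into a plain
    // ArrayList; after this, query.closeAll() can no longer break iteration,
    // because every row was fetched while the cursor was open.
    static <T> List<T> materialize(Iterable<T> lazyResults) {
        List<T> copy = new ArrayList<>();
        for (T row : lazyResults) {
            copy.add(row); // forces each row to be read from the result set
        }
        return copy;
    }
}
```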
[jira] [Created] (HIVE-11022) Support collecting lists in user defined order
Michael Haeusler created HIVE-11022: --- Summary: Support collecting lists in user defined order Key: HIVE-11022 URL: https://issues.apache.org/jira/browse/HIVE-11022 Project: Hive Issue Type: New Feature Components: UDF Reporter: Michael Haeusler Hive currently supports aggregation of lists in order of input rows with the UDF collect_list. Unfortunately, the order is not well defined when map-side aggregations are used. Hive could support collecting lists in user-defined order by providing a UDF COLLECT_LIST_SORTED(valueColumn, sortColumn[, limit]) that would return a list of values sorted in a user-defined order. An optional limit parameter can restrict this to the first n values within that order. Especially in the limit case, this can be efficiently pre-aggregated, reducing the amount of data transferred to reducers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
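The limit case can be pre-aggregated with a bounded heap, which is what makes it cheap on the map side. A rough sketch in plain Java (not Hive's GenericUDAF API; collectSortedLimit is a hypothetical name), keeping only the n smallest sort keys seen so far:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class TopNCollectSketch {
    // Bounded top-n collection: the map-side buffer never holds more than
    // n entries, regardless of how many input rows it sees.
    static List<Integer> collectSortedLimit(List<Integer> values, int n) {
        // Max-heap over the kept values, so the largest is evicted first.
        PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
        for (int v : values) {
            heap.offer(v);
            if (heap.size() > n) {
                heap.poll(); // evict the current largest kept value
            }
        }
        List<Integer> out = new ArrayList<>(heap);
        Collections.sort(out); // final user-visible order
        return out;
    }
}
```

Partial buffers produced this way can be merged on the reducer with the same bounded-heap logic, which is why the limit variant reduces shuffle volume.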
[jira] [Created] (HIVE-11024) Error inserting a date value via parameter marker (PreparedStatement.setDate)
Sergio Lob created HIVE-11024: - Summary: Error inserting a date value via parameter marker (PreparedStatement.setDate) Key: HIVE-11024 URL: https://issues.apache.org/jira/browse/HIVE-11024 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 0.14.0 Environment: Linux lnxx64r6 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May 10 15:42:40 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux Reporter: Sergio Lob Inserting a row with a Date parameter marker (PreparedStatement.setDate()) fails with a ParseException:
Exception: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:41 mismatched input '-' expecting ) near '1980' in statement
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:41 mismatched input '-' expecting ) near '1980' in statement
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
at org.apache.hive.jdbc.HiveStatement.executeUpdate(HiveStatement.java:406)
at org.apache.hive.jdbc.HivePreparedStatement.executeUpdate(HivePreparedStatement.java:117)
at repro1.main(repro1.java:90)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:41 mismatched input '-' expecting ) near '1980' in statement
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:314)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:102)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:171)
++ REPRO:
--
/*
 * It may be freely used, modified, and distributed with no restrictions.
 */

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.PreparedStatement;
import java.sql.ResultSetMetaData;
import java.io.Reader;

/** */
public class repro1 {
    /**
     * Main method.
     *
     * @param args no arguments required
     */
    public static void main(String[] args) {
        Connection con = null;
        Statement stmt = null;
        ResultSet rst = null;
        String drptab = "DROP TABLE SDLJUNK";
        String crttab = "CREATE TABLE SDLJUNK(I INT, D DATE)";
        String instab = "INSERT INTO TABLE SDLJUNK VALUES (1, ? )";
        try {
            System.out.println("=");
            System.out.println("Problem description:");
            System.out.println("After setting a value for a DATE parameter marker ");
            System.out.println(" with PreparedStatement.setDate(),");
            System.out.println(" an INSERT statement fails execution with error: ");
            System.out.println(" ");
            System.out.println(" Error while compiling statement: FAILED: ");
            System.out.println("ParseException line 1:78 mismatched input '-' ");
            System.out.println(" expecting ) near '1980' in statement");
            System.out.println("=");
            System.out.println();
            // Create new instance of JDBC Driver and make connection.
            System.out.println("Registering Driver.");
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            String url = "jdbc:hive2://hwhive:1/R72D";
            System.out.println("Making a connection to: " + url);
            con = DriverManager.getConnection(url, "hive", "hive");
            System.out.println("Connection successful.\n");
            DatabaseMetaData dbmd = con.getMetaData();
            System.out.println("getDatabaseProductName() = " + dbmd.getDatabaseProductName());
            System.out.println("getDatabaseProductVersion() = " + dbmd.getDatabaseProductVersion());
            System.out.println("getDriverName() = " + dbmd.getDriverName());
            System.out.println("getDriverVersion() = " + dbmd.getDriverVersion());
            try {
                System.out.println("con.createStatement()");
                stmt = con.createStatement();
                System.out.println(drptab);
                stmt.executeUpdate(drptab);
            } catch (Exception ex) {
                System.out.println("Exception: " + ex);
            }
            System.out.println(crttab);
            stmt.executeUpdate(crttab);
            System.out.println("preparing: " + instab);
            PreparedStatement pstmt = con.prepareStatement(instab);
            System.out.println("calling setDate() for parameter marker");
            java.sql.Date dt = java.sql.Date.valueOf("1980-12-26");
            pstmt.setDate(1, dt);
            //pstmt.setString(1, "1980-12-26");
            System.out.println("executing: " + instab);
            pstmt.executeUpdate();
Hive-0.14 - Build # 986 - Still Failing
Changes for Build #980 Changes for Build #981 Changes for Build #982 Changes for Build #983 Changes for Build #984 Changes for Build #985 Changes for Build #986 No tests ran. The Apache Jenkins build system has built Hive-0.14 (build #986) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-0.14/986/ to view the results.
Re: Review Request 34897: CBO: Calcite Operator To Hive Operator (Calcite Return Path) Empty tabAlias in columnInfo which triggers PPD
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34897/ --- (Updated June 16, 2015, 9:55 p.m.) Review request for hive and Ashutosh Chauhan. Repository: hive-git Description --- In ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java, line 477, when aliases contains an empty string and the key is an empty string too, it assumes that aliases contains the key. This triggers incorrect PPD. To reproduce it, apply HIVE-10455 and run cbo_subq_notin.q. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java 9c21238 ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverterPostProc.java e7c8342 Diff: https://reviews.apache.org/r/34897/diff/ Testing --- Thanks, pengcheng xiong
[jira] [Created] (HIVE-11026) Make vector_outer_join* test more robust
Ashutosh Chauhan created HIVE-11026: --- Summary: Make vector_outer_join* test more robust Key: HIVE-11026 URL: https://issues.apache.org/jira/browse/HIVE-11026 Project: Hive Issue Type: Test Components: Tests Reporter: Ashutosh Chauhan Different file sizes on different OSes result in different Data Size in explain output. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11027) Hive on tez: Bucket map joins fail when hashcode goes negative
Vikram Dixit K created HIVE-11027: - Summary: Hive on tez: Bucket map joins fail when hashcode goes negative Key: HIVE-11027 URL: https://issues.apache.org/jira/browse/HIVE-11027 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0 Reporter: Vikram Dixit K Assignee: Prasanth Jayachandran Seeing an issue when dynamic sort optimization is enabled while doing an insert into a bucketed table. We seem to be flipping the negative sign on the hashcode instead of taking the complement of it for routing the data correctly. This results in correctness issues in bucket map joins in Hive on Tez when the hash code goes negative. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
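The sign-handling pitfall described above can be illustrated outside Hive. Masking the sign bit is a common safe pattern for bucket routing; whether the actual HIVE-11027 fix uses exactly this is not stated in the JIRA, so treat this as a sketch:

```java
public class BucketRoutingSketch {
    // Buggy shape: simply flipping the sign routes rows to a different
    // bucket than the writer's scheme, and -Integer.MIN_VALUE overflows
    // back to Integer.MIN_VALUE, so the result can still be negative.
    static int unsafeBucket(int hash, int numBuckets) {
        int h = hash < 0 ? -hash : hash;
        return h % numBuckets;
    }

    // Common safe pattern: clear the sign bit before taking the modulus,
    // so every hashcode maps consistently into [0, numBuckets).
    static int safeBucket(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }
}
```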
Review Request 35532: HIVE-11025 In windowing spec, when the datatype is decimal, it's comparing the value against NULL value incorrectly
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/35532/ --- Review request for hive. Repository: hive-git Description --- HIVE-11025 In windowing spec, when the datatype is decimal, it's comparing the value against NULL value incorrectly Diffs - data/files/emp2.txt 650aff7f2c8003fb7c04dfa377c2b25d04f3ce88 ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java 32471f2dc864c38a2969909efa5b21508e27d7f8 ql/src/test/queries/clientpositive/windowing_windowspec3.q 608a6cf45e3c1e0b928800dae0470e8acfd77734 ql/src/test/results/clientpositive/windowing_windowspec3.q.out 42c042f2cf80f0a5a8269ad9eb9864d7e76525cc Diff: https://reviews.apache.org/r/35532/diff/ Testing --- Thanks, Aihua Xu
[jira] [Created] (HIVE-11025) In windowing spec, when the datatype is decimal, it's comparing the value against NULL value incorrectly
Aihua Xu created HIVE-11025: --- Summary: In windowing spec, when the datatype is decimal, it's comparing the value against NULL value incorrectly Key: HIVE-11025 URL: https://issues.apache.org/jira/browse/HIVE-11025 Project: Hive Issue Type: Sub-task Components: PTF-Windowing Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Given the following data and query, {noformat}
deptno  empno  bonus  salary
30      7698   NULL   2850.0
30      7900   NULL   950.0
30      7844   0      1500.0

select avg(salary) over (partition by deptno order by bonus range 200 preceding) from emp2;
{noformat} It produces an incorrect result for the row in which bonus=0: 1900.0 1900.0 1766.7 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
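One plausible shape of the null-safe boundary check (a sketch, not the actual WindowingTableFunction code): a NULL order key must never be compared numerically against the range bound, so rows with a NULL key fall outside any numeric `range N preceding` frame.

```java
import java.math.BigDecimal;

public class RangeBoundarySketch {
    // Is rowKey inside "range amt preceding" of currentKey (ascending order)?
    // A NULL key cannot satisfy a numeric range predicate, so return false
    // instead of attempting a comparison against NULL.
    static boolean inRange(BigDecimal rowKey, BigDecimal currentKey, BigDecimal amt) {
        if (rowKey == null || currentKey == null) {
            return false;
        }
        return currentKey.subtract(rowKey).compareTo(amt) <= 0
            && currentKey.compareTo(rowKey) >= 0;
    }
}
```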
Review Request 35543: CBO: Calcite Operator To Hive Operator (Calcite Return Path): dpCtx's mapInputToDP should depend on the last SEL
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/35543/ --- Review request for hive and Ashutosh Chauhan. Repository: hive-git Description --- In the dynamic partitioning case, for example, we are going to have TS0-SEL1-SEL2-FS3. The dpCtx's mapInputToDP is populated by SEL1 rather than SEL2, which causes an error in the return path. Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 58ee605 Diff: https://reviews.apache.org/r/35543/diff/ Testing --- Thanks, pengcheng xiong
[jira] [Created] (HIVE-11029) hadoop.proxyuser.mapr.groups does not work to restrict the groups that can be impersonated
Na Yang created HIVE-11029: -- Summary: hadoop.proxyuser.mapr.groups does not work to restrict the groups that can be impersonated Key: HIVE-11029 URL: https://issues.apache.org/jira/browse/HIVE-11029 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 1.2.0, 1.0.0, 0.14.0, 0.13.0 Reporter: Na Yang Assignee: Na Yang In core-site.xml, hadoop.proxyuser.<user>.groups specifies the user groups which can be impersonated by the HS2 user. However, this does not work properly in Hive. In my core-site.xml, I have the following configs:

<property>
  <name>hadoop.proxyuser.mapr.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.mapr.groups</name>
  <value>root</value>
</property>

I would expect with this configuration that 'mapr' can impersonate only members of the Unix group 'root'. However, if I submit a query as user 'jon', the query runs as user 'jon' even though 'mapr' should not be able to impersonate this user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11028) Tez: table self join and join with another table fails with IndexOutOfBoundsException
Jason Dere created HIVE-11028: - Summary: Tez: table self join and join with another table fails with IndexOutOfBoundsException Key: HIVE-11028 URL: https://issues.apache.org/jira/browse/HIVE-11028 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jason Dere Assignee: Jason Dere {noformat}
create table tez_self_join1(id1 int, id2 string, id3 string);
insert into table tez_self_join1 values(1, 'aa','bb'), (2, 'ab','ab'), (3,'ba','ba');
create table tez_self_join2(id1 int);
insert into table tez_self_join2 values(1),(2),(3);
explain
select s.id2, s.id3
from
(
  select self1.id1, self1.id2, self1.id3
  from tez_self_join1 self1 join tez_self_join1 self2 on self1.id2=self2.id3
) s
join tez_self_join2 on s.id1=tez_self_join2.id1
where s.id2='ab';
{noformat}
fails with error:
{noformat}
2015-06-16 15:41:55,759 ERROR [main]: ql.Driver (SessionState.java:printError(979)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 3, vertexId=vertex_1434494327112_0002_4_04, diagnostics=[Task failed, taskId=task_1434494327112_0002_4_04_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:118)
at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:290)
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:275)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:175)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:313)
at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:71)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.initializeOp(CommonMergeJoinOperator.java:99)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:146)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
... 13 more
{noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11032) Enable more tests for grouping by skewed data [Spark Branch]
Rui Li created HIVE-11032: - Summary: Enable more tests for grouping by skewed data [Spark Branch] Key: HIVE-11032 URL: https://issues.apache.org/jira/browse/HIVE-11032 Project: Hive Issue Type: Sub-task Reporter: Rui Li Priority: Minor Not all such tests are enabled, e.g. {{groupby1_map_skew.q}}. We can use this JIRA to track whether we need more of them. Basically, we need to look at all tests with {{set hive.groupby.skewindata=true;}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11033) BloomFilter index is not honored by ORC reader
Allan Yan created HIVE-11033: Summary: BloomFilter index is not honored by ORC reader Key: HIVE-11033 URL: https://issues.apache.org/jira/browse/HIVE-11033 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Allan Yan There is a bug in the org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl class which causes the bloom filter index saved in the ORC file not to be used. The reason is that the bloomFilterIndices variable defined in the SargApplier class shadows the one in its parent class. Here is one way to fix it {noformat}
18:46 $ diff src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.original
174d173
< bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
178c177
< sarg, options.getColumnNames(), strideRate, types, included.length, bloomFilterIndices);
---
> sarg, options.getColumnNames(), strideRate, types, included.length);
204a204
> bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
673c673
< List<OrcProto.Type> types, int includedCount, OrcProto.BloomFilterIndex[] bloomFilterIndices) {
---
> List<OrcProto.Type> types, int includedCount) {
677c677
< this.bloomFilterIndices = bloomFilterIndices;
---
> bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
{noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11034) Multiple join table producing different results
Srini Pindi created HIVE-11034: -- Summary: Multiple join table producing different results Key: HIVE-11034 URL: https://issues.apache.org/jira/browse/HIVE-11034 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Environment: Linux 2.6.32-279.19.1.el6.x86_64 Reporter: Srini Pindi Priority: Critical A join between one main table and other tables on different join columns returns wrong results in Hive. Changing the order of the joins between the main table and the other tables produces different results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11031) ORC concatenation of old files can fail while merging column statistics
Prasanth Jayachandran created HIVE-11031: Summary: ORC concatenation of old files can fail while merging column statistics Key: HIVE-11031 URL: https://issues.apache.org/jira/browse/HIVE-11031 Project: Hive Issue Type: Bug Affects Versions: 1.2.0, 1.0.0, 1.1.0, 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Column statistics in ORC are optional protobuf fields. Old ORC files might not have statistics for newly added types like decimal, date, timestamp etc. But column statistics merging assumes column statistics exist for these types and invokes merge. For example, merging of TimestampColumnStatistics directly casts the received ColumnStatistics object without doing an instanceof check. If the ORC file contains timestamp column statistics then this will work; otherwise it will throw a ClassCastException. Also, the file merge operator swallows the exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
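The instanceof guard the JIRA calls for can be sketched with stand-in types (these are hypothetical classes, not the actual ORC ColumnStatisticsImpl hierarchy):

```java
public class StatsMergeSketch {
    interface ColumnStatistics {}

    static class TimestampColumnStatistics implements ColumnStatistics {
        long maximum = Long.MIN_VALUE;

        // Defensive merge: check instanceof before casting, since an old
        // ORC file may carry no timestamp statistics at all; skipping the
        // merge is one option (the real fix may instead signal an error).
        void merge(ColumnStatistics other) {
            if (!(other instanceof TimestampColumnStatistics)) {
                return; // incompatible stats from an old file: no ClassCastException
            }
            TimestampColumnStatistics ts = (TimestampColumnStatistics) other;
            maximum = Math.max(maximum, ts.maximum);
        }
    }
}
```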
[jira] [Created] (HIVE-11030) Enhance storage layer to create one delta file per write
Eugene Koifman created HIVE-11030: - Summary: Enhance storage layer to create one delta file per write Key: HIVE-11030 URL: https://issues.apache.org/jira/browse/HIVE-11030 Project: Hive Issue Type: Sub-task Components: Transactions Affects Versions: 1.2.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Currently each txn using ACID insert/update/delete will generate a delta directory like delta_100_101. In order to support multi-statement transactions we must generate one delta per operation within the transaction so the deltas would be named like delta_100_101_0001, etc. Support for MERGE (HIVE-10924) would need the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
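The naming scheme from the description can be sketched as a simple formatter. The zero-padding width of the statement id is inferred from the JIRA's own delta_100_101_0001 example; the real layout may pad the transaction ids as well:

```java
public class DeltaNameSketch {
    // One delta directory per statement within a transaction:
    // delta_<minTxn>_<maxTxn>_<stmtId>, e.g. delta_100_101_0001.
    static String deltaDir(long minTxn, long maxTxn, int stmtId) {
        return String.format("delta_%d_%d_%04d", minTxn, maxTxn, stmtId);
    }
}
```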
Re: [ANNOUNCE] New Hive PMC Members - Chao Sun and Gopal Vijayaraghavan
Congrats ! Damien CAROL - tél : +33 (0)4 74 96 88 14 - email : dca...@blitzbs.com BLITZ BUSINESS SERVICE 2015-06-15 18:41 GMT+02:00 Gunther Hagleitner ghagleit...@hortonworks.com: Congrats Chao and Gopal! Cheers, Gunther. From: amareshwarisr . amareshw...@gmail.com Sent: Sunday, June 14, 2015 9:47 PM To: dev@hive.apache.org Subject: Re: [ANNOUNCE] New Hive PMC Members - Chao Sun and Gopal Vijayaraghavan Congratulations Chao and Gopal ! Thanks, Amareshwari On Thu, Jun 11, 2015 at 2:50 AM, Carl Steinbach c...@apache.org wrote: I am pleased to announce that Chao Sun and Gopal Vijayaraghavan have been elected to the Hive Project Management Committee. Please join me in congratulating Chao and Gopal! Thanks. - Carl
Getting ready for 1.2.1
Hi folks, It's been nearly a month since 1.2.0, and when I did that release, I said I'd keep the branch open for any further non-db-changing, non-breaking patches, and from the sheer number of patches registered on the status page, that's been a good idea. Now, I think it's time to start drawing that to a close with a stabilization update, and I would like to begin the process of rolling out release candidates for 1.2.1. I would like to start rolling out an RC0 by Wednesday night if no one objects. For now, the rules on committing to branch-1.2 remain the same: a) commit to branch-1 master first b) add me as a watcher on that jira c) add the bug to the release status wiki. Once I start the release process, I will once again raise the bar for commits as we did the last time. That said, this time, once we finish the release for 1.2.1, the bar on further commits to branch-1.2 is intended to remain at a higher level, so as to make sure we don't have too much of a backporting hassle - we will soon try to limit our commits to branch-1 and master only. Cheers, -Sushanth
[jira] [Created] (HIVE-11020) support partial scan for analyze command - Avro
Bing Li created HIVE-11020: -- Summary: support partial scan for analyze command - Avro Key: HIVE-11020 URL: https://issues.apache.org/jira/browse/HIVE-11020 Project: Hive Issue Type: Improvement Reporter: Bing Li Assignee: Bing Li This is follow up on HIVE-3958. We already have two similar Jiras - support partial scan for analyze command - ORC https://issues.apache.org/jira/browse/HIVE-4177 - [Parquet] Support Analyze Table with partial scan https://issues.apache.org/jira/browse/HIVE-9491 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11019) Can't create an Avro table with uniontype column correctly
Bing Li created HIVE-11019: -- Summary: Can't create an Avro table with uniontype column correctly Key: HIVE-11019 URL: https://issues.apache.org/jira/browse/HIVE-11019 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Bing Li I tried the example in https://cwiki.apache.org/confluence/display/Hive/AvroSerDe and found that it can't create an AVRO table correctly with uniontype:
hive> create table avro_union(union1 uniontype<FLOAT, BOOLEAN, STRING>) STORED AS AVRO;
OK
Time taken: 0.083 seconds
hive> describe avro_union;
OK
union1 uniontype<void,float,boolean,string>
Time taken: 0.058 seconds, Fetched: 1 row(s)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)