[jira] [Updated] (HIVE-11053) Add more tests for HIVE-10844[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GaoLun updated HIVE-11053: -- Attachment: HIVE-11053.3-spark.patch Add more tests for HIVE-10844[Spark Branch] --- Key: HIVE-11053 URL: https://issues.apache.org/jira/browse/HIVE-11053 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chengxiang Li Assignee: GaoLun Priority: Minor Attachments: HIVE-11053.1-spark.patch, HIVE-11053.2-spark.patch, HIVE-11053.3-spark.patch Add some test cases for self union, self-join, CTE, and repeated sub-queries to verify the job of combining equivalent works in HIVE-10844. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11171) Join reordering algorithm might introduce projects between joins
[ https://issues.apache.org/jira/browse/HIVE-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11171: Fix Version/s: 1.2.2 Join reordering algorithm might introduce projects between joins Key: HIVE-11171 URL: https://issues.apache.org/jira/browse/HIVE-11171 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 2.0.0, 1.2.2 Attachments: HIVE-11171.01.patch, HIVE-11171.02.patch, HIVE-11171.03.patch, HIVE-11171.5.patch, HIVE-11171.patch, HIVE-11171.patch Join reordering algorithm might introduce projects between joins which causes multijoin optimization in SemanticAnalyzer to not kick in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10292) Add support for HS2 to use custom authentication class with kerberos environment
[ https://issues.apache.org/jira/browse/HIVE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HeeSoo Kim reassigned HIVE-10292: - Assignee: HeeSoo Kim Add support for HS2 to use custom authentication class with kerberos environment Key: HIVE-10292 URL: https://issues.apache.org/jira/browse/HIVE-10292 Project: Hive Issue Type: New Feature Components: HiveServer2 Affects Versions: 1.1.0 Reporter: Heesoo Kim Assignee: HeeSoo Kim Attachments: HIVE-10292.patch In a Kerberos environment, HiveServer2 only supports the GSSAPI and DIGEST-MD5 authentication mechanisms. We would like to add the ability to use a custom authentication class in conjunction with Kerberos. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filters of join operator causes NPE exception
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10882: --- Summary: CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filters of join operator causes NPE exception (was: CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filterMap of join operator causes NPE exception) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filters of join operator causes NPE exception Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez The CBO return path creates a join operator with empty filters. However, vectorization checks the filters of bigTable in the join. This causes an NPE. To reproduce, run vector_outer_join2.q with the return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10882) CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filters of join operator causes NPE exception
[ https://issues.apache.org/jira/browse/HIVE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-10882. Resolution: Cannot Reproduce CBO: Calcite Operator To Hive Operator (Calcite Return Path) empty filters of join operator causes NPE exception Key: HIVE-10882 URL: https://issues.apache.org/jira/browse/HIVE-10882 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez The CBO return path creates a join operator with empty filters. However, vectorization checks the filters of bigTable in the join. This causes an NPE. To reproduce, run vector_outer_join2.q with the return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance
[ https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-11131: --- Attachment: HIVE-11131.4.patch Get row information on DataWritableWriter once for better writing performance - Key: HIVE-11131 URL: https://issues.apache.org/jira/browse/HIVE-11131 Project: Hive Issue Type: Sub-task Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-11131.2.patch, HIVE-11131.3.patch, HIVE-11131.4.patch DataWritableWriter is a class used to write Hive records to Parquet files. This class is getting all the information about how to parse a record, such as schema and object inspector, every time a record is written (or write() is called). We can make this class perform better by initializing some writers per data type once, and saving all object inspectors on each writer. The class expects that the next records written will have the same object inspectors and schema, so there is no need to have conditions for that. When a new schema is written, DataWritableWriter is created again by Parquet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
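The one-time-initialization pattern described above (resolve a writer per field type once, so write() never re-inspects the schema) can be sketched in miniature. This is an illustrative Python sketch only; the class, type names, and return values are hypothetical, not the actual Hive/Parquet classes:

```python
class RecordWriter:
    """Resolve one writer per field type at construction; write() just dispatches.

    Illustrative sketch of the DataWritableWriter optimization: the expensive
    type inspection happens once per schema, mirroring the fact that Parquet
    recreates the writer whenever a new schema is written.
    """

    # Hypothetical per-type writers; real code would hold object inspectors.
    _WRITERS = {
        "int":    lambda v: ("int32", v),
        "string": lambda v: ("binary", str(v)),
    }

    def __init__(self, schema):
        # Done once per schema, not once per record.
        self._field_writers = [(name, self._WRITERS[typ]) for name, typ in schema]

    def write(self, record):
        # Per-record hot path: no type inspection, just dispatch.
        return [writer(record[name]) for name, writer in self._field_writers]

w = RecordWriter([("id", "int"), ("name", "string")])
print(w.write({"id": 7, "name": "x"}))  # [('int32', 7), ('binary', 'x')]
```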
[jira] [Commented] (HIVE-11192) Wrong results for query with WHERE ... NOT IN when table has null values
[ https://issues.apache.org/jira/browse/HIVE-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616963#comment-14616963 ] Furcy Pin commented on HIVE-11192: -- My apologies, this was not a bug but the expected (confusing IMHO) behavior for SQL. SELECT 1 IN (1,2,3,NULL) ; true SELECT 1 IN (2,3) ; false SELECT 1 IN (2,3,NULL) ; NULL SELECT 1 NOT IN (1,2,3,NULL) ; false SELECT 1 NOT IN (2,3,NULL) ; NULL SELECT 1 NOT IN (2,3) ; true Wrong results for query with WHERE ... NOT IN when table has null values Key: HIVE-11192 URL: https://issues.apache.org/jira/browse/HIVE-11192 Project: Hive Issue Type: Bug Affects Versions: 1.1.0, 1.2.1 Environment: Hive on MR Reporter: Furcy Pin I tested this on cdh5.4.2 cluster and locally on the release-1.2.1 branch ```sql DROP TABLE IF EXISTS test1 ; DROP TABLE IF EXISTS test2 ; CREATE TABLE test1 (col1 STRING) ; INSERT INTO TABLE test1 VALUES (1), (2), (3), (4) ; CREATE TABLE test2 (col1 STRING) ; INSERT INTO TABLE test2 VALUES (1), (4), (NULL) ; SELECT COUNT(1) FROM test1 T1 WHERE T1.col1 NOT IN (SELECT col1 FROM test2) ; SELECT COUNT(1) FROM test1 T1 WHERE T1.col1 NOT IN (SELECT col1 FROM test2 WHERE col1 IS NOT NULL) ; ``` The first query returns 0 and the second returns 2. Obviously, the expected answer is always 2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
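The three-valued logic described in the comment is standard SQL, so it can be reproduced outside Hive; a minimal sketch using Python's sqlite3 in place of Hive (table names mirror the report):

```python
import sqlite3

# x NOT IN (a, b, NULL) evaluates to NULL (not TRUE) even when x matches
# nothing, so the WHERE clause filters out every row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test1 (col1 TEXT);
    INSERT INTO test1 VALUES (1), (2), (3), (4);
    CREATE TABLE test2 (col1 TEXT);
    INSERT INTO test2 VALUES (1), (4), (NULL);
""")

with_null = conn.execute(
    "SELECT COUNT(1) FROM test1 WHERE col1 NOT IN (SELECT col1 FROM test2)"
).fetchone()[0]
without_null = conn.execute(
    "SELECT COUNT(1) FROM test1 WHERE col1 NOT IN "
    "(SELECT col1 FROM test2 WHERE col1 IS NOT NULL)"
).fetchone()[0]

print(with_null, without_null)  # 0 2
```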
[jira] [Commented] (HIVE-11053) Add more tests for HIVE-10844[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617019#comment-14617019 ] Hive QA commented on HIVE-11053: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12743977/HIVE-11053.3-spark.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 7993 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.initializationError org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned org.apache.hive.jdbc.TestSSL.testSSLConnectionWithURL {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/923/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/923/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-923/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12743977 - PreCommit-HIVE-SPARK-Build Add more tests for HIVE-10844[Spark Branch] --- Key: HIVE-11053 URL: https://issues.apache.org/jira/browse/HIVE-11053 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chengxiang Li Assignee: GaoLun Priority: Minor Attachments: HIVE-11053.1-spark.patch, HIVE-11053.2-spark.patch, HIVE-11053.3-spark.patch Add some test cases for self union, self-join, CTE, and repeated sub-queries to verify the job of combining equivalent works in HIVE-10844.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11184) Lineage - ExprProcFactory#getExprString may throw NullPointerException
[ https://issues.apache.org/jira/browse/HIVE-11184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616975#comment-14616975 ] Chao Sun commented on HIVE-11184: - +1 Lineage - ExprProcFactory#getExprString may throw NullPointerException -- Key: HIVE-11184 URL: https://issues.apache.org/jira/browse/HIVE-11184 Project: Hive Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 2.0.0 Attachments: HIVE-11184.1.patch ColumnInfo may have null alias. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11190) ConfVars.METASTORE_FILTER_HOOK in authorization V2 should not be hard code when the value is not default
[ https://issues.apache.org/jira/browse/HIVE-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617059#comment-14617059 ] Thejas M Nair commented on HIVE-11190: -- [~dapengsun] When V2 authorization mode is used, HiveAuthorizer.filterListCmdObjects is expected to be called by authorization implementations. The SessionState.setAuthorizerV2Config call sets up the config so that things work appropriately for V2 authorization, and it includes this change. My concern is that a user might set this field to a custom value without ensuring that the HiveAuthorizer call also gets made. What is the use case you have in mind? Are you using SQL standard authorization in this case or another custom V2 authorization implementation? Would it work to move the logic from your custom filter hook to the HiveAuthorizer.filterListCmdObjects implementation? ConfVars.METASTORE_FILTER_HOOK in authorization V2 should not be hard code when the value is not default Key: HIVE-11190 URL: https://issues.apache.org/jira/browse/HIVE-11190 Project: Hive Issue Type: Bug Reporter: Dapeng Sun Assignee: Dapeng Sun Attachments: HIVE-11190.001.patch ConfVars.METASTORE_FILTER_HOOK in authorization V2 should not be hard-coded when the value is not the default; hard-coding it prevents users from customizing the METASTORE_FILTER_HOOK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10838) Allow the Hive metastore client to bind to a specific address when connecting to the server
[ https://issues.apache.org/jira/browse/HIVE-10838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HIVE-10838: Affects Version/s: 1.2.0 Allow the Hive metastore client to bind to a specific address when connecting to the server --- Key: HIVE-10838 URL: https://issues.apache.org/jira/browse/HIVE-10838 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: HeeSoo Kim Assignee: HeeSoo Kim Attachments: HIVE-10838.patch +*In a cluster with Kerberos authentication*+ When a Hive metastore client (e.g. HS2, Oozie) has been configured with a logical hostname (e.g. hiveserver/hiveserver_logical_hostn...@example.com), it still uses its physical hostname to try to connect to the Hive metastore. For example, we specify, in hive-site.xml:
{noformat}
<property>
  <name>hive.server2.authentication.kerberos.principal</name>
  <value>hiveserver/hiveserver_logical_hostn...@example.com</value>
</property>
{noformat}
When the client tried to get a delegation token from the metastore, an exception occurred:
{noformat}
2015-05-21 23:17:59,554 ERROR metadata.Hive (Hive.java:getDelegationToken(2638)) - MetaException(message:Unauthorized connection for super-user: hiveserver/hiveserver_logical_hostn...@example.com from IP 10.250.16.43)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result.read(ThriftHiveMetastore.java)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_delegation_token(ThriftHiveMetastore.java:3293)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_delegation_token(ThriftHiveMetastore.java:3279)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDelegationToken(HiveMetaStoreClient.java:1559)
{noformat}
We need to set the bind address when the Hive metastore client tries to connect to the Hive metastore based on the logical Kerberos hostname. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-9557) create UDF to measure strings similarity using Cosine Similarity algo
[ https://issues.apache.org/jira/browse/HIVE-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov reopened HIVE-9557: --- I take it create UDF to measure strings similarity using Cosine Similarity algo - Key: HIVE-9557 URL: https://issues.apache.org/jira/browse/HIVE-9557 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Labels: CosineSimilarity, SimilarityMetric, UDF algo description http://en.wikipedia.org/wiki/Cosine_similarity {code} --one word different, total 2 words str_sim_cosine('Test String1', 'Test String2') = (2 - 1) / 2 = 0.5f {code} reference implementation: https://github.com/Simmetrics/simmetrics/blob/master/src/uk/ac/shef/wit/simmetrics/similaritymetrics/CosineSimilarity.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
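The example in the {code} block above can be read as a word-set cosine similarity; the following Python sketch assumes whitespace tokenization over word-frequency vectors, and the function name merely mirrors the proposed UDF rather than its actual implementation:

```python
import math
from collections import Counter

def str_sim_cosine(s1, s2):
    """Cosine similarity between the word-frequency vectors of two strings.

    Illustrative sketch: tokenize on whitespace, build frequency vectors,
    and take dot(a, b) / (|a| * |b|).
    """
    a, b = Counter(s1.split()), Counter(s2.split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# One shared word out of two per string: 1 / (sqrt(2) * sqrt(2)) = 0.5,
# matching the example from the description.
print(str_sim_cosine('Test String1', 'Test String2'))  # 0.5
```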
[jira] [Updated] (HIVE-11188) Make ORCFile's String Dictionary more efficient
[ https://issues.apache.org/jira/browse/HIVE-11188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-11188: -- Description: Currently, ORCFile's String Dictionary uses StringRedBlackTree for adding/finding/sorting duplicate strings. When there are a large number of unique strings (let's say over 16K) and a large number of rows (let's say 1M), the binary search will take O(1M * log(16K)) time, which can be very long. Alternatively, ORCFile's String Dictionary can use a HashMap for adding/finding duplicate strings, and use quicksort at the end to produce a sorted order. In the same case above, the total time spent will be O(1M + 16K * log(16K)), which is much faster. When the number of unique strings is close to the number of rows (let's say, both around 1M), ORC will automatically disable the dictionary encoding. The old approach will take O(1M * log(1M)), while the new approach will take O(1M) since we can skip the final quicksort if the dictionary encoding is disabled. So in either case, the new approach should be a win. Here is a PMP output based on ~600 traces (so 126 means 126/600 ~= 21% of total time). It's a query like INSERT OVERWRITE TABLE target SELECT * FROM src using hive-1.1.0-cdh-5.4.1. The target TABLE is STORED AS ORC, and the src TABLE is STORED AS RCFILE. 
126 org.apache.hadoop.hive.ql.io.orc.StringRedBlackTree.compareValue(StringRedBlackTree.java:67) 35 java.util.zip.Deflater.deflateBytes(Native Method) 26 org.apache.hadoop.hive.ql.io.orc.SerializationUtils.findClosestNumBits(SerializationUtils.java:218) 24 org.apache.hadoop.hive.serde2.lazy.LazyNonPrimitive.isNull(LazyNonPrimitive.java:63) 22 org.apache.hadoop.hive.serde2.lazy.LazyMap.parse(LazyMap.java:204) 22 org.apache.hadoop.hive.serde2.lazy.LazyLong.parseLong(LazyLong.java:116) 21 org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase$FieldInfo.uncheckedGetField(ColumnarStructBase.java:111) 19 org.apache.hadoop.hive.serde2.lazy.LazyPrimitive.hashCode(LazyPrimitive.java:57) 18 org.apache.hadoop.hive.ql.io.orc.RedBlackTree.getRight(RedBlackTree.java:99) 16 org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1932) 15 org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native Method) 15 org.apache.hadoop.hive.ql.io.orc.WriterImpl$IntegerTreeWriter.write(WriterImpl.java:929) 12 org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:1607) 12 org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) 11 org.apache.hadoop.hive.ql.io.orc.RedBlackTree.getLeft(RedBlackTree.java:92) 11 org.apache.hadoop.hive.ql.io.orc.DynamicIntArray.add(DynamicIntArray.java:105) 10 org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) ... was: Currently, ORCFile's String Dictionary uses StringRedBlackTree for adding/finding/sorting duplicate strings. When there are a large number of unique strings (let's say over 16K) and a large number of rows (let's say 1M), the binary search will take O(1M * log(16K)) time which can be very long. Alternatively, ORCFile's String Dictionary can use HashMap for adding/finding duplicate strings, and use quicksort at the end to produce a sorted order. In the same case above, the total time spent will be O(1M + 16K * log(16K)) which is much faster. 
When the number of unique strings is close to the number of rows (let's say, both around 1M), ORC will automatically disable the dictionary encoding. The old approach will take O(1M * log(1M)), while the new approach will take O(1M) since we can skip the final quicksort if the dictionary encoding is disabled. So in either case, the new approach should be a win. Here is a PMP output based on ~600 traces (so 126 means 126/600 ~= 21% of total time). It's a query like INSERT OVERWRITE TABLE SELECT * FROM src using hive-1.1.0-cdh-5.4.1. 126 org.apache.hadoop.hive.ql.io.orc.StringRedBlackTree.compareValue(StringRedBlackTree.java:67) 35 java.util.zip.Deflater.deflateBytes(Native Method) 26 org.apache.hadoop.hive.ql.io.orc.SerializationUtils.findClosestNumBits(SerializationUtils.java:218) 24 org.apache.hadoop.hive.serde2.lazy.LazyNonPrimitive.isNull(LazyNonPrimitive.java:63) 22 org.apache.hadoop.hive.serde2.lazy.LazyMap.parse(LazyMap.java:204) 22 org.apache.hadoop.hive.serde2.lazy.LazyLong.parseLong(LazyLong.java:116) 21 org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase$FieldInfo.uncheckedGetField(ColumnarStructBase.java:111) 19 org.apache.hadoop.hive.serde2.lazy.LazyPrimitive.hashCode(LazyPrimitive.java:57) 18 org.apache.hadoop.hive.ql.io.orc.RedBlackTree.getRight(RedBlackTree.java:99) 16 org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1932) 15
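The proposed hash-map-plus-final-sort scheme can be sketched as follows; the class and method names here are illustrative, not the actual ORC writer API:

```python
class StringDictionary:
    """Hash-map-based dictionary sketch: O(1) amortized add, one sort at flush.

    Replaces the per-add O(log k) red-black-tree lookup with a dict lookup;
    the O(k log k) sort runs once when the stripe is written, and can be
    skipped entirely if dictionary encoding gets disabled.
    """

    def __init__(self):
        self._ids = {}  # string -> insertion id

    def add(self, value):
        # Returns a stable id for the string, deduplicating as it goes.
        return self._ids.setdefault(value, len(self._ids))

    def flush(self):
        # Sorted once at the end, instead of keeping the tree sorted per add.
        return sorted(self._ids)

d = StringDictionary()
for row in ["b", "a", "b", "c", "a"]:
    d.add(row)
print(d.flush())  # ['a', 'b', 'c']
```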
[jira] [Commented] (HIVE-10791) Beeline-CLI: Implement in-place update UI for CLI compatibility
[ https://issues.apache.org/jira/browse/HIVE-10791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616302#comment-14616302 ] Ferdinand Xu commented on HIVE-10791: - Hi [~gopalv], do you mean the hive web interface in this jira? https://cwiki.apache.org/confluence/display/Hive/HiveWebInterface Beeline-CLI: Implement in-place update UI for CLI compatibility --- Key: HIVE-10791 URL: https://issues.apache.org/jira/browse/HIVE-10791 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: beeline-cli-branch Reporter: Gopal V Priority: Critical The current CLI implementation has an in-place updating UI which offers a clear picture of execution runtime and failures. This is designed for large DAGs which have more than 10 verticles, where the old UI would scroll sideways. The new CLI implementation needs to keep up the usability standards set by the old one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11110) Enable HiveJoinAddNotNullRule in CBO
[ https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11110: Attachment: HIVE-11110-branch-1.2.patch Patch for branch-1.2 Enable HiveJoinAddNotNullRule in CBO Key: HIVE-11110 URL: https://issues.apache.org/jira/browse/HIVE-11110 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.patch Query
{code}
select count(*)
from store_sales, store_returns, date_dim d1, date_dim d2
where d1.d_quarter_name = '2000Q1'
  and d1.d_date_sk = ss_sold_date_sk
  and ss_customer_sk = sr_customer_sk
  and ss_item_sk = sr_item_sk
  and ss_ticket_number = sr_ticket_number
  and sr_returned_date_sk = d2.d_date_sk
  and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
{code}
The store_sales table is partitioned on ss_sold_date_sk, which is also used in a join clause. The join clause should add a filter "filterExpr: ss_sold_date_sk is not null", which should get pushed to the MetaStore when fetching the stats. Currently this is not done in CBO planning, which results in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in the optimization phase. In particular, this increases the NDV for the join columns and may result in wrong planning. Including HiveJoinAddNotNullRule in the optimization phase solves this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-6859) Test JIRA
[ https://issues.apache.org/jira/browse/HIVE-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho reassigned HIVE-6859: --- Assignee: Szehon Ho Test JIRA - Key: HIVE-6859 URL: https://issues.apache.org/jira/browse/HIVE-6859 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-6859.1.patch, HIVE-6859.2.patch, HIVE-6859.patch, HIVE-6891.4.patch, HIVE-6891.5.patch, HIVE-6891.6.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11118) Load data query should validate file formats with destination tables
[ https://issues.apache.org/jira/browse/HIVE-11118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617271#comment-14617271 ] Sushanth Sowmyan commented on HIVE-11118: - I have a question here - I will open another bug if need be, but if it's a simple misunderstanding, it won't matter. From the patch, I see the following bit:
{code}
private void ensureFileFormatsMatch(TableSpec ts, URI fromURI) throws SemanticException {
  Class<? extends InputFormat> destInputFormat = ts.tableHandle.getInputFormatClass();
  // Other file formats should do similar check to make sure file formats match
  // when doing LOAD DATA .. INTO TABLE
  if (OrcInputFormat.class.equals(destInputFormat)) {
    Path inputFilePath = new Path(fromURI);
    try {
      FileSystem fs = FileSystem.get(fromURI, conf);
      // just creating orc reader is going to do sanity checks to make sure its valid ORC file
      OrcFile.createReader(fs, inputFilePath);
    } catch (FileFormatException e) {
      throw new SemanticException(ErrorMsg.INVALID_FILE_FORMAT_IN_LOAD.getMsg("Destination " +
          "table is stored as ORC but the file being loaded is not a valid ORC file."));
    } catch (IOException e) {
      throw new SemanticException("Unable to load data to destination table. " +
          "Error: " + e.getMessage());
    }
  }
}
{code}
Now, it's entirely possible that the table in question is an ORC table, but the partition being loaded is of another format, such as Text - Hive supports mixed partition scenarios. In fact, this is a likely scenario in the case of a replication of a table that used to be Text, but has been converted to Orc, so that all new partitions will be orc. Then, in that case, the destination table will be a MANAGED_TABLE, and will be an orc table, but import will try to load a text partition onto it. Shouldn't this refer to a partitionspec rather than the table's inputformat for this check to work with that scenario? 
Load data query should validate file formats with destination tables Key: HIVE-11118 URL: https://issues.apache.org/jira/browse/HIVE-11118 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11118.2.patch, HIVE-11118.3.patch, HIVE-11118.4.patch, HIVE-11118.patch Load data local inpath queries do not do any validation wrt file format. If the destination table is ORC and we try to load files that are not ORC, the load will succeed but querying such tables will result in runtime exceptions. We can do some simple sanity checks to prevent loading files that do not match the destination table file format. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11181) Update people change for new PMC members
[ https://issues.apache.org/jira/browse/HIVE-11181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617218#comment-14617218 ] Chao Sun commented on HIVE-11181: - Hi [~Ferd], can you give a review on this? Thanks. Update people change for new PMC members Key: HIVE-11181 URL: https://issues.apache.org/jira/browse/HIVE-11181 Project: Hive Issue Type: Task Components: Website Reporter: Chao Sun Assignee: Chao Sun Attachments: HIVE-11181.patch As suggested in the title. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11141) Improve RuleRegExp when the Expression node stack gets huge
[ https://issues.apache.org/jira/browse/HIVE-11141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11141: - Attachment: HIVE-11141.5.patch Improve RuleRegExp when the Expression node stack gets huge --- Key: HIVE-11141 URL: https://issues.apache.org/jira/browse/HIVE-11141 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11141.1.patch, HIVE-11141.2.patch, HIVE-11141.3.patch, HIVE-11141.4.patch, HIVE-11141.5.patch, SQLQuery10.sql.mssql, createtable.rtf Hive occasionally gets bottlenecked on generating plans for large queries; in the majority of cases, time is spent fetching metadata and partitions and applying optimizer transformation rules. I have attached the query for the test case, which needs to be run after we set up the database as shown below. {code} create database dataset_3; use database dataset_3; {code} createtable.rtf - create table command SQLQuery10.sql.mssql - explain query The most problematic part of the code, as the stack gets arbitrarily long, is in RuleRegExp.java:
{code}
@Override
public int cost(Stack<Node> stack) throws SemanticException {
  int numElems = (stack != null ? stack.size() : 0);
  String name = "";
  for (int pos = numElems - 1; pos >= 0; pos--) {
    name = stack.get(pos).getName() + "%" + name;
    Matcher m = pattern.matcher(name);
    if (m.matches()) {
      return m.group().length();
    }
  }
  return -1;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
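One way to see why the loop above hurts, and a possible direction for the improvement (a sketch only, not the committed patch): for rules whose pattern is just a concatenation of fixed node names, the pattern can be compared directly against the top of the stack, instead of rebuilding an ever-growing string and re-running the regex at every depth (which is quadratic in the stack size):

```python
def cost_fixed_pattern(pattern_names, stack_names):
    """Match a fixed-name rule against the top of the expression node stack.

    Sketch of an O(len(pattern_names)) check replacing the quadratic
    concatenate-and-rematch loop for patterns with no real wildcards.
    """
    if len(pattern_names) > len(stack_names):
        return -1
    # Compare the last pattern element against the top of the stack, and so on
    # downward; any mismatch means the rule does not apply.
    for i, name in enumerate(reversed(pattern_names)):
        if stack_names[len(stack_names) - 1 - i] != name:
            return -1
    # Mirror the original contract: return the length of the matched text,
    # counting the "%" separator appended after each node name.
    return sum(len(n) + 1 for n in pattern_names)

stack = ["TS", "FIL", "SEL", "GBY"]
print(cost_fixed_pattern(["SEL", "GBY"], stack))  # 8, i.e. len("SEL%GBY%")
```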
[jira] [Commented] (HIVE-11078) Enhance DbLockManger to support multi-statement txns
[ https://issues.apache.org/jira/browse/HIVE-11078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617255#comment-14617255 ] Eugene Koifman commented on HIVE-11078: --- HIVE-10165 needs a deadlock detector as well Enhance DbLockManger to support multi-statement txns Key: HIVE-11078 URL: https://issues.apache.org/jira/browse/HIVE-11078 Project: Hive Issue Type: Sub-task Components: Locking, Transactions Affects Versions: 1.2.0 Reporter: Eugene Koifman Assignee: Eugene Koifman need to build deadlock detection, etc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6859) Test JIRA
[ https://issues.apache.org/jira/browse/HIVE-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6859: Attachment: HIVE-6891.7.patch Test JIRA - Key: HIVE-6859 URL: https://issues.apache.org/jira/browse/HIVE-6859 Project: Hive Issue Type: Bug Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-6859.1.patch, HIVE-6859.2.patch, HIVE-6859.patch, HIVE-6891.4.patch, HIVE-6891.5.patch, HIVE-6891.6.patch, HIVE-6891.7.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11141) Improve RuleRegExp when the Expression node stack gets huge
[ https://issues.apache.org/jira/browse/HIVE-11141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11141: - Attachment: (was: HIVE-11141.5.patch) Improve RuleRegExp when the Expression node stack gets huge --- Key: HIVE-11141 URL: https://issues.apache.org/jira/browse/HIVE-11141 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11141.1.patch, HIVE-11141.2.patch, HIVE-11141.3.patch, HIVE-11141.4.patch, HIVE-11141.5.patch, SQLQuery10.sql.mssql, createtable.rtf Hive occasionally gets bottlenecked on generating plans for large queries; in the majority of cases, time is spent fetching metadata and partitions and applying optimizer transformation rules. I have attached the query for the test case, which needs to be run after we set up the database as shown below. {code} create database dataset_3; use database dataset_3; {code} createtable.rtf - create table command SQLQuery10.sql.mssql - explain query The most problematic part of the code, as the stack gets arbitrarily long, is in RuleRegExp.java:
{code}
@Override
public int cost(Stack<Node> stack) throws SemanticException {
  int numElems = (stack != null ? stack.size() : 0);
  String name = "";
  for (int pos = numElems - 1; pos >= 0; pos--) {
    name = stack.get(pos).getName() + "%" + name;
    Matcher m = pattern.matcher(name);
    if (m.matches()) {
      return m.group().length();
    }
  }
  return -1;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11172) Vectorization wrong results for aggregate query with where clause without group by
[ https://issues.apache.org/jira/browse/HIVE-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11172: - Attachment: HIVE-11172.2.patch Vectorization wrong results for aggregate query with where clause without group by -- Key: HIVE-11172 URL: https://issues.apache.org/jira/browse/HIVE-11172 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 0.14.0 Reporter: Yi Zhang Assignee: Hari Sankar Sivarama Subramaniyan Priority: Critical Attachments: HIVE-11172.1.patch, HIVE-11172.2.patch
create table testvec(id int, dt int, greg_dt string) stored as orc;
insert into table testvec values (1,20150330, '2015-03-30'), (2,20150301, '2015-03-01'), (3,20150502, '2015-05-02'), (4,20150401, '2015-04-01'), (5,20150313, '2015-03-13'), (6,20150314, '2015-03-14'), (7,20150404, '2015-04-04');
hive> select dt, greg_dt from testvec where id=5;
OK
20150313 2015-03-13
Time taken: 4.435 seconds, Fetched: 1 row(s)
hive> set hive.vectorized.execution.enabled=true;
hive> set hive.map.aggr;
hive.map.aggr=true
hive> select max(dt), max(greg_dt) from testvec where id=5;
OK
20150313 2015-03-30
hive> set hive.vectorized.execution.enabled=false;
hive> select max(dt), max(greg_dt) from testvec where id=5;
OK
20150313 2015-03-13
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11187) LLAP: clean up shuffle directories when DAG completes
[ https://issues.apache.org/jira/browse/HIVE-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11187: Summary: LLAP: clean up shuffle directories when DAG completes (was: LLAP: clean up shuffle directories and cached hashtables when DAG completes) LLAP: clean up shuffle directories when DAG completes - Key: HIVE-11187 URL: https://issues.apache.org/jira/browse/HIVE-11187 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11141) Improve RuleRegExp when the Expression node stack gets huge
[ https://issues.apache.org/jira/browse/HIVE-11141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11141: - Attachment: HIVE-11141.5.patch Improve RuleRegExp when the Expression node stack gets huge --- Key: HIVE-11141 URL: https://issues.apache.org/jira/browse/HIVE-11141 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11141.1.patch, HIVE-11141.2.patch, HIVE-11141.3.patch, HIVE-11141.4.patch, HIVE-11141.5.patch, SQLQuery10.sql.mssql, createtable.rtf Hive occasionally gets bottlenecked on generating plans for large queries; in the majority of cases, time is spent fetching metadata and partitions and running optimizer transformation rules. I have attached the query for the test case, which needs to be run after we set up the database as shown below. {code}
create database dataset_3;
use dataset_3;
{code} createtable.rtf - create table command SQLQuery10.sql.mssql - explain query The most problematic part of the code, as the stack gets arbitrarily long, seems to be in RuleRegExp.java: {code}
@Override
public int cost(Stack<Node> stack) throws SemanticException {
  int numElems = (stack != null ? stack.size() : 0);
  String name = "";
  for (int pos = numElems - 1; pos >= 0; pos--) {
    name = stack.get(pos).getName() + "%" + name;
    Matcher m = pattern.matcher(name);
    if (m.matches()) {
      return m.group().length();
    }
  }
  return -1;
}
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
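The cost() method above rebuilds the concatenated name string and re-runs the regex for every element of an arbitrarily long stack, which is where the quadratic time goes. A minimal sketch of one possible mitigation, assuming the rule's regex contains no unbounded quantifiers so a maximum matchable length can be precomputed (the method signature, parameters, and class name here are illustrative, not Hive's actual fix):

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundedRuleCost {
    // Walk the node-name stack top-down as before, but stop as soon as the
    // accumulated name is longer than any string the pattern could match.
    // maxPatternLength is assumed to be precomputed from the rule's regex;
    // this is only valid for patterns without '*' or '+'.
    static int cost(List<String> nodeNames, Pattern pattern, int maxPatternLength) {
        StringBuilder name = new StringBuilder();
        for (int pos = nodeNames.size() - 1; pos >= 0; pos--) {
            name.insert(0, nodeNames.get(pos) + "%");
            if (name.length() > maxPatternLength) {
                return -1; // the string only grows; no further match is possible
            }
            Matcher m = pattern.matcher(name);
            if (m.matches()) {
                return m.group().length();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> stack = Arrays.asList("TS", "FIL", "SEL");
        System.out.println(cost(stack, Pattern.compile("FIL%SEL%"), 8)); // match found
        System.out.println(cost(stack, Pattern.compile("FIL%SEL%"), 3)); // bails out early
    }
}
```

The point of the bound is that with a huge stack and a short pattern, the loop terminates after a few iterations instead of matching ever-longer strings all the way to the bottom.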
[jira] [Commented] (HIVE-11118) Load data query should validate file formats with destination tables
[ https://issues.apache.org/jira/browse/HIVE-11118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617264#comment-14617264 ] Sushanth Sowmyan commented on HIVE-11118: - Thanks, [~leftylev]! Added. Load data query should validate file formats with destination tables Key: HIVE-11118 URL: https://issues.apache.org/jira/browse/HIVE-11118 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11118.2.patch, HIVE-11118.3.patch, HIVE-11118.4.patch, HIVE-11118.patch Load data local inpath queries do not do any validation wrt file format. If the destination table is ORC and we try to load files that are not ORC, the load will succeed, but querying such tables will result in runtime exceptions. We can do some simple sanity checks to prevent loading files that do not match the destination table file format. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11118) Load data query should validate file formats with destination tables
[ https://issues.apache.org/jira/browse/HIVE-11118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-11118: Fix Version/s: 2.0.0 1.3.0 Load data query should validate file formats with destination tables Key: HIVE-11118 URL: https://issues.apache.org/jira/browse/HIVE-11118 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11118.2.patch, HIVE-11118.3.patch, HIVE-11118.4.patch, HIVE-11118.patch Load data local inpath queries do not do any validation wrt file format. If the destination table is ORC and we try to load files that are not ORC, the load will succeed, but querying such tables will result in runtime exceptions. We can do some simple sanity checks to prevent loading files that do not match the destination table file format. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11104) Select operator doesn't propagate constants appearing in expressions
[ https://issues.apache.org/jira/browse/HIVE-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-11104: Fix Version/s: 1.3.0 Select operator doesn't propagate constants appearing in expressions Key: HIVE-11104 URL: https://issues.apache.org/jira/browse/HIVE-11104 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11104.2.patch, HIVE-11104.3.patch, HIVE-11104.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11005) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : Regression on the latest master
[ https://issues.apache.org/jira/browse/HIVE-11005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-11005. Resolution: Cannot Reproduce These failures have been fixed as part of HIVE-10533. CBO: Calcite Operator To Hive Operator (Calcite Return Path) : Regression on the latest master -- Key: HIVE-11005 URL: https://issues.apache.org/jira/browse/HIVE-11005 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Jesus Camacho Rodriguez Tests cbo_join.q and cbo_views.q on the return path failed. Part of the stack trace is {code}
2015-06-15 09:51:53,377 ERROR [main]: parse.CalcitePlanner (CalcitePlanner.java:genOPTree(282)) - CBO failed, skipping CBO. java.lang.IndexOutOfBoundsException: index (0) must be less than size (0)
at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305)
at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284)
at com.google.common.collect.EmptyImmutableList.get(EmptyImmutableList.java:80)
at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveInsertExchange4JoinRule.onMatch(HiveInsertExchange4JoinRule.java:101)
at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:326)
at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:515)
at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:392)
at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:255)
at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:125)
at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:207)
at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:194)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:888)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:771)
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109)
at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:876)
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11110) Enable HiveJoinAddNotNullRule in CBO
[ https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11110: Attachment: HIVE-11110.6.patch Reordered rules so that {{is not null}} filter gets properly attached and pushed to table scans. Enable HiveJoinAddNotNullRule in CBO Key: HIVE-11110 URL: https://issues.apache.org/jira/browse/HIVE-11110 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11110.1.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.patch Query {code}
select count(*)
from store_sales
    ,store_returns
    ,date_dim d1
    ,date_dim d2
where d1.d_quarter_name = '2000Q1'
  and d1.d_date_sk = ss_sold_date_sk
  and ss_customer_sk = sr_customer_sk
  and ss_item_sk = sr_item_sk
  and ss_ticket_number = sr_ticket_number
  and sr_returned_date_sk = d2.d_date_sk
  and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
{code} The store_sales table is partitioned on ss_sold_date_sk, which is also used in a join clause. The join clause should add a filter "filterExpr: ss_sold_date_sk is not null", which should get pushed to the MetaStore when fetching the stats. Currently this is not done in CBO planning, which results in the stats from __HIVE_DEFAULT_PARTITION__ being fetched and considered in the optimization phase. In particular, this increases the NDV for the join columns and may result in wrong planning. Including HiveJoinAddNotNullRule in the optimization phase solves this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11192) Wrong results for query with WHERE ... NOT IN when table has null values
[ https://issues.apache.org/jira/browse/HIVE-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-11192. - Resolution: Not A Problem Wrong results for query with WHERE ... NOT IN when table has null values Key: HIVE-11192 URL: https://issues.apache.org/jira/browse/HIVE-11192 Project: Hive Issue Type: Bug Affects Versions: 1.1.0, 1.2.1 Environment: Hive on MR Reporter: Furcy Pin I tested this on a cdh5.4.2 cluster and locally on the release-1.2.1 branch:
```sql
DROP TABLE IF EXISTS test1;
DROP TABLE IF EXISTS test2;
CREATE TABLE test1 (col1 STRING);
INSERT INTO TABLE test1 VALUES (1), (2), (3), (4);
CREATE TABLE test2 (col1 STRING);
INSERT INTO TABLE test2 VALUES (1), (4), (NULL);
SELECT COUNT(1) FROM test1 T1 WHERE T1.col1 NOT IN (SELECT col1 FROM test2);
SELECT COUNT(1) FROM test1 T1 WHERE T1.col1 NOT IN (SELECT col1 FROM test2 WHERE col1 IS NOT NULL);
```
The first query returns 0 and the second returns 2. Obviously, the expected answer is always 2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
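The "Not A Problem" resolution follows from standard SQL three-valued logic: x NOT IN (a, b, NULL) expands to x <> a AND x <> b AND x <> NULL, and any comparison with NULL is UNKNOWN, so the predicate can never evaluate to TRUE once the subquery returns a NULL. A small sketch simulating that logic (a hypothetical helper for illustration, not Hive code, with UNKNOWN modeled as null):

```java
import java.util.Arrays;
import java.util.List;

public class NotInNulls {
    // Minimal model of SQL three-valued logic for "x NOT IN (list)".
    // Returns Boolean.TRUE, Boolean.FALSE, or null for UNKNOWN.
    static Boolean notIn(Integer x, List<Integer> values) {
        Boolean result = Boolean.TRUE; // identity for three-valued AND
        for (Integer v : values) {
            Boolean notEqual = (x == null || v == null) ? null : !x.equals(v);
            if (Boolean.FALSE.equals(notEqual)) {
                return Boolean.FALSE; // FALSE dominates the conjunction
            }
            if (notEqual == null) {
                result = null; // one UNKNOWN makes the whole AND UNKNOWN
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // With a NULL in the list, the predicate is UNKNOWN, so the row is filtered out.
        System.out.println(notIn(2, Arrays.asList(1, 4, null)));
        // Without the NULL, the same row passes.
        System.out.println(notIn(2, Arrays.asList(1, 4)));
    }
}
```

With (1), (4), (NULL) in test2, every row of test1 evaluates to either FALSE or UNKNOWN, so the first COUNT(1) is 0; filtering the NULLs out of the subquery restores the two-row answer.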
[jira] [Commented] (HIVE-11190) ConfVars.METASTORE_FILTER_HOOK in authorization V2 should not be hard code when the value is not default
[ https://issues.apache.org/jira/browse/HIVE-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617065#comment-14617065 ] Thejas M Nair commented on HIVE-11190: -- A nit - conf.get(ConfVars.METASTORE_FILTER_HOOK.name(), ConfVars.METASTORE_FILTER_HOOK.getDefaultValue()) can be replaced with the equivalent, more readable form supported by HiveConf - conf.getVar(ConfVars.METASTORE_FILTER_HOOK) ConfVars.METASTORE_FILTER_HOOK in authorization V2 should not be hard code when the value is not default Key: HIVE-11190 URL: https://issues.apache.org/jira/browse/HIVE-11190 Project: Hive Issue Type: Bug Reporter: Dapeng Sun Assignee: Dapeng Sun Attachments: HIVE-11190.001.patch ConfVars.METASTORE_FILTER_HOOK in authorization V2 should not be hardcoded when the value is not the default; hardcoding it causes users to fail to customize the METASTORE_FILTER_HOOK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11190) ConfVars.METASTORE_FILTER_HOOK in authorization V2 should not be hard code when the value is not default
[ https://issues.apache.org/jira/browse/HIVE-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617066#comment-14617066 ] Thejas M Nair commented on HIVE-11190: -- [~Ferd] Thanks for bringing this to my attention. Really appreciate it. (It's hard to keep track with the volume of changes in hive!) ConfVars.METASTORE_FILTER_HOOK in authorization V2 should not be hard code when the value is not default Key: HIVE-11190 URL: https://issues.apache.org/jira/browse/HIVE-11190 Project: Hive Issue Type: Bug Reporter: Dapeng Sun Assignee: Dapeng Sun Attachments: HIVE-11190.001.patch ConfVars.METASTORE_FILTER_HOOK in authorization V2 should not be hardcoded when the value is not the default; hardcoding it causes users to fail to customize the METASTORE_FILTER_HOOK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10553) Remove hardcoded Parquet references from SearchArgumentImpl
[ https://issues.apache.org/jira/browse/HIVE-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-10553: - Attachment: HIVE-10553.patch Rebased to current trunk. Remove hardcoded Parquet references from SearchArgumentImpl --- Key: HIVE-10553 URL: https://issues.apache.org/jira/browse/HIVE-10553 Project: Hive Issue Type: Sub-task Reporter: Gopal V Assignee: Owen O'Malley Attachments: HIVE-10553.patch, HIVE-10553.patch, HIVE-10553.patch, HIVE-10553.patch SARGs currently depend on Parquet code, which causes a tight coupling between parquet releases and storage-api versions. Move Parquet code out to its own RecordReader, similar to ORC's SargApplier implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11188) Make ORCFile's String Dictionary more efficient
[ https://issues.apache.org/jira/browse/HIVE-11188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617407#comment-14617407 ] Kevin Wilfong commented on HIVE-11188: -- We did something like this in DWRF https://github.com/facebook/hive-dwrf/commit/c9205c3894cb04453a790de28270d8118a87101d https://github.com/facebook/hive-dwrf/commit/1e920729a0b3a6887194a54a070e2471fac947d2 Make ORCFile's String Dictionary more efficient --- Key: HIVE-11188 URL: https://issues.apache.org/jira/browse/HIVE-11188 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 1.2.0, 1.1.0 Reporter: Zheng Shao Currently, ORCFile's String Dictionary uses StringRedBlackTree for adding/finding/sorting duplicate strings. When there are a large number of unique strings (let's say over 16K) and a large number of rows (let's say 1M), the binary search will take O(1M * log(16K)) time, which can be very long. Alternatively, ORCFile's String Dictionary can use a HashMap for adding/finding duplicate strings, and use quicksort at the end to produce a sorted order. In the same case above, the total time spent will be O(1M + 16K * log(16K)), which is much faster. When the number of unique strings is close to the number of rows (let's say both around 1M), ORC will automatically disable the dictionary encoding. The old approach will take O(1M * log(1M)), while the new approach will take O(1M), since we can skip the final quicksort if the dictionary encoding is disabled. So in either case, the new approach should be a win. Here is a PMP output based on ~600 traces (so 126 means 126/600 ~= 21% of total time). It's a query like INSERT OVERWRITE TABLE target SELECT * FROM src using hive-1.1.0-cdh-5.4.1. The target table is STORED AS ORC, and the src table is STORED AS RCFILE. 
126 org.apache.hadoop.hive.ql.io.orc.StringRedBlackTree.compareValue(StringRedBlackTree.java:67)
35 java.util.zip.Deflater.deflateBytes(Native Method)
26 org.apache.hadoop.hive.ql.io.orc.SerializationUtils.findClosestNumBits(SerializationUtils.java:218)
24 org.apache.hadoop.hive.serde2.lazy.LazyNonPrimitive.isNull(LazyNonPrimitive.java:63)
22 org.apache.hadoop.hive.serde2.lazy.LazyMap.parse(LazyMap.java:204)
22 org.apache.hadoop.hive.serde2.lazy.LazyLong.parseLong(LazyLong.java:116)
21 org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase$FieldInfo.uncheckedGetField(ColumnarStructBase.java:111)
19 org.apache.hadoop.hive.serde2.lazy.LazyPrimitive.hashCode(LazyPrimitive.java:57)
18 org.apache.hadoop.hive.ql.io.orc.RedBlackTree.getRight(RedBlackTree.java:99)
16 org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1932)
15 org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect(Native Method)
15 org.apache.hadoop.hive.ql.io.orc.WriterImpl$IntegerTreeWriter.write(WriterImpl.java:929)
12 org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:1607)
12 org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
11 org.apache.hadoop.hive.ql.io.orc.RedBlackTree.getLeft(RedBlackTree.java:92)
11 org.apache.hadoop.hive.ql.io.orc.DynamicIntArray.add(DynamicIntArray.java:105)
10 org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
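The proposal above replaces per-insert tree balancing with hash lookups plus a single sort at flush time. A minimal sketch of that scheme, with illustrative class and method names rather than ORC's actual WriterImpl API:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashDictionary {
    // O(1) amortized add via a HashMap, plus one O(k log k) sort over the
    // k unique keys at stripe-flush time.
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> byId = new ArrayList<>();

    // Returns the dictionary id for s, assigning a new id on first sight.
    int add(String s) {
        Integer id = ids.get(s);
        if (id == null) {
            id = byId.size();
            ids.put(s, id);
            byId.add(s);
        }
        return id;
    }

    int size() {
        return byId.size();
    }

    // Sorted view produced once at the end; this step can be skipped
    // entirely if the writer falls back to direct (non-dictionary) encoding.
    List<String> sortedKeys() {
        List<String> sorted = new ArrayList<>(byId);
        Collections.sort(sorted);
        return sorted;
    }
}
```

For 1M rows over 16K unique strings this does 1M constant-time lookups and one 16K-element sort, matching the O(1M + 16K * log(16K)) estimate in the description.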
[jira] [Updated] (HIVE-10927) Add number of HMS/HS2 connection metrics
[ https://issues.apache.org/jira/browse/HIVE-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-10927: - Attachment: HIVE-10927.2.patch Addressing review comments. Add number of HMS/HS2 connection metrics Key: HIVE-10927 URL: https://issues.apache.org/jira/browse/HIVE-10927 Project: Hive Issue Type: Sub-task Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 1.3.0 Attachments: HIVE-10927.2.patch, HIVE-10927.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11013) MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?)
[ https://issues.apache.org/jira/browse/HIVE-11013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11013: Summary: MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?) (was: LLAP: MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?)) MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?) -- Key: HIVE-11013 URL: https://issues.apache.org/jira/browse/HIVE-11013 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-11013.01.patch, HIVE-11013.patch Line numbers are shifted due to logging; the NPE is at {noformat} hashMapRowGetters = new ReusableGetAdaptor[mapJoinTables.length]; {noformat} So looks like mapJoinTables is null. I added logging to see if they could be set to null from cache, but that doesn't seem to be the case. Looks like initializeOp is not called. {noformat} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : null at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:428) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:87) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:656) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:659) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:755) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:315) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:278) at 
org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:271) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:257) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361) ... 17 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:339) ... 29 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11013) MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?)
[ https://issues.apache.org/jira/browse/HIVE-11013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11013: Issue Type: Bug (was: Sub-task) Parent: (was: HIVE-7926) MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?) -- Key: HIVE-11013 URL: https://issues.apache.org/jira/browse/HIVE-11013 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-11013.01.patch, HIVE-11013.patch Line numbers are shifted due to logging; the NPE is at {noformat} hashMapRowGetters = new ReusableGetAdaptor[mapJoinTables.length]; {noformat} So looks like mapJoinTables is null. I added logging to see if they could be set to null from cache, but that doesn't seem to be the case. Looks like initializeOp is not called. {noformat} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : null at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:428) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:87) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:656) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:659) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:755) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:315) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:278) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:271) at 
org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:257) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361) ... 17 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:339) ... 29 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11193) ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted
[ https://issues.apache.org/jira/browse/HIVE-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617389#comment-14617389 ] Wei Zheng commented on HIVE-11193: -- [~ashutoshc] Can you take a look? ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted --- Key: HIVE-11193 URL: https://issues.apache.org/jira/browse/HIVE-11193 Project: Hive Issue Type: Bug Components: Logical Optimizer Reporter: Wei Zheng Assignee: Wei Zheng Attachments: HIVE-11193.01.patch During Constant Propagation optimization, sometimes a node ends up being added to opToDelete list more than once. Later in ConstantPropagate transform, we try to delete that operator multiple times, which will cause SemanticException since the node has already been removed in an earlier pass. The data structure for storing opToDelete is List. We should use Set to avoid the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11196) Utilities.getPartitionDesc() should try to reuse TableDesc object
[ https://issues.apache.org/jira/browse/HIVE-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11196: - Summary: Utilities.getPartitionDesc() should try to reuse TableDesc object (was: Utilities.getPartitionDesc() Should try to reuse TableDesc object ) Utilities.getPartitionDesc() should try to reuse TableDesc object -- Key: HIVE-11196 URL: https://issues.apache.org/jira/browse/HIVE-11196 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Currently, Utilities.getPartitionDesc() creates a new PartitionDesc object, which in turn creates a new TableDesc object via Utilities.getTableDesc(part.getTable()) on every call. This value needs to be reused so that we can avoid the expense of creating a new descriptor object wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11118) Load data query should validate file formats with destination tables
[ https://issues.apache.org/jira/browse/HIVE-11118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617317#comment-14617317 ] Prasanth Jayachandran commented on HIVE-11118: -- [~sushanth] Thanks for looking into this. Yes, it's entirely possible that the table desc is ORC and the partition desc is of another format; Hive supports that. I missed that part when I put up the patch. The check should use the partition desc for partitioned tables instead of the table desc throughout. Can you please create a separate bug for it? I will address it shortly. Load data query should validate file formats with destination tables Key: HIVE-11118 URL: https://issues.apache.org/jira/browse/HIVE-11118 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11118.2.patch, HIVE-11118.3.patch, HIVE-11118.4.patch, HIVE-11118.patch Load data local inpath queries do not do any validation wrt file format. If the destination table is ORC and we try to load files that are not ORC, the load will succeed, but querying such tables will result in runtime exceptions. We can do some simple sanity checks to prevent loading files that do not match the destination table file format. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11195) Make auto_sortmerge_join_16.q result sequence more stable
[ https://issues.apache.org/jira/browse/HIVE-11195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong resolved HIVE-11195. Resolution: Fixed Make auto_sortmerge_join_16.q result sequence more stable - Key: HIVE-11195 URL: https://issues.apache.org/jira/browse/HIVE-11195 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Priority: Trivial adding -- SORT_QUERY_RESULTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11016) MiniTez mergejoin test fails with Tez input error (issue in merge join under certain conditions)
[ https://issues.apache.org/jira/browse/HIVE-11016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617492#comment-14617492 ] Vikram Dixit K commented on HIVE-11016: --- Commit required for branch 1.2, branch-1 and trunk. MiniTez mergejoin test fails with Tez input error (issue in merge join under certain conditions) Key: HIVE-11016 URL: https://issues.apache.org/jira/browse/HIVE-11016 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-11016.01.patch, HIVE-11016.patch Didn't spend a lot of time investigating, but from the code it looks like we shouldn't be calling it after false at least on this path (after false from next, pushRecord returns false, which causes fetchDone to be set for the tag; and fetchOneRow is not called if that is set; should be ok unless tags are messed up?) {noformat} 2015-06-15 17:28:17,272 ERROR [main]: SessionState (SessionState.java:printError(984)) - Vertex failed, vertexName=Reducer 2, vertexId=vertex_1434414363282_0002_17_03, diagnostics=[Task failed, taskId=task_1434414363282_0002_17_03_02, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task: attempt_1434414363282_0002_17_03_02_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. 
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:181) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:146) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:338) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. 
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:412) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:380) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:449) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:389) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:651) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:314) ... 15 more Caused by: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404) ... 20 more Caused by: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. at org.apache.tez.runtime.library.common.ValuesIterator.hasCompletedProcessing(ValuesIterator.java:223) at
[jira] [Updated] (HIVE-11193) ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted
[ https://issues.apache.org/jira/browse/HIVE-11193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-11193: - Attachment: HIVE-11193.01.patch Attach patch 1. ConstantPropagateProcCtx should use a Set instead of a List to hold operators to be deleted --- Key: HIVE-11193 URL: https://issues.apache.org/jira/browse/HIVE-11193 Project: Hive Issue Type: Bug Components: Logical Optimizer Reporter: Wei Zheng Assignee: Wei Zheng Attachments: HIVE-11193.01.patch During Constant Propagation optimization, sometimes a node ends up being added to opToDelete list more than once. Later in ConstantPropagate transform, we try to delete that operator multiple times, which will cause SemanticException since the node has already been removed in an earlier pass. The data structure for storing opToDelete is List. We should use Set to avoid the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
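The HIVE-11193 fix boils down to the difference between List.add and Set.add for repeated elements: a List happily holds the same operator twice, so the second removal pass fails, while a Set deduplicates silently (a LinkedHashSet additionally preserves insertion order, matching the List's traversal order). A toy illustration, where the string "SEL_5" is a stand-in for an Operator node rather than Hive's actual class:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;

public class OpToDelete {
    // Adds the same operator twice, as two optimizer rule firings might,
    // and reports how many entries the collection holds afterwards.
    static int afterDuplicateAdds(Collection<String> opToDelete) {
        opToDelete.add("SEL_5");
        opToDelete.add("SEL_5");
        return opToDelete.size();
    }

    public static void main(String[] args) {
        System.out.println(afterDuplicateAdds(new ArrayList<>()));     // duplicate kept: 2
        System.out.println(afterDuplicateAdds(new LinkedHashSet<>())); // duplicate dropped: 1
    }
}
```

With the List, the transform later attempts to delete SEL_5 twice and hits a SemanticException on the second pass; with the Set, each operator is deleted exactly once.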
[jira] [Updated] (HIVE-11196) Utilities.getPartitionDesc() should try to reuse TableDesc object
[ https://issues.apache.org/jira/browse/HIVE-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11196: - Attachment: HIVE-11196.1.patch cc-ing [~jpullokkaran] for review. Thanks Hari Utilities.getPartitionDesc() should try to reuse TableDesc object -- Key: HIVE-11196 URL: https://issues.apache.org/jira/browse/HIVE-11196 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11196.1.patch Currently, Utilities.getPartitionDesc() creates a new PartitionDesc object which in turn creates a new TableDesc object via Utilities.getTableDesc(part.getTable()) for every call. This value should be reused so that we can avoid the expense of creating new descriptor objects wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
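A minimal sketch of the reuse idea in HIVE-11196 (class and method names are illustrative stand-ins, not the real Utilities API): cache one descriptor per table and hand back the same instance on repeated calls instead of constructing a fresh one each time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DescCache {
    // Hypothetical stand-in for Hive's TableDesc; only identity matters here.
    static final class TableDesc {
        final String tableName;
        TableDesc(String tableName) { this.tableName = tableName; }
    }

    private final Map<String, TableDesc> cache = new ConcurrentHashMap<>();

    // Reuse one TableDesc per table rather than building a new object
    // on every getPartitionDesc()-style call.
    TableDesc getTableDesc(String tableName) {
        return cache.computeIfAbsent(tableName, TableDesc::new);
    }
}
```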
[jira] [Commented] (HIVE-11187) LLAP: clean up shuffle directories when DAG completes
[ https://issues.apache.org/jira/browse/HIVE-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617484#comment-14617484 ] Sergey Shelukhin commented on HIVE-11187: - [~gopalv] this is actually already done with a 5-minute delay, as far as I see. I started adding code but it's already there (QueryFileCleaner). Does it not work, or are there some other directories? LLAP: clean up shuffle directories when DAG completes - Key: HIVE-11187 URL: https://issues.apache.org/jira/browse/HIVE-11187 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11016) MiniTez mergejoin test fails with Tez input error (issue in merge join under certain conditions)
[ https://issues.apache.org/jira/browse/HIVE-11016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617487#comment-14617487 ] Vikram Dixit K commented on HIVE-11016: --- Nit: Can you remove the warning log in the fetch code. We don't want to do a lot of logging during the process phase. Otherwise LGTM +1. MiniTez mergejoin test fails with Tez input error (issue in merge join under certain conditions) Key: HIVE-11016 URL: https://issues.apache.org/jira/browse/HIVE-11016 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-11016.01.patch, HIVE-11016.patch Didn't spend a lot of time investigating, but from the code it looks like we shouldn't be calling it after false at least on this path (after false from next, pushRecord returns false, which causes fetchDone to be set for the tag; and fetchOneRow is not called if that is set; should be ok unless tags are messed up?) {noformat} 2015-06-15 17:28:17,272 ERROR [main]: SessionState (SessionState.java:printError(984)) - Vertex failed, vertexName=Reducer 2, vertexId=vertex_1434414363282_0002_17_03, diagnostics=[Task failed, taskId=task_1434414363282_0002_17_03_02, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task: attempt_1434414363282_0002_17_03_02_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. 
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:181) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:146) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:338) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. 
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:412) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:380) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:449) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:389) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:651) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:314) ... 15 more Caused by: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404) ... 20 more Caused by: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. at
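The IOException in the trace above documents an iterator contract: once the source has reported exhaustion, it must not be advanced again. The fetchDone guard described in the issue can be sketched as follows (simplified stand-ins, not the actual ReduceRecordSource code):

```java
import java.util.Iterator;

public class GuardedSource {
    private final Iterator<String> rows;
    private boolean fetchDone;   // set once the source is exhausted

    GuardedSource(Iterator<String> rows) { this.rows = rows; }

    // Mimics the fetchDone guard: after the source reports exhaustion
    // once, further fetches are no-ops rather than touching the
    // underlying iterator again (which would violate its contract).
    String fetchOneRow() {
        if (fetchDone) {
            return null;             // never re-enter an exhausted source
        }
        if (!rows.hasNext()) {
            fetchDone = true;
            return null;
        }
        return rows.next();
    }
}
```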
[jira] [Commented] (HIVE-11130) Refactoring the code so that HiveTxnManager interface will support lock/unlock table/database object
[ https://issues.apache.org/jira/browse/HIVE-11130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616823#comment-14616823 ] Aihua Xu commented on HIVE-11130: - Can you take a look at the .2.patch? Moved the check logic to HiveTxnManagerImpl. Refactoring the code so that HiveTxnManager interface will support lock/unlock table/database object Key: HIVE-11130 URL: https://issues.apache.org/jira/browse/HIVE-11130 Project: Hive Issue Type: Sub-task Components: Locking Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-11130.2.patch, HIVE-11130.patch This is just a refactoring step which keeps the current logic, but it exposes the explicit lock/unlock table and database in HiveTxnManager which should be implemented differently by the subclasses ( currently it's not. e.g., for ZooKeeper implementation, we should lock table and database when we try to lock the table). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
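The direction of the refactoring can be sketched like this (all names are illustrative, not the actual HiveTxnManager API): shared logic lives in one base class, and a ZooKeeper-style subclass overrides table locking so that locking a table also takes the enclosing database lock, as the description suggests.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class TxnLocking {
    interface TxnManager {
        void lockTable(String db, String table);
        void unlockTable(String db, String table);
    }

    // Common behavior kept in one place (analogous to moving the check
    // logic into HiveTxnManagerImpl); subclasses override only what differs.
    static class BaseTxnManager implements TxnManager {
        final Set<String> held = new LinkedHashSet<>();
        public void lockTable(String db, String table)   { held.add(db + "." + table); }
        public void unlockTable(String db, String table) { held.remove(db + "." + table); }
    }

    // A ZooKeeper-style manager that must also lock the enclosing
    // database whenever a table is locked.
    static class ZkStyleTxnManager extends BaseTxnManager {
        @Override public void lockTable(String db, String table) {
            held.add(db);            // take the database lock first
            super.lockTable(db, table);
        }
    }
}
```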
[jira] [Resolved] (HIVE-10797) Simplify the test for vectorized input
[ https://issues.apache.org/jira/browse/HIVE-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HIVE-10797. -- Resolution: Won't Fix I think we can ignore the acid stuff for now. Simplify the test for vectorized input -- Key: HIVE-10797 URL: https://issues.apache.org/jira/browse/HIVE-10797 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley The call to Utilities.isVectorMode should be simplified for the readers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11124) Move OrcRecordUpdater.getAcidEventFields to RecordReaderFactory
[ https://issues.apache.org/jira/browse/HIVE-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617607#comment-14617607 ] Hive QA commented on HIVE-11124: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12742057/HIVE-11124.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4524/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4524/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4524/ Messages: {noformat} This message was trimmed, see log for full details [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ spark-client --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] Copying 1 resource [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client --- [INFO] Tests are skipped. [INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector expression test code [INFO] Executed tasks [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added. [INFO] [INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec ---
[jira] [Commented] (HIVE-11153) LLAP: SIGSEGV in Off-heap decompression routines
[ https://issues.apache.org/jira/browse/HIVE-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617659#comment-14617659 ] Sergey Shelukhin commented on HIVE-11153: - That is probably https://issues.apache.org/jira/browse/HADOOP-10027 LLAP: SIGSEGV in Off-heap decompression routines Key: HIVE-11153 URL: https://issues.apache.org/jira/browse/HIVE-11153 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: llap Reporter: Gopal V Assignee: Sergey Shelukhin Attachments: llap-cn105-coredump.log LLAP started with {code} ./dist/hive/bin/hive --service llap --cache 57344m --executors 16 --size 131072m --xmx 65536m --name llap0 --loglevel WARN --instances 1 {code} Running date_dim filters from query27 with the large cache enabled. {code} R13=0x7f2ca9d15ca0 is pointing into the stack for thread: 0x7f2d4cece800 R14=0x7f3d7e2bfc00: offset 0xf9dc00 in /usr/jdk64/jdk1.8.0_40/jre/lib/amd64/server/libjvm.so at 0x7f3d7d322000 R15=0x7f3d7e2bb6a0: offset 0xf996a0 in /usr/jdk64/jdk1.8.0_40/jre/lib/amd64/server/libjvm.so at 0x7f3d7d322000 Stack: [0x7f2ca9c17000,0x7f2ca9d18000], sp=0x7f2ca9d15ca0, free space=1019k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x6daca3] jni_GetStaticObjectField+0xc3 C [libhadoop.so.1.0.0+0x100e9] Java_org_apache_hadoop_io_compress_zlib_ZlibDecompressor_inflateBytesDirect+0x49 C 0x7f2ca9d15e60 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) j org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect()I+0 j org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateDirect(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I+93 j org.apache.hadoop.io.compress.zlib.ZlibDecompressor$ZlibDirectDecompressor.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+72 j org.apache.hadoop.hive.shims.ZeroCopyShims$DirectDecompressorAdapter.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+6 j 
org.apache.hadoop.hive.ql.io.orc.ZlibCodec.directDecompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+15 j org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+17 j org.apache.hadoop.hive.ql.io.orc.InStream.decompressChunk(Ljava/nio/ByteBuffer;Lorg/apache/hadoop/hive/ql/io/orc/CompressionCodec;Ljava/nio/ByteBuffer;)V+14 j org.apache.hadoop.hive.ql.io.orc.InStream.readEncodedStream(JJLorg/apache/hadoop/hive/common/DiskRangeList;JJLorg/apache/hadoop/hive/shims/HadoopShims$ZeroCopyReaderShim;Lorg/apache/hadoop/hive/ql/io/o rc/CompressionCodec;ILorg/apache/hadoop/hive/llap/io/api/cache/LowLevelCache;Lorg/apache/hadoop/hive/llap/io/api/EncodedColumnBatch$StreamBuffer;JJLorg/apache/hadoop/hive/llap/counters/LowLevelCacheCounte rs;)Lorg/apache/hadoop/hive/common/DiskRangeList;+376 j org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(ILorg/apache/hadoop/hive/ql/io/orc/StripeInformation;[Lorg/apache/hadoop/hive/ql/io/orc/OrcProto$RowIndex;Ljava/util/List;Ljava/uti l/List;[Z[[Z)V+2079 j org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal()Ljava/lang/Void;+1244 j org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal()Ljava/lang/Object;+1 j org.apache.hadoop.hive.common.CallableWithNdc.call()Ljava/lang/Object;+8 j java.util.concurrent.FutureTask.run()V+42 j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95 j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 j java.lang.Thread.run()V+11 v ~StubRoutines::call_stub {code} Always reproducible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
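For reference, the inflate step that crashed in native code above has a pure-Java equivalent in java.util.zip. This round-trip sketch shows the same zlib compress/decompress cycle on heap arrays; Hadoop's ZlibDecompressor in the trace instead inflates natively into direct (off-heap) ByteBuffers, which is the path where the SIGSEGV arises.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ZlibRoundTrip {
    // Compress with the JDK's zlib binding on heap arrays.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[512];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflate back into a buffer of the known original length.
    static byte[] decompress(byte[] compressed, int originalLen) {
        try {
            Inflater inflater = new Inflater();
            inflater.setInput(compressed);
            byte[] out = new byte[originalLen];
            inflater.inflate(out);
            inflater.end();
            return out;
        } catch (java.util.zip.DataFormatException e) {
            throw new IllegalStateException("corrupt zlib stream", e);
        }
    }
}
```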
[jira] [Updated] (HIVE-11197) While extracting join conditions follow Hive rules for type conversion instead of Calcite
[ https://issues.apache.org/jira/browse/HIVE-11197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11197: Attachment: HIVE-11197.patch While extracting join conditions follow Hive rules for type conversion instead of Calcite - Key: HIVE-11197 URL: https://issues.apache.org/jira/browse/HIVE-11197 Project: Hive Issue Type: Bug Components: CBO Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-11197.patch, HIVE-11197.patch Calcite strict type system throws exception in those cases, which are legal in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11137) In DateWritable remove the use of LazyBinaryUtils
[ https://issues.apache.org/jira/browse/HIVE-11137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617634#comment-14617634 ] Hive QA commented on HIVE-11137: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744094/HIVE-11137.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4526/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4526/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4526/ Messages: {noformat} This message was trimmed, see log for full details [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ spark-client --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] Copying 1 resource [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client --- [INFO] Tests are skipped. [INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector expression test code [INFO] Executed tasks [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added. [INFO] [INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec ---
[jira] [Commented] (HIVE-11013) MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?)
[ https://issues.apache.org/jira/browse/HIVE-11013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617652#comment-14617652 ] Sergey Shelukhin commented on HIVE-11013: - Will fix momentarily MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?) -- Key: HIVE-11013 URL: https://issues.apache.org/jira/browse/HIVE-11013 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 2.0.0 Attachments: HIVE-11013.01.patch, HIVE-11013.patch Line numbers are shifted due to logging; the NPE is at {noformat} hashMapRowGetters = new ReusableGetAdaptor[mapJoinTables.length]; {noformat} So looks like mapJoinTables is null. I added logging to see if they could be set to null from cache, but that doesn't seem to be the case. Looks like initializeOp is not called. {noformat} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : null at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:428) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:87) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:656) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:659) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:755) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:315) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:278) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:271) 
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:257) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361) ... 17 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:339) ... 29 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
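The failure mode described above (an NPE deep in process() because initializeOp was apparently never called) is easier to diagnose with a fail-fast guard. A hypothetical sketch, not the actual MapJoinOperator code:

```java
import java.util.Objects;

public class InitGuard {
    private Object[] mapJoinTables;   // populated by initializeOp()
    private boolean initialized;

    void initializeOp() {
        mapJoinTables = new Object[4];
        initialized = true;
    }

    // Failing fast with a descriptive message makes "initializeOp was
    // never called" obvious at the call site, instead of surfacing as
    // a bare NullPointerException deep inside process().
    int process() {
        if (!initialized) {
            throw new IllegalStateException("process() called before initializeOp()");
        }
        return Objects.requireNonNull(mapJoinTables, "mapJoinTables").length;
    }
}
```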
[jira] [Commented] (HIVE-11181) Update people change for new PMC members
[ https://issues.apache.org/jira/browse/HIVE-11181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617706#comment-14617706 ] Ferdinand Xu commented on HIVE-11181: - LGTM +1 Update people change for new PMC members Key: HIVE-11181 URL: https://issues.apache.org/jira/browse/HIVE-11181 Project: Hive Issue Type: Task Components: Website Reporter: Chao Sun Assignee: Chao Sun Attachments: HIVE-11181.patch As suggested in the title. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11016) MiniTez mergejoin test fails with Tez input error (issue in merge join under certain conditions)
[ https://issues.apache.org/jira/browse/HIVE-11016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617716#comment-14617716 ] Sergey Shelukhin commented on HIVE-11016: - Warning is only emitted if something is wrong, so I think it's reasonable to keep it MiniTez mergejoin test fails with Tez input error (issue in merge join under certain conditions) Key: HIVE-11016 URL: https://issues.apache.org/jira/browse/HIVE-11016 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-11016.01.patch, HIVE-11016.patch Didn't spend a lot of time investigating, but from the code it looks like we shouldn't be calling it after false at least on this path (after false from next, pushRecord returns false, which causes fetchDone to be set for the tag; and fetchOneRow is not called if that is set; should be ok unless tags are messed up?) {noformat} 2015-06-15 17:28:17,272 ERROR [main]: SessionState (SessionState.java:printError(984)) - Vertex failed, vertexName=Reducer 2, vertexId=vertex_1434414363282_0002_17_03, diagnostics=[Task failed, taskId=task_1434414363282_0002_17_03_02, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task: attempt_1434414363282_0002_17_03_02_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. 
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:181) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:146) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:338) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. 
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:412) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:380) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:449) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:389) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:651) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:314) ... 15 more Caused by: java.lang.RuntimeException: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404) ... 20 more Caused by: java.io.IOException: Please check if you are invoking moveToNext() even after it returned false. at org.apache.tez.runtime.library.common.ValuesIterator.hasCompletedProcessing(ValuesIterator.java:223) at
[jira] [Resolved] (HIVE-10798) Remove dependence on VectorizedBatchUtil from VectorizedOrcAcidRowReader
[ https://issues.apache.org/jira/browse/HIVE-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HIVE-10798. -- Resolution: Won't Fix Release Note: Since the InputFormat stuff will stay in Hive, I think we can ignore this for now. Remove dependence on VectorizedBatchUtil from VectorizedOrcAcidRowReader Key: HIVE-10798 URL: https://issues.apache.org/jira/browse/HIVE-10798 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley VectorizedBatchUtil has a lot of dependences that Orc should avoid and the code should be refactored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11137) In DateWritable remove the use of LazyBinaryUtils
[ https://issues.apache.org/jira/browse/HIVE-11137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617592#comment-14617592 ] Owen O'Malley commented on HIVE-11137: -- I'm confused. Do you have a patch for this? In DateWritable remove the use of LazyBinaryUtils - Key: HIVE-11137 URL: https://issues.apache.org/jira/browse/HIVE-11137 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Currently the DateWritable class uses LazyBinaryUtils, which has a lot of dependencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10799) Refactor the SearchArgumentFactory to remove the dependence on ExprNodeGenericFuncDesc
[ https://issues.apache.org/jira/browse/HIVE-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617591#comment-14617591 ] Hive QA commented on HIVE-10799: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744093/HIVE-10799.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4523/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4523/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4523/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4523/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ 
-z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 63b31c9 HIVE-11013 : MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?) (Sergey Shelukhin, reviewed by Vikram Dixit K) + git clean -f -d Removing common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsConstant.java Removing common/src/java/org/apache/hadoop/hive/common/metrics/common/MetricsVariable.java + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at 63b31c9 HIVE-11013 : MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?) (Sergey Shelukhin, reviewed by Vikram Dixit K) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12744093 - PreCommit-HIVE-TRUNK-Build Refactor the SearchArgumentFactory to remove the dependence on ExprNodeGenericFuncDesc -- Key: HIVE-10799 URL: https://issues.apache.org/jira/browse/HIVE-10799 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-10799.patch SearchArgumentFactory and SearchArgumentImpl are high level and shouldn't depend on the internals of Hive's AST model. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11137) In DateWritable remove the use of LazyBinaryUtils
[ https://issues.apache.org/jira/browse/HIVE-11137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-11137: - Attachment: HIVE-11137.patch Here is my approach. In DateWritable remove the use of LazyBinaryUtils - Key: HIVE-11137 URL: https://issues.apache.org/jira/browse/HIVE-11137 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-11137.patch Currently the DateWritable class uses LazyBinaryUtils, which has a lot of dependencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
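The idea behind removing the LazyBinaryUtils dependency can be illustrated simply: a date only needs to be serialized as a days-since-epoch integer, which plain `java.io` streams handle on their own. Below is a minimal, hypothetical sketch of that idea — `SimpleDateWritable` is an illustrative stand-in, not Hive's actual `DateWritable`, and it uses a fixed-width int where Hive may use a variable-length encoding:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.time.LocalDate;

// Illustrative stand-in for DateWritable: the date is stored as days since
// the Unix epoch, so serialization needs nothing beyond plain DataOutput --
// no LazyBinaryUtils (and none of its transitive dependencies).
class SimpleDateWritable {
    private final int daysSinceEpoch;

    SimpleDateWritable(LocalDate date) {
        this.daysSinceEpoch = (int) date.toEpochDay();
    }

    byte[] toBytes() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(daysSinceEpoch);  // fixed-width int; Hive may use a vint
        } catch (IOException e) {
            throw new UncheckedIOException(e);  // cannot happen for an in-memory buffer
        }
        return buf.toByteArray();
    }

    static LocalDate fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return LocalDate.ofEpochDay(in.readInt());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch only shows why the dependency is unnecessary for a days-since-epoch value; the actual patch presumably inlines or replaces the specific vint helpers that DateWritable used.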
[jira] [Commented] (HIVE-11013) MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?)
[ https://issues.apache.org/jira/browse/HIVE-11013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617647#comment-14617647 ] Szehon Ho commented on HIVE-11013: -- Hi, this is causing a compilation failure; we should have waited for the precommit tests to run before committing. The method is defined twice in the class. Do you want to make a quick fix, or should we revert the change? MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?) -- Key: HIVE-11013 URL: https://issues.apache.org/jira/browse/HIVE-11013 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 2.0.0 Attachments: HIVE-11013.01.patch, HIVE-11013.patch Line numbers are shifted due to logging; the NPE is at {noformat} hashMapRowGetters = new ReusableGetAdaptor[mapJoinTables.length]; {noformat} So it looks like mapJoinTables is null. I added logging to see if they could be set to null from cache, but that doesn't seem to be the case. It looks like initializeOp is not called. 
{noformat} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : null at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:428) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:87) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:656) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:659) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:755) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:315) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:278) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:271) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:257) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361) ... 17 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:339) ... 29 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11013) MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?)
[ https://issues.apache.org/jira/browse/HIVE-11013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617670#comment-14617670 ] Szehon Ho commented on HIVE-11013: -- Thanks.
[jira] [Commented] (HIVE-11131) Get row information on DataWritableWriter once for better writing performance
[ https://issues.apache.org/jira/browse/HIVE-11131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617713#comment-14617713 ] Ferdinand Xu commented on HIVE-11131: - LGTM +1 Get row information on DataWritableWriter once for better writing performance - Key: HIVE-11131 URL: https://issues.apache.org/jira/browse/HIVE-11131 Project: Hive Issue Type: Sub-task Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-11131.2.patch, HIVE-11131.3.patch, HIVE-11131.4.patch DataWritableWriter is a class used to write Hive records to Parquet files. This class is getting all the information about how to parse a record, such as schema and object inspector, every time a record is written (or write() is called). We can make this class perform better by initializing some writers per data type once, and saving all object inspectors on each writer. The class expects that the next records written will have the same object inspectors and schema, so there is no need to have conditions for that. When a new schema is written, DataWritableWriter is created again by Parquet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
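The optimization described above — resolve per-type writers and object inspectors once at construction, then reuse them on every `write()` call — can be sketched as follows. This is a simplified, hypothetical illustration, not the real `DataWritableWriter`; the "object inspectors" are reduced to plain conversion functions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Simplified illustration of the caching idea: build one writer per
// column up front instead of re-deriving type information per record.
class CachedRecordWriter {
    // Each column writer captures its conversion logic (the stand-in for
    // an object inspector) at construction time.
    private final List<Function<Object, String>> columnWriters = new ArrayList<>();

    CachedRecordWriter(List<String> columnTypes) {
        // Done ONCE, not per record: pick the writer for each column type.
        for (String type : columnTypes) {
            switch (type) {
                case "int":    columnWriters.add(v -> "i:" + v); break;
                case "string": columnWriters.add(v -> "s:" + v); break;
                default: throw new IllegalArgumentException("unknown type: " + type);
            }
        }
    }

    // Hot path: no schema inspection, no type conditionals -- just
    // dispatch to the cached per-column writers.
    List<String> write(List<?> record) {
        List<String> out = new ArrayList<>(record.size());
        for (int i = 0; i < record.size(); i++) {
            out.add(columnWriters.get(i).apply(record.get(i)));
        }
        return out;
    }
}
```

This pattern is safe under exactly the assumption the email states: all records written through one instance share the same schema and inspectors, and a schema change means a new writer is constructed.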
[jira] [Commented] (HIVE-10927) Add number of HMS/HS2 connection metrics
[ https://issues.apache.org/jira/browse/HIVE-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617581#comment-14617581 ] Hive QA commented on HIVE-10927: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744077/HIVE-10927.2.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4522/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4522/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4522/ Messages: {noformat} This message was trimmed, see log for full details [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to /data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector expression test code [INFO] Executed tasks [INFO] [INFO] --- 
build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added. [INFO] [INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec --- [INFO] ANTLR: Processing source directory /data/hive-ptest/working/apache-github-source-source/ql/src/java ANTLR Parser Generator Version 3.4 org/apache/hadoop/hive/ql/parse/HiveLexer.g
[jira] [Updated] (HIVE-10799) Refactor the SearchArgumentFactory to remove the dependence on ExprNodeGenericFuncDesc
[ https://issues.apache.org/jira/browse/HIVE-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-10799: - Attachment: HIVE-10799.patch This patch:
* splits SearchArgumentFactory into ConvertAstToSearchArg
* ConvertAstToSearchArg takes the Hive AST and uses the SearchArgument builder to make the SearchArgument
* We normalize the PredicateLeaf argument types.
* Add a bunch of test cases for ConvertAstToSearchArg
* The SearchArgument builder now requires the type for each predicate leaf.
Refactor the SearchArgumentFactory to remove the dependence on ExprNodeGenericFuncDesc -- Key: HIVE-10799 URL: https://issues.apache.org/jira/browse/HIVE-10799 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-10799.patch SearchArgumentFactory and SearchArgumentImpl are high level and shouldn't depend on the internals of Hive's AST model.
[jira] [Updated] (HIVE-11197) While extracting join conditions follow Hive rules for type conversion instead of Calcite
[ https://issues.apache.org/jira/browse/HIVE-11197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11197: Attachment: HIVE-11197.patch While extracting join conditions follow Hive rules for type conversion instead of Calcite - Key: HIVE-11197 URL: https://issues.apache.org/jira/browse/HIVE-11197 Project: Hive Issue Type: Bug Components: CBO Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-11197.patch Calcite strict type system throws exception in those cases, which are legal in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11072) Add data validation between Hive metastore upgrades tests
[ https://issues.apache.org/jira/browse/HIVE-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-11072: --- Assignee: Naveen Gangam (was: Sergio Peña) Add data validation between Hive metastore upgrades tests - Key: HIVE-11072 URL: https://issues.apache.org/jira/browse/HIVE-11072 Project: Hive Issue Type: Task Components: Tests Reporter: Sergio Peña Assignee: Naveen Gangam An existing Hive metastore upgrade test is running on Hive jenkins. However, these scripts only test the database schema upgrade; they do not validate the data between upgrades. We should validate data between metastore version upgrades. With data validation, we can ensure that data won't be damaged or corrupted when upgrading the Hive metastore.
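The validation step proposed above could be as simple as snapshotting metastore contents before the schema upgrade and verifying that every row survives unchanged afterwards. A hedged sketch of that check — the class, method names, and string-keyed row representation here are all hypothetical stand-ins, not part of the actual test scripts:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of post-upgrade data validation: compare a
// pre-upgrade snapshot of metastore rows against the post-upgrade state.
class UpgradeDataValidator {
    // Copy taken before the schema upgrade scripts run.
    static Map<String, String> snapshot(Map<String, String> metastoreRows) {
        return new LinkedHashMap<>(metastoreRows);
    }

    // Every pre-upgrade row must still exist, unchanged, after the upgrade.
    static boolean validate(Map<String, String> before, Map<String, String> after) {
        for (Map.Entry<String, String> e : before.entrySet()) {
            if (!e.getValue().equals(after.get(e.getKey()))) {
                return false;  // row missing or damaged by the upgrade
            }
        }
        return true;
    }
}
```

In a real harness the snapshot would come from queries against the metastore database (e.g. table and partition rows) rather than an in-memory map, but the comparison logic is the same.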
[jira] [Resolved] (HIVE-7741) Don't synchronize WriterImpl.addRow() when dynamic.partition is enabled
[ https://issues.apache.org/jira/browse/HIVE-7741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran resolved HIVE-7741. - Resolution: Duplicate Fixed by HIVE-10191. Don't synchronize WriterImpl.addRow() when dynamic.partition is enabled --- Key: HIVE-7741 URL: https://issues.apache.org/jira/browse/HIVE-7741 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.1 Environment: Loading into orc Reporter: Mostafa Mokhtar Assignee: Prasanth Jayachandran Labels: performance When loading into an un-partitioned ORC table, the WriterImpl$StructTreeWriter.write method is synchronized. When hive.optimize.sort.dynamic.partition is enabled, the current thread will be the only writer and the synchronization is not needed. Also, checking memory per row is overkill; this can be done per 1K rows or so.
{code}
public void addRow(Object row) throws IOException {
  synchronized (this) {
    treeWriter.write(row);
    rowsInStripe += 1;
    if (buildIndex) {
      rowsInIndex += 1;
      if (rowsInIndex >= rowIndexStride) {
        createRowIndexEntry();
      }
    }
  }
  memoryManager.addedRow();
}
{code}
This can improve ORC load performance by 7%.
{code}
Stack Trace                                                                  Sample Count  Percentage(%)
WriterImpl.addRow(Object)                                                    5,852         65.782
WriterImpl$StructTreeWriter.write(Object)                                    5,163         58.037
MemoryManager.addedRow()                                                       666          7.487
MemoryManager.notifyWriters()                                                  648          7.284
WriterImpl.checkMemory(double)                                                 645          7.25
WriterImpl.flushStripe()                                                       643          7.228
WriterImpl$StructTreeWriter.writeStripe(OrcProto$StripeFooter$Builder, int)    584          6.565
{code}
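The two changes the report suggests — skip the lock when a single writer thread is guaranteed, and amortize the memory check over a batch of rows instead of doing it per row — can be sketched as follows. This is a simplified illustration under those assumptions, not the actual WriterImpl code; the field names and batch size are hypothetical:

```java
// Simplified sketch: no per-row synchronization when a single writer is
// guaranteed (the sort.dynamic.partition case), and memory accounting
// amortized over a batch of rows rather than done on every addRow() call.
class UnsynchronizedWriterSketch {
    private static final int MEMORY_CHECK_BATCH = 1024;  // "per 1K rows or so"

    private final boolean singleWriter;  // true when only one thread writes
    private long rowsInStripe = 0;
    private long rowsSinceCheck = 0;
    long memoryChecks = 0;               // visible for the usage example

    UnsynchronizedWriterSketch(boolean singleWriter) {
        this.singleWriter = singleWriter;
    }

    void addRow(Object row) {
        if (singleWriter) {
            writeRow(row);            // no lock needed: one writer thread
        } else {
            synchronized (this) {     // fall back to the original behavior
                writeRow(row);
            }
        }
        // Amortized memory accounting: once per batch instead of per row.
        if (++rowsSinceCheck >= MEMORY_CHECK_BATCH) {
            rowsSinceCheck = 0;
            memoryChecks++;           // stand-in for memoryManager.addedRow()
        }
    }

    private void writeRow(Object row) {
        rowsInStripe++;               // stand-in for treeWriter.write(row)
    }
}
```

The trade-off in the batched check is that memory pressure is noticed up to a batch late, which is acceptable when per-row overhead dominates the profile as shown above.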
[jira] [Commented] (HIVE-11197) While extracting join conditions follow Hive rules for type conversion instead of Calcite
[ https://issues.apache.org/jira/browse/HIVE-11197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617624#comment-14617624 ] Hive QA commented on HIVE-11197: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744096/HIVE-11197.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4525/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4525/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4525/ Messages: {noformat} This message was trimmed, see log for full details [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ spark-client --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] Copying 1 resource [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client --- [INFO] Tests are skipped. [INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector expression test code [INFO] Executed tasks [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added. [INFO] [INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec ---
[jira] [Commented] (HIVE-11153) LLAP: SIGSEGV in Off-heap decompression routines
[ https://issues.apache.org/jira/browse/HIVE-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617640#comment-14617640 ] Sergey Shelukhin commented on HIVE-11153: - Looking at the source, it's this line that produces the error: {noformat} jobject clazz = (*env)->GetStaticObjectField(env, this, ZlibDecompressor_clazz); {noformat} I don't think this is related to buffers... probably some issue between the native libs and Hadoop. Some other people report a similar issue as well: http://stackoverflow.com/questions/27807444/java-fatal-error-when-using-hadoop-snappy LLAP: SIGSEGV in Off-heap decompression routines Key: HIVE-11153 URL: https://issues.apache.org/jira/browse/HIVE-11153 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: llap Reporter: Gopal V Assignee: Sergey Shelukhin Attachments: llap-cn105-coredump.log LLAP started with {code} ./dist/hive/bin/hive --service llap --cache 57344m --executors 16 --size 131072m --xmx 65536m --name llap0 --loglevel WARN --instances 1 {code} Running date_dim filters from query27 with the large cache enabled. 
{code} R13=0x7f2ca9d15ca0 is pointing into the stack for thread: 0x7f2d4cece800 R14=0x7f3d7e2bfc00: offset 0xf9dc00 in /usr/jdk64/jdk1.8.0_40/jre/lib/amd64/server/libjvm.so at 0x7f3d7d322000 R15=0x7f3d7e2bb6a0: offset 0xf996a0 in /usr/jdk64/jdk1.8.0_40/jre/lib/amd64/server/libjvm.so at 0x7f3d7d322000 Stack: [0x7f2ca9c17000,0x7f2ca9d18000], sp=0x7f2ca9d15ca0, free space=1019k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x6daca3] jni_GetStaticObjectField+0xc3 C [libhadoop.so.1.0.0+0x100e9] Java_org_apache_hadoop_io_compress_zlib_ZlibDecompressor_inflateBytesDirect+0x49 C 0x7f2ca9d15e60 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) j org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect()I+0 j org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateDirect(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I+93 j org.apache.hadoop.io.compress.zlib.ZlibDecompressor$ZlibDirectDecompressor.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+72 j org.apache.hadoop.hive.shims.ZeroCopyShims$DirectDecompressorAdapter.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+6 j org.apache.hadoop.hive.ql.io.orc.ZlibCodec.directDecompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+15 j org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+17 j org.apache.hadoop.hive.ql.io.orc.InStream.decompressChunk(Ljava/nio/ByteBuffer;Lorg/apache/hadoop/hive/ql/io/orc/CompressionCodec;Ljava/nio/ByteBuffer;)V+14 j org.apache.hadoop.hive.ql.io.orc.InStream.readEncodedStream(JJLorg/apache/hadoop/hive/common/DiskRangeList;JJLorg/apache/hadoop/hive/shims/HadoopShims$ZeroCopyReaderShim;Lorg/apache/hadoop/hive/ql/io/o rc/CompressionCodec;ILorg/apache/hadoop/hive/llap/io/api/cache/LowLevelCache;Lorg/apache/hadoop/hive/llap/io/api/EncodedColumnBatch$StreamBuffer;JJLorg/apache/hadoop/hive/llap/counters/LowLevelCacheCounte rs;)Lorg/apache/hadoop/hive/common/DiskRangeList;+376 j 
org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(ILorg/apache/hadoop/hive/ql/io/orc/StripeInformation;[Lorg/apache/hadoop/hive/ql/io/orc/OrcProto$RowIndex;Ljava/util/List;Ljava/uti l/List;[Z[[Z)V+2079 j org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal()Ljava/lang/Void;+1244 j org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal()Ljava/lang/Object;+1 j org.apache.hadoop.hive.common.CallableWithNdc.call()Ljava/lang/Object;+8 j java.util.concurrent.FutureTask.run()V+42 j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95 j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 j java.lang.Thread.run()V+11 v ~StubRoutines::call_stub {code} Always reproducible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-11153) LLAP: SIGSEGV in Off-heap decompression routines
[ https://issues.apache.org/jira/browse/HIVE-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617659#comment-14617659 ] Sergey Shelukhin edited comment on HIVE-11153 at 7/7/15 11:37 PM: -- That is probably HADOOP-10027 was (Author: sershe): That is probably https://issues.apache.org/jira/browse/HADOOP-10027
[jira] [Commented] (HIVE-11013) MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?)
[ https://issues.apache.org/jira/browse/HIVE-11013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617657#comment-14617657 ] Sergey Shelukhin commented on HIVE-11013: - Fixed MiniTez tez_join_hash test on the branch fails with NPE (initializeOp not called?) -- Key: HIVE-11013 URL: https://issues.apache.org/jira/browse/HIVE-11013 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 2.0.0 Attachments: HIVE-11013.01.patch, HIVE-11013.patch Line numbers are shifted due to logging; the NPE is at {noformat} hashMapRowGetters = new ReusableGetAdaptor[mapJoinTables.length]; {noformat} So looks like mapJoinTables is null. I added logging to see if they could be set to null from cache, but that doesn't seem to be the case. Looks like initializeOp is not called. {noformat} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : null at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:428) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:87) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:656) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:659) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:755) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:315) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:278) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:271) at 
org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:257) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:361) ... 17 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:339) ... 29 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
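The stack above reduces to an initialization-order hazard. Below is a minimal Java sketch (hypothetical names, not Hive's actual operator classes) of that failure mode: if a setup hook like initializeOp() is never invoked, the array field stays null, and the first dereference of its length throws the NPE, mirroring the crash site hashMapRowGetters = new ReusableGetAdaptor[mapJoinTables.length].

```java
// Hedged sketch (hypothetical names): illustrates only the failure mode
// described above, not Hive's MapJoinOperator itself.
public class InitOrderSketch {
    private Object[] mapJoinTables;     // stays null until initializeOp() runs
    private Object[] hashMapRowGetters;

    public void initializeOp() {        // the hook the bug report says was skipped
        mapJoinTables = new Object[2];
    }

    public void process() {
        // Mirrors the NPE site: reading .length on a null field.
        hashMapRowGetters = new Object[mapJoinTables.length];
    }

    // Returns true when process() succeeds, false when it hits the NPE.
    public static boolean tryProcess(boolean callInit) {
        InitOrderSketch op = new InitOrderSketch();
        if (callInit) {
            op.initializeOp();
        }
        try {
            op.process();
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }
}
```

The real fix is ensuring the initialization hook actually runs, not guarding every use site; the sketch just makes the ordering dependency visible.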
[jira] [Updated] (HIVE-10927) Add number of HMS/HS2 connection metrics
[ https://issues.apache.org/jira/browse/HIVE-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-10927: - Attachment: HIVE-10927.2.patch Build was broken, uploading same patch again. Add number of HMS/HS2 connection metrics Key: HIVE-10927 URL: https://issues.apache.org/jira/browse/HIVE-10927 Project: Hive Issue Type: Sub-task Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 1.3.0 Attachments: HIVE-10927.2.patch, HIVE-10927.2.patch, HIVE-10927.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11171) Join reordering algorithm might introduce projects between joins
[ https://issues.apache.org/jira/browse/HIVE-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616572#comment-14616572 ] Jesus Camacho Rodriguez commented on HIVE-11171: Changes in q files LGTM. All of them are join input swaps, resulting in removal of Select operator on top of the join. Further, in some cases multijoin merge is triggered where it was not triggered before e.g. join_merging.q. Join reordering algorithm might introduce projects between joins Key: HIVE-11171 URL: https://issues.apache.org/jira/browse/HIVE-11171 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11171.01.patch, HIVE-11171.02.patch, HIVE-11171.03.patch, HIVE-11171.5.patch, HIVE-11171.patch, HIVE-11171.patch Join reordering algorithm might introduce projects between joins which causes multijoin optimization in SemanticAnalyzer to not kick in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11171) Join reordering algorithm might introduce projects between joins
[ https://issues.apache.org/jira/browse/HIVE-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14616573#comment-14616573 ] Jesus Camacho Rodriguez commented on HIVE-11171: Thanks [~ashutoshc]! Join reordering algorithm might introduce projects between joins Key: HIVE-11171 URL: https://issues.apache.org/jira/browse/HIVE-11171 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11171.01.patch, HIVE-11171.02.patch, HIVE-11171.03.patch, HIVE-11171.5.patch, HIVE-11171.patch, HIVE-11171.patch Join reordering algorithm might introduce projects between joins which causes multijoin optimization in SemanticAnalyzer to not kick in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10931) Wrong columns selected on multiple joins
[ https://issues.apache.org/jira/browse/HIVE-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616478#comment-14616478 ] Furcy Pin commented on HIVE-10931: -- I tried to reproduce on a local environment on the release-1.2.1 branch, and the bug has disappeared. I guess the issue can be closed. Wrong columns selected on multiple joins Key: HIVE-10931 URL: https://issues.apache.org/jira/browse/HIVE-10931 Project: Hive Issue Type: Bug Affects Versions: 1.1.0 Environment: Cloudera cdh5.4.2 Reporter: Furcy Pin The following set of queries: {code:sql} DROP TABLE IF EXISTS test1 ; DROP TABLE IF EXISTS test2 ; DROP TABLE IF EXISTS test3 ; CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ; INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,'A') ; CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ; INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,'X') ; CREATE TABLE test3 (coL1 STRING) ; INSERT INTO TABLE test3 VALUES ('A') ; SELECT T2.val FROM test1 T1 LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,'') as val FROM test2) T2 ON T2.col1 = T1.col1 LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ; {code} will return this: {noformat} +--------+ | t2.val | +--------+ | A | +--------+ {noformat} Obviously, this result is wrong as table `test2` contains an 'X' and no 'A'. This is the most minimal example we found of this issue; in particular, having less than 6 columns in the tables will work, for instance: {code:sql} SELECT T2.val FROM test1 T1 LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,'') as val FROM test2) T2 ON T2.col1 = T1.col1 LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ; {code} (same query as before, but `col5` was removed from the select) will return: {noformat} +--------+ | t2.val | +--------+ | X | +--------+ {noformat} Removing the `COALESCE` also removes the bug... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
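For reference, SQL COALESCE returns its first non-null argument, so the COALESCE over test2's col6 (which holds 'X') can only yield 'X' for the matching row — never the 'A' that ends up in the output, which is what makes the result above demonstrably wrong. A minimal Java sketch of those semantics (a hypothetical helper, not Hive's implementation):

```java
// Hedged sketch of SQL COALESCE semantics (hypothetical helper, not Hive
// code): return the first non-null argument, or null if all are null.
public class CoalesceSketch {
    @SafeVarargs
    public static <T> T coalesce(T... values) {
        for (T v : values) {
            if (v != null) {
                return v;
            }
        }
        return null;
    }
}
```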
[jira] [Updated] (HIVE-11171) Join reordering algorithm might introduce projects between joins
[ https://issues.apache.org/jira/browse/HIVE-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11171: Attachment: HIVE-11171.5.patch Updated golden files. Join reordering algorithm might introduce projects between joins Key: HIVE-11171 URL: https://issues.apache.org/jira/browse/HIVE-11171 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11171.01.patch, HIVE-11171.02.patch, HIVE-11171.03.patch, HIVE-11171.5.patch, HIVE-11171.patch, HIVE-11171.patch Join reordering algorithm might introduce projects between joins which causes multijoin optimization in SemanticAnalyzer to not kick in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10931) Wrong columns selected on multiple joins
[ https://issues.apache.org/jira/browse/HIVE-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Furcy Pin updated HIVE-10931: - Fix Version/s: 1.2.1 Wrong columns selected on multiple joins Key: HIVE-10931 URL: https://issues.apache.org/jira/browse/HIVE-10931 Project: Hive Issue Type: Bug Affects Versions: 1.1.0 Environment: Cloudera cdh5.4.2 Reporter: Furcy Pin Fix For: 1.2.1 The following set of queries: {code:sql} DROP TABLE IF EXISTS test1 ; DROP TABLE IF EXISTS test2 ; DROP TABLE IF EXISTS test3 ; CREATE TABLE test1 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ; INSERT INTO TABLE test1 VALUES (1,NULL,NULL,NULL,NULL,'A') ; CREATE TABLE test2 (col1 INT, col2 STRING, col3 STRING, coL4 STRING, coL5 STRING, col6 STRING) ; INSERT INTO TABLE test2 VALUES (1,NULL,NULL,NULL,NULL,'X') ; CREATE TABLE test3 (coL1 STRING) ; INSERT INTO TABLE test3 VALUES ('A') ; SELECT T2.val FROM test1 T1 LEFT JOIN (SELECT col1, col2, col3, col4, col5, COALESCE(col6,'') as val FROM test2) T2 ON T2.col1 = T1.col1 LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ; {code} will return this: {noformat} +--------+ | t2.val | +--------+ | A | +--------+ {noformat} Obviously, this result is wrong as table `test2` contains an 'X' and no 'A'. This is the most minimal example we found of this issue; in particular, having less than 6 columns in the tables will work, for instance: {code:sql} SELECT T2.val FROM test1 T1 LEFT JOIN (SELECT col1, col2, col3, col4, COALESCE(col6,'') as val FROM test2) T2 ON T2.col1 = T1.col1 LEFT JOIN test3 T3 ON T3.col1 = T1.col6 ; {code} (same query as before, but `col5` was removed from the select) will return: {noformat} +--------+ | t2.val | +--------+ | X | +--------+ {noformat} Removing the `COALESCE` also removes the bug... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11153) LLAP: SIGSEGV in Off-heap decompression routines
[ https://issues.apache.org/jira/browse/HIVE-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-11153. - Resolution: Not A Problem LLAP: SIGSEGV in Off-heap decompression routines Key: HIVE-11153 URL: https://issues.apache.org/jira/browse/HIVE-11153 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: llap Reporter: Gopal V Assignee: Sergey Shelukhin Attachments: llap-cn105-coredump.log LLAP started with {code} ./dist/hive/bin/hive --service llap --cache 57344m --executors 16 --size 131072m --xmx 65536m --name llap0 --loglevel WARN --instances 1 {code} Running date_dim filters from query27 with the large cache enabled. {code} R13=0x7f2ca9d15ca0 is pointing into the stack for thread: 0x7f2d4cece800 R14=0x7f3d7e2bfc00: offset 0xf9dc00 in /usr/jdk64/jdk1.8.0_40/jre/lib/amd64/server/libjvm.so at 0x7f3d7d322000 R15=0x7f3d7e2bb6a0: offset 0xf996a0 in /usr/jdk64/jdk1.8.0_40/jre/lib/amd64/server/libjvm.so at 0x7f3d7d322000 Stack: [0x7f2ca9c17000,0x7f2ca9d18000], sp=0x7f2ca9d15ca0, free space=1019k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x6daca3] jni_GetStaticObjectField+0xc3 C [libhadoop.so.1.0.0+0x100e9] Java_org_apache_hadoop_io_compress_zlib_ZlibDecompressor_inflateBytesDirect+0x49 C 0x7f2ca9d15e60 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) j org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect()I+0 j org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateDirect(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I+93 j org.apache.hadoop.io.compress.zlib.ZlibDecompressor$ZlibDirectDecompressor.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+72 j org.apache.hadoop.hive.shims.ZeroCopyShims$DirectDecompressorAdapter.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+6 j org.apache.hadoop.hive.ql.io.orc.ZlibCodec.directDecompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+15 j 
org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+17 j org.apache.hadoop.hive.ql.io.orc.InStream.decompressChunk(Ljava/nio/ByteBuffer;Lorg/apache/hadoop/hive/ql/io/orc/CompressionCodec;Ljava/nio/ByteBuffer;)V+14 j org.apache.hadoop.hive.ql.io.orc.InStream.readEncodedStream(JJLorg/apache/hadoop/hive/common/DiskRangeList;JJLorg/apache/hadoop/hive/shims/HadoopShims$ZeroCopyReaderShim;Lorg/apache/hadoop/hive/ql/io/orc/CompressionCodec;ILorg/apache/hadoop/hive/llap/io/api/cache/LowLevelCache;Lorg/apache/hadoop/hive/llap/io/api/EncodedColumnBatch$StreamBuffer;JJLorg/apache/hadoop/hive/llap/counters/LowLevelCacheCounters;)Lorg/apache/hadoop/hive/common/DiskRangeList;+376 j org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(ILorg/apache/hadoop/hive/ql/io/orc/StripeInformation;[Lorg/apache/hadoop/hive/ql/io/orc/OrcProto$RowIndex;Ljava/util/List;Ljava/util/List;[Z[[Z)V+2079 j org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal()Ljava/lang/Void;+1244 j org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal()Ljava/lang/Object;+1 j org.apache.hadoop.hive.common.CallableWithNdc.call()Ljava/lang/Object;+8 j java.util.concurrent.FutureTask.run()V+42 j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95 j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 j java.lang.Thread.run()V+11 v ~StubRoutines::call_stub {code} Always reproducible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10997) LLAP: make sure tests pass #2
[ https://issues.apache.org/jira/browse/HIVE-10997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10997: Attachment: HIVE-10997.02.patch Another attempt after bunch of fixes.. hopefully HiveQA will work :) LLAP: make sure tests pass #2 - Key: HIVE-10997 URL: https://issues.apache.org/jira/browse/HIVE-10997 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10997.01.patch, HIVE-10997.02.patch, HIVE-10997.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11199) LLAP: merge master into branch
[ https://issues.apache.org/jira/browse/HIVE-11199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-11199. - Resolution: Fixed LLAP: merge master into branch -- Key: HIVE-11199 URL: https://issues.apache.org/jira/browse/HIVE-11199 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10927) Add number of HMS/HS2 connection metrics
[ https://issues.apache.org/jira/browse/HIVE-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617840#comment-14617840 ] Hive QA commented on HIVE-10927: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744102/HIVE-10927.2.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9138 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_merge_multi_expressions org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_merge_multi_expressions org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_louter_join_ppr org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_outer_join_ppr {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4527/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4527/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4527/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12744102 - PreCommit-HIVE-TRUNK-Build Add number of HMS/HS2 connection metrics Key: HIVE-10927 URL: https://issues.apache.org/jira/browse/HIVE-10927 Project: Hive Issue Type: Sub-task Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 1.3.0 Attachments: HIVE-10927.2.patch, HIVE-10927.2.patch, HIVE-10927.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11197) While extracting join conditions follow Hive rules for type conversion instead of Calcite
[ https://issues.apache.org/jira/browse/HIVE-11197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617844#comment-14617844 ] Hive QA commented on HIVE-11197: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12744103/HIVE-11197.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4528/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4528/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4528/ Messages: {noformat} This message was trimmed, see log for full details [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] Copying 1 resource [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to /data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.16:test 
(default-test) @ spark-client --- [INFO] Tests are skipped. [INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector 
expression test code [INFO] Executed tasks [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added. [INFO] [INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec --- [INFO] ANTLR: Processing source directory /data/hive-ptest/working/apache-github-source-source/ql/src/java
[jira] [Commented] (HIVE-11198) Fix load data query file format check for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-11198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617783#comment-14617783 ] Prasanth Jayachandran commented on HIVE-11198: -- [~sushanth] Can you take a look at this patch? Fix load data query file format check for partitioned tables Key: HIVE-11198 URL: https://issues.apache.org/jira/browse/HIVE-11198 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11198.patch HIVE-8 added a file format check for the ORC format. The check throws an exception when a non-ORC file is loaded into an ORC managed table, but it does not work for partitioned tables. Partitioned tables are allowed to have some partitions with a different file format. See this discussion for more details: https://issues.apache.org/jira/browse/HIVE-8?focusedCommentId=14617271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14617271 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
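The shape of the fix described above is to resolve the expected input format against the target partition when one exists, instead of always using the table-level format, since partitions may legitimately carry their own file format. A hedged Java sketch (all names hypothetical, not Hive's SemanticAnalyzer code):

```java
import java.util.Map;

// Hedged sketch (hypothetical names): a load-time format check that consults
// the destination partition's own format when set, falling back to the
// table-level format otherwise.
public class LoadFormatCheckSketch {

    // Effective format for a load: the partition's own format wins when set.
    public static String effectiveFormat(String tableFormat,
                                         Map<String, String> partitionProps) {
        if (partitionProps != null && partitionProps.containsKey("file.format")) {
            return partitionProps.get("file.format");
        }
        return tableFormat;
    }

    // The check itself: reject the load when the incoming file's format does
    // not match the effective format of the destination.
    public static boolean loadAllowed(String incomingFormat, String tableFormat,
                                      Map<String, String> partitionProps) {
        return incomingFormat.equals(effectiveFormat(tableFormat, partitionProps));
    }
}
```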
[jira] [Commented] (HIVE-10927) Add number of HMS/HS2 connection metrics
[ https://issues.apache.org/jira/browse/HIVE-10927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617860#comment-14617860 ] Szehon Ho commented on HIVE-10927: -- [~jxiang] the test failures are not related, can you take a look at the new review? Thanks Add number of HMS/HS2 connection metrics Key: HIVE-10927 URL: https://issues.apache.org/jira/browse/HIVE-10927 Project: Hive Issue Type: Sub-task Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 1.3.0 Attachments: HIVE-10927.2.patch, HIVE-10927.2.patch, HIVE-10927.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11153) LLAP: SIGSEGV in Off-heap decompression routines
[ https://issues.apache.org/jira/browse/HIVE-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617866#comment-14617866 ] Gopal V commented on HIVE-11153: Yes, the cluster where the queries succeed are running hadoop-2.8.0-SNAPSHOT builds. LLAP: SIGSEGV in Off-heap decompression routines Key: HIVE-11153 URL: https://issues.apache.org/jira/browse/HIVE-11153 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: llap Reporter: Gopal V Assignee: Sergey Shelukhin Attachments: llap-cn105-coredump.log LLAP started with {code} ./dist/hive/bin/hive --service llap --cache 57344m --executors 16 --size 131072m --xmx 65536m --name llap0 --loglevel WARN --instances 1 {code} Running date_dim filters from query27 with the large cache enabled. {code} R13=0x7f2ca9d15ca0 is pointing into the stack for thread: 0x7f2d4cece800 R14=0x7f3d7e2bfc00: offset 0xf9dc00 in /usr/jdk64/jdk1.8.0_40/jre/lib/amd64/server/libjvm.so at 0x7f3d7d322000 R15=0x7f3d7e2bb6a0: offset 0xf996a0 in /usr/jdk64/jdk1.8.0_40/jre/lib/amd64/server/libjvm.so at 0x7f3d7d322000 Stack: [0x7f2ca9c17000,0x7f2ca9d18000], sp=0x7f2ca9d15ca0, free space=1019k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x6daca3] jni_GetStaticObjectField+0xc3 C [libhadoop.so.1.0.0+0x100e9] Java_org_apache_hadoop_io_compress_zlib_ZlibDecompressor_inflateBytesDirect+0x49 C 0x7f2ca9d15e60 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) j org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateBytesDirect()I+0 j org.apache.hadoop.io.compress.zlib.ZlibDecompressor.inflateDirect(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)I+93 j org.apache.hadoop.io.compress.zlib.ZlibDecompressor$ZlibDirectDecompressor.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+72 j org.apache.hadoop.hive.shims.ZeroCopyShims$DirectDecompressorAdapter.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+6 j 
org.apache.hadoop.hive.ql.io.orc.ZlibCodec.directDecompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+15 j org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(Ljava/nio/ByteBuffer;Ljava/nio/ByteBuffer;)V+17 j org.apache.hadoop.hive.ql.io.orc.InStream.decompressChunk(Ljava/nio/ByteBuffer;Lorg/apache/hadoop/hive/ql/io/orc/CompressionCodec;Ljava/nio/ByteBuffer;)V+14 j org.apache.hadoop.hive.ql.io.orc.InStream.readEncodedStream(JJLorg/apache/hadoop/hive/common/DiskRangeList;JJLorg/apache/hadoop/hive/shims/HadoopShims$ZeroCopyReaderShim;Lorg/apache/hadoop/hive/ql/io/orc/CompressionCodec;ILorg/apache/hadoop/hive/llap/io/api/cache/LowLevelCache;Lorg/apache/hadoop/hive/llap/io/api/EncodedColumnBatch$StreamBuffer;JJLorg/apache/hadoop/hive/llap/counters/LowLevelCacheCounters;)Lorg/apache/hadoop/hive/common/DiskRangeList;+376 j org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(ILorg/apache/hadoop/hive/ql/io/orc/StripeInformation;[Lorg/apache/hadoop/hive/ql/io/orc/OrcProto$RowIndex;Ljava/util/List;Ljava/util/List;[Z[[Z)V+2079 j org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal()Ljava/lang/Void;+1244 j org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal()Ljava/lang/Object;+1 j org.apache.hadoop.hive.common.CallableWithNdc.call()Ljava/lang/Object;+8 j java.util.concurrent.FutureTask.run()V+42 j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95 j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 j java.lang.Thread.run()V+11 v ~StubRoutines::call_stub {code} Always reproducible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11053) Add more tests for HIVE-10844[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GaoLun updated HIVE-11053: -- Attachment: HIVE-11053.5-spark.patch Add more tests for HIVE-10844[Spark Branch] --- Key: HIVE-11053 URL: https://issues.apache.org/jira/browse/HIVE-11053 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chengxiang Li Assignee: GaoLun Priority: Minor Attachments: HIVE-11053.1-spark.patch, HIVE-11053.2-spark.patch, HIVE-11053.3-spark.patch, HIVE-11053.4-spark.patch, HIVE-11053.5-spark.patch Add some test cases for self union, self-join, CTE, and repeated sub-queries to verify the job of combining equivalent works in HIVE-10844. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10535) LLAP: Cleanup map join cache when a query completes
[ https://issues.apache.org/jira/browse/HIVE-10535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617744#comment-14617744 ] Sergey Shelukhin commented on HIVE-10535: - Grrr, this doesn't work because the execution code just passes nulls everywhere for queryId. Great. LLAP: Cleanup map join cache when a query completes --- Key: HIVE-10535 URL: https://issues.apache.org/jira/browse/HIVE-10535 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Sergey Shelukhin Fix For: llap Attachments: HIVE-10533.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11053) Add more tests for HIVE-10844[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GaoLun updated HIVE-11053: -- Attachment: HIVE-11053.4-spark.patch Add more tests for HIVE-10844[Spark Branch] --- Key: HIVE-11053 URL: https://issues.apache.org/jira/browse/HIVE-11053 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chengxiang Li Assignee: GaoLun Priority: Minor Attachments: HIVE-11053.1-spark.patch, HIVE-11053.2-spark.patch, HIVE-11053.3-spark.patch, HIVE-11053.4-spark.patch Add some test cases for self union, self-join, CTE, and repeated sub-queries to verify the job of combining equivalent works in HIVE-10844. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11181) Update people change for new PMC members
[ https://issues.apache.org/jira/browse/HIVE-11181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved HIVE-11181. - Resolution: Fixed Committed. Thanks Ferdinand for the review. Update people change for new PMC members Key: HIVE-11181 URL: https://issues.apache.org/jira/browse/HIVE-11181 Project: Hive Issue Type: Task Components: Website Reporter: Chao Sun Assignee: Chao Sun Attachments: HIVE-11181.patch As suggested in the title. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10755) Rework on HIVE-5193 to enhance the column oriented table access
[ https://issues.apache.org/jira/browse/HIVE-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10755: Attachment: HIVE-10755.patch Rework on HIVE-5193 to enhance the column oriented table access -- Key: HIVE-10755 URL: https://issues.apache.org/jira/browse/HIVE-10755 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10755.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11132) Queries using join and group by produce incorrect output when hive.auto.convert.join=false and hive.optimize.reducededuplication=true
[ https://issues.apache.org/jira/browse/HIVE-11132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rich Haase resolved HIVE-11132. --- Resolution: Won't Fix Assignee: Rich Haase The interaction between these two parameters is undesirable, but rare enough that it's probably not worth the effort of fixing. This JIRA can serve as documentation of the problem for anyone who encounters it in future. Queries using join and group by produce incorrect output when hive.auto.convert.join=false and hive.optimize.reducededuplication=true - Key: HIVE-11132 URL: https://issues.apache.org/jira/browse/HIVE-11132 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Rich Haase Assignee: Rich Haase Queries using join and group by produce multiple output rows with the same key when hive.auto.convert.join=false and hive.optimize.reducededuplication=true. This interaction between configuration parameters is unexpected and should be well documented at the very least and should likely be considered a bug. e.g. hive> set hive.auto.convert.join = false; hive> set hive.optimize.reducededuplication = true; hive> SELECT foo.id, count(*) as factor FROM foo JOIN bar ON (foo.id = bar.id and foo.line_id = bar.line_id) JOIN split ON (foo.id = split.id and foo.line_id = split.line_id) JOIN forecast ON (foo.id = forecast.id AND foo.line_id = forecast.line_id) WHERE foo.order != 'blah' AND foo.id = 'XYZ' GROUP BY foo.id; XYZ 79 XYZ 74 XYZ 297 XYZ 66 hive> set hive.auto.convert.join = true; hive> set hive.optimize.reducededuplication = true; hive> SELECT foo.id, count(*) as factor FROM foo JOIN bar ON (foo.id = bar.id and foo.line_id = bar.line_id) JOIN split ON (foo.id = split.id and foo.line_id = split.line_id) JOIN forecast ON (foo.id = forecast.id AND foo.line_id = forecast.line_id) WHERE foo.order != 'blah' AND foo.id = 'XYZ' GROUP BY foo.id; XYZ 516 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
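The symptom — several output rows for one GROUP BY key — has the shape of a partial, per-partition aggregation whose final merge step never runs. A rough Java analogy (it does not model Hive's reduce-deduplication optimization itself; all names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: each partition emits its own (key, count) row --
// several rows for one group-by key, like the XYZ rows above -- while merging
// the partials yields the single correct total.
public class GroupBySketch {

    // Partial aggregation: one count row per partition that contains the key.
    public static List<Long> partialCounts(List<List<String>> partitions, String key) {
        List<Long> rows = new ArrayList<>();
        for (List<String> partition : partitions) {
            long c = partition.stream().filter(key::equals).count();
            if (c > 0) {
                rows.add(c);
            }
        }
        return rows;
    }

    // Final merge: collapse the partial rows into one total for the key.
    public static long mergedCount(List<Long> partialRows) {
        return partialRows.stream().mapToLong(Long::longValue).sum();
    }

    // Demo over three partitions holding rows for key "XYZ": returns
    // {number of partial rows emitted, merged total}.
    public static long[] demo() {
        List<List<String>> partitions = List.of(
                List.of("XYZ", "XYZ"), List.of("XYZ"), List.of("ABC"));
        List<Long> partial = partialCounts(partitions, "XYZ");
        return new long[]{partial.size(), mergedCount(partial)};
    }
}
```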
[jira] [Updated] (HIVE-10755) Rework on HIVE-5193 to enhance the column oriented table access
[ https://issues.apache.org/jira/browse/HIVE-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10755: Fix Version/s: 2.0.0 Rework on HIVE-5193 to enhance the column oriented table access -- Key: HIVE-10755 URL: https://issues.apache.org/jira/browse/HIVE-10755 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Fix For: 2.0.0 Attachments: HIVE-10755.patch Add the support of column pruning for column oriented table access which was done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. In 1.2.0, the patch posted by Viray didn't work, probably due to some jar reference. That seems to have been fixed, and the patch works in 2.0 now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10895) ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources
[ https://issues.apache.org/jira/browse/HIVE-10895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1461#comment-1461 ] Aihua Xu commented on HIVE-10895: - Thanks [~vgumashta] for doing the testing. ObjectStore does not close Query objects in some calls, causing a potential leak in some metastore db resources --- Key: HIVE-10895 URL: https://issues.apache.org/jira/browse/HIVE-10895 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Takahiko Saito Assignee: Aihua Xu Attachments: HIVE-10895.1.patch, HIVE-10895.2.patch, HIVE-10895.3.patch During testing, we've noticed Oracle db running out of cursors. Might be related to this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
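The leak pattern behind this issue is generic: a query handle pins a database cursor until it is explicitly closed, so any code path that returns or throws before closing strands a cursor. A hedged Java sketch of the two calling shapes (QueryHandle is a hypothetical stand-in, not the JDO Query class ObjectStore actually uses):

```java
import java.util.HashSet;
import java.util.Set;

// Hedged sketch: each open handle "pins a cursor" (membership in OPEN_CURSORS)
// until close() runs; try-with-resources releases it even on exceptions.
public class QueryCloseSketch {
    static final Set<QueryHandle> OPEN_CURSORS = new HashSet<>();

    static class QueryHandle implements AutoCloseable {
        QueryHandle() { OPEN_CURSORS.add(this); }
        int execute() { return 2; }               // pretend two rows matched
        @Override public void close() { OPEN_CURSORS.remove(this); }
    }

    // Leaky shape: nothing closes the handle, so the cursor stays pinned.
    public static int leakyCall() {
        QueryHandle q = new QueryHandle();
        return q.execute();
    }

    // Safe shape: the handle is always closed when the block exits.
    public static int safeCall() {
        try (QueryHandle q = new QueryHandle()) {
            return q.execute();
        }
    }
}
```

On a database like Oracle with a bounded cursor pool, repeated leaky calls eventually exhaust the pool — matching the "running out of cursors" symptom in the report.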
[jira] [Commented] (HIVE-10791) Beeline-CLI: Implement in-place update UI for CLI compatibility
[ https://issues.apache.org/jira/browse/HIVE-10791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616678#comment-14616678 ] Xuefu Zhang commented on HIVE-10791: [~Ferd], I think [~gopalv] meant the job status tracking shown by Hive CLI. Refer to HIVE-8495. [~gopalv], did you mean that HIVE-8495 was only implemented for Hive CLI? If so, don't you think that the feature was incomplete in a certain sense, and that it might be a better idea for the original dev to support that feature for BeeLine as well? I knew of the feature and saw it in Hive CLI, but I'm not sure if the feature is also in BeeLine as it should be. Beeline-CLI: Implement in-place update UI for CLI compatibility --- Key: HIVE-10791 URL: https://issues.apache.org/jira/browse/HIVE-10791 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: beeline-cli-branch Reporter: Gopal V Priority: Critical The current CLI implementation has an in-place updating UI which offers a clear picture of execution runtime and failures. This is designed for large DAGs which have more than 10 vertices, where the old UI would scroll sideways. The new CLI implementation needs to keep up the usability standards set by the old one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10755) Rework on HIVE-5193 to enhance the column oriented table access
[ https://issues.apache.org/jira/browse/HIVE-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10755: Description: Add the support of column pruning for column oriented table access which was done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. In 1.2.0, the patch posted by Vij Rework on HIVE-5193 to enhance the column oriented table access -- Key: HIVE-10755 URL: https://issues.apache.org/jira/browse/HIVE-10755 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10755.patch Add the support of column pruning for column oriented table access which was done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. In 1.2.0, the patch posted by Vij -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2903) Numeric binary type keys are not compared properly
[ https://issues.apache.org/jira/browse/HIVE-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Svetozar Ivanov updated HIVE-2903: -- Assignee: Navis (was: Nick Dimiduk) Numeric binary type keys are not compared properly -- Key: HIVE-2903 URL: https://issues.apache.org/jira/browse/HIVE-2903 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2903.D2481.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2903.D2481.2.patch In the current binary format for numbers, negative values always compare greater than positive values, for example: {code} System.out.println(Bytes.compareTo(Bytes.toBytes(-100), Bytes.toBytes(100))); // 255 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10755) Rework on HIVE-5193 to enhance the column oriented table access
[ https://issues.apache.org/jira/browse/HIVE-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10755: Description: Add the support of column pruning for column oriented table access which was done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. In 1.3.0, the patch posted by Viray didn't work, probably due to some jar reference. That seems to get fixed and that patch works in 2.0.0 now. was: Add the support of column pruning for column oriented table access which was done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. In 1.3.0, the patch posted by Viray didn't work, probably due to some jar reference. That seems to get fixed and that patch works in 2.0 now. Rework on HIVE-5193 to enhance the column oriented table access -- Key: HIVE-10755 URL: https://issues.apache.org/jira/browse/HIVE-10755 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Fix For: 2.0.0 Attachments: HIVE-10755.patch Add the support of column pruning for column oriented table access which was done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. In 1.3.0, the patch posted by Viray didn't work, probably due to some jar reference. That seems to get fixed and that patch works in 2.0.0 now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9545) Build FAILURE with IBM JVM
[ https://issues.apache.org/jira/browse/HIVE-9545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616671#comment-14616671 ] pascal oliva commented on HIVE-9545: Did you plan to integrate this patch? In which level? Build FAILURE with IBM JVM --- Key: HIVE-9545 URL: https://issues.apache.org/jira/browse/HIVE-9545 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Environment: mvn -version Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 2014-08-11T22:58:10+02:00) Maven home: /opt/apache-maven-3.2.3 Java version: 1.7.0, vendor: IBM Corporation Java home: /usr/lib/jvm/ibm-java-x86_64-71/jre Default locale: en_US, platform encoding: ISO-8859-1 OS name: linux, version: 3.10.0-123.4.4.el7.x86_64, arch: amd64, family: unix Reporter: pascal oliva Assignee: Navis Attachments: HIVE-9545.1.patch.txt NO PRECOMMIT TESTS With the use of IBM JVM environment: [root@dorado-vm2 hive]# java -version java version 1.7.0 Java(TM) SE Runtime Environment (build pxa6470_27sr2-20141026_01(SR2)) IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20141017_217728 (JIT enabled, AOT enabled). The build failed on [INFO] Hive Query Language FAILURE [ 50.053 s] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-exec: Compilation failure: Compilation failure: [ERROR] /home/pascal/hive0.14/hive/ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java:[29,26] package com.sun.management does not exist. HOWTO : #git clone -b branch-0.14 https://github.com/apache/hive.git #cd hive #mvn install -DskipTests -Phadoop-2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2903) Numeric binary type keys are not compared properly
[ https://issues.apache.org/jira/browse/HIVE-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Svetozar Ivanov updated HIVE-2903: -- Assignee: Nick Dimiduk (was: Navis) Numeric binary type keys are not compared properly -- Key: HIVE-2903 URL: https://issues.apache.org/jira/browse/HIVE-2903 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Nick Dimiduk Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2903.D2481.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2903.D2481.2.patch In the current binary format for numbers, negative values always compare greater than positive values, for example: {code} System.out.println(Bytes.compareTo(Bytes.toBytes(-100), Bytes.toBytes(100))); // 255 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)