[jira] [Commented] (HIVE-9738) create SOUNDEX udf
[ https://issues.apache.org/jira/browse/HIVE-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339739#comment-14339739 ] Hive QA commented on HIVE-9738: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701179/HIVE-9738.2.patch {color:green}SUCCESS:{color} +1 7572 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2891/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2891/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2891/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12701179 - PreCommit-HIVE-TRUNK-Build create SOUNDEX udf -- Key: HIVE-9738 URL: https://issues.apache.org/jira/browse/HIVE-9738 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Attachments: HIVE-9738.1.patch, HIVE-9738.2.patch Soundex is an encoding used to relate similar names, but it can also be used as a general-purpose scheme to find words with similar phonemes. The American Soundex System The soundex code consists of the first letter of the name followed by three digits. These three digits are determined by dropping the letters a, e, i, o, u, h, w and y and adding three digits from the remaining letters of the name according to the table below. There are only two additional rules. (1) If two or more consecutive letters have the same code, they are coded as one letter. (2) If there is an insufficient number of letters to make the three digits, the remaining digits are set to zero.
Soundex Table:
1 - b, f, p, v
2 - c, g, j, k, q, s, x, z
3 - d, t
4 - l
5 - m, n
6 - r
Examples:
Miller M460
Peterson P362
Peters P362
Auerbach A612
Uhrbach U612
Moskowitz M232
Moskovitz M213
Implementation: http://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/language/Soundex.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
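The rules above can be sketched directly. This is a minimal illustration of the American Soundex algorithm as described in the ticket, not the actual HIVE-9738 implementation (which delegates to `org.apache.commons.codec.language.Soundex`); the class and method names here are illustrative.

```java
public class SoundexSketch {
    // Digit for each letter A..Z; '0' marks dropped letters (a, e, i, o, u, h, w, y).
    private static final String CODES = "01230120022455012623010202";

    public static String soundex(String name) {
        if (name == null || name.isEmpty()) return name;
        String s = name.toUpperCase();
        StringBuilder out = new StringBuilder();
        out.append(s.charAt(0));                       // keep the first letter
        char prev = CODES.charAt(s.charAt(0) - 'A');
        for (int i = 1; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') continue;          // assumes alphabetic input
            char code = CODES.charAt(c - 'A');
            // Rule 1: consecutive letters with the same code are coded as one letter.
            if (code != '0' && code != prev) out.append(code);
            prev = code;
        }
        // Rule 2: too few letters -> pad the remaining digits with zeros.
        while (out.length() < 4) out.append('0');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("Miller"));    // M460
        System.out.println(soundex("Auerbach"));  // A612
    }
}
```

The examples in the table (Miller M460, Peterson P362, Auerbach A612, Moskowitz M232) all reproduce under these two rules.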
[jira] [Updated] (HIVE-9807) LLAP: Add event logging for execution elements
[ https://issues.apache.org/jira/browse/HIVE-9807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-9807: - Attachment: HIVE-9807.1.patch Sample log lines - additional details will be populated in a later patch, when available. {code} Event=FRAGMENT_START, HostName=hw10890, ApplicationId=application_1425008147866_0006, ContainerId=container_1_0006_01_01, DagName=null, VertexName=null, TaskId=-1, TaskAttemptId=-1, SubmitTime=1425020533780 Event=FRAGMENT_END, HostName=hw10890, ApplicationId=application_1425008147866_0006, ContainerId=container_1_0006_01_01, DagName=null, VertexName=null, TaskId=-1, TaskAttemptId=-1, Succeeded=true, StartTime=1425020533779, EndTime=1425020535678 {code} cc [~gopalv] LLAP: Add event logging for execution elements -- Key: HIVE-9807 URL: https://issues.apache.org/jira/browse/HIVE-9807 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: HIVE-9807.1.patch For analysis of runtimes, interleaving etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
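The sample log lines above are flat `Key=Value` pairs, which makes them easy to consume for the runtime/interleaving analysis the ticket mentions. A minimal sketch of a consumer (not part of the HIVE-9807 patch; the class name is illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LlapEventLogParser {
    // Splits "Event=FRAGMENT_END, HostName=hw10890, ..." into an ordered map.
    public static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : line.split(",\\s*")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                fields.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "Event=FRAGMENT_END, HostName=hw10890, Succeeded=true, "
                + "StartTime=1425020533779, EndTime=1425020535678";
        Map<String, String> f = parse(line);
        // Fragment duration in milliseconds, from the start/end timestamps.
        long durationMs = Long.parseLong(f.get("EndTime")) - Long.parseLong(f.get("StartTime"));
        System.out.println(f.get("Event") + " took " + durationMs + " ms"); // FRAGMENT_END took 1899 ms
    }
}
```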
[jira] [Commented] (HIVE-9743) Incorrect result set for vectorized left outer join
[ https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339848#comment-14339848 ] Vikram Dixit K commented on HIVE-9743: -- +1 LGTM will commit it shortly. Incorrect result set for vectorized left outer join --- Key: HIVE-9743 URL: https://issues.apache.org/jira/browse/HIVE-9743 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.14.0 Reporter: N Campbell Assignee: Matt McCline Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, HIVE-9743.03.patch, HIVE-9743.04.patch This query is supposed to return 3 rows and will when run without Tez, but returns 2 rows when run with Tez.
select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2
from tjoin1 left outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2 15 )
tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2
1 20 25 null
2 null 50 null
instead of
tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2
0 10 15 null
1 20 25 null
2 null 50 null
create table if not exists TJOIN1 (RNUM int , C1 int, C2 int) STORED AS orc ;
0|10|15
1|20|25
2|\N|50
create table if not exists TJOIN2 (RNUM int , C1 int, C2 char(2)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS TEXTFILE ;
0|10|BB
1|15|DD
2|\N|EE
3|10|FF
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9788) Make double quote optional in tsv/csv/dsv output
[ https://issues.apache.org/jira/browse/HIVE-9788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339295#comment-14339295 ] Brock Noland commented on HIVE-9788: +1 Make double quote optional in tsv/csv/dsv output Key: HIVE-9788 URL: https://issues.apache.org/jira/browse/HIVE-9788 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: HIVE-9788.patch Similar to HIVE-7390, some customers would like the double quotes to be optional. So if the data is {{A}} then the output from beeline should be {{A}}, which is the same as the Hive CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
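The behavior being made configurable can be sketched as a single field-writing helper: with quoting disabled, the value is emitted verbatim (matching the Hive CLI); otherwise it is quoted and escaped in the usual CSV style. This is an illustrative sketch under assumed names, not BeeLine's actual API.

```java
public class DsvQuoting {
    // Writes one output field; `disableQuoting` corresponds to the option
    // HIVE-9788 proposes (name hypothetical).
    public static String writeField(String field, char delimiter, boolean disableQuoting) {
        boolean needsQuoting = field.indexOf(delimiter) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0;
        if (disableQuoting || !needsQuoting) {
            return field; // plain value, same as the Hive CLI output
        }
        // Standard CSV escaping: wrap in quotes, double any embedded quotes.
        return '"' + field.replace("\"", "\"\"") + '"';
    }
}
```

Note the trade-off: with quoting disabled, a field containing the delimiter is emitted ambiguously, which is exactly why the default keeps quoting on.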
[jira] [Updated] (HIVE-9768) Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata refresh/invalidate command
[ https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-9768: -- Description: Feature request for Hive LLAP to preload table metadata across all running nodes to reduce query latency (this is what Impala does). The design decision behind this in Impala was to avoid the latency overhead of fetching the metadata at query time, since that's an extra database query (or possibly HBase query in future HIVE-9452) that must first be completely fulfilled before the Hive LLAP query even starts to run, which would slow down the response to the user if not pre-loaded. Also, any temporary outage of the metadata layer would affect the speed of the LLAP layer, so pre-loading and caching the metadata adds resilience against this. This pre-loaded metadata also requires a cluster-wide refresh metadata operation, something Impala added later and now calls INVALIDATE METADATA in its SQL dialect. I propose using a more intuitive REFRESH METADATA Hive command instead. (Fyi I was in the original trio of Impala SMEs at Cloudera in early 2013) Regards, Hari Sekhon ex-Cloudera http://www.linkedin.com/in/harisekhon was: Feature request for Hive LLAP to preload table metadata across all running nodes to reduce query latency (this is what Impala does). The design decision behind this in Impala was to avoid the latency overhead of fetching the metadata at query time, since that's an extra database query (or possibly HBase query in future HIVE-9452) that must first be completely fulfilled before the Hive LLAP query even starts to run, which would slow down the response to the user if not pre-loaded. Also, any temporary outage of the metadata layer would affect the speed of the LLAP layer, so pre-loading and caching the metadata adds resilience against this. This pre-loaded metadata also requires a cluster-wide refresh metadata operation, something Impala added later and now calls INVALIDATE METADATA in its SQL dialect. 
I propose using a more intuitive REFRESH METADATA Hive command instead. (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013) Regards, Hari Sekhon ex-Cloudera http://www.linkedin.com/in/harisekhon Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata refresh/invalidate command --- Key: HIVE-9768 URL: https://issues.apache.org/jira/browse/HIVE-9768 Project: Hive Issue Type: New Feature Components: HCatalog, Metastore, Query Planning, Query Processor Affects Versions: llap Environment: HDP 2.2 Reporter: Hari Sekhon Feature request for Hive LLAP to preload table metadata across all running nodes to reduce query latency (this is what Impala does). The design decision behind this in Impala was to avoid the latency overhead of fetching the metadata at query time, since that's an extra database query (or possibly HBase query in future HIVE-9452) that must first be completely fulfilled before the Hive LLAP query even starts to run, which would slow down the response to the user if not pre-loaded. Also, any temporary outage of the metadata layer would affect the speed of the LLAP layer, so pre-loading and caching the metadata adds resilience against this. This pre-loaded metadata also requires a cluster-wide refresh metadata operation, something Impala added later and now calls INVALIDATE METADATA in its SQL dialect. I propose using a more intuitive REFRESH METADATA Hive command instead. (Fyi I was in the original trio of Impala SMEs at Cloudera in early 2013) Regards, Hari Sekhon ex-Cloudera http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9788) Make double quote optional in tsv/csv/dsv output
[ https://issues.apache.org/jira/browse/HIVE-9788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338688#comment-14338688 ] Hive QA commented on HIVE-9788: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701092/HIVE-9788.patch {color:green}SUCCESS:{color} +1 7572 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2882/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2882/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2882/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12701092 - PreCommit-HIVE-TRUNK-Build Make double quote optional in tsv/csv/dsv output Key: HIVE-9788 URL: https://issues.apache.org/jira/browse/HIVE-9788 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: HIVE-9788.patch Similar to HIVE-7390, some customers would like the double quotes to be optional. So if the data is {{A}} then the output from beeline should be {{A}}, which is the same as the Hive CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9797) Need update some spark tests for java 8
[ https://issues.apache.org/jira/browse/HIVE-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338699#comment-14338699 ] Brock Noland commented on HIVE-9797: +1 Need update some spark tests for java 8 --- Key: HIVE-9797 URL: https://issues.apache.org/jira/browse/HIVE-9797 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9797.1.patch The following tests fail on a Java 8 environment: TestMiniSparkOnYarnCliDriver.list_bucket_dml_10 TestSparkCliDriver.outer_join_ppr TestSparkCliDriver.vector_cast_constant -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9797) Need update some spark tests for java 8
[ https://issues.apache.org/jira/browse/HIVE-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338668#comment-14338668 ] Sergio Peña commented on HIVE-9797: --- Code is on Review Board. Please review. [~brocknoland] Need update some spark tests for java 8 --- Key: HIVE-9797 URL: https://issues.apache.org/jira/browse/HIVE-9797 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9797.1.patch The following tests fail on a Java 8 environment: TestMiniSparkOnYarnCliDriver.list_bucket_dml_10 TestSparkCliDriver.outer_join_ppr TestSparkCliDriver.vector_cast_constant -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9788) Make double quote optional in tsv/csv/dsv output
[ https://issues.apache.org/jira/browse/HIVE-9788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-9788: --- Attachment: HIVE-9788.patch Make double quote optional in tsv/csv/dsv output Key: HIVE-9788 URL: https://issues.apache.org/jira/browse/HIVE-9788 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: HIVE-9788.patch Similar to HIVE-7390 some customers would like the double quotes to be optional. So if the data is {{A}} then the output from beeline should be {{A}} which is the same as the Hive CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9253) MetaStore server should support timeout for long running requests
[ https://issues.apache.org/jira/browse/HIVE-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338330#comment-14338330 ] Hive QA commented on HIVE-9253: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701000/HIVE-9253.5.patch {color:green}SUCCESS:{color} +1 7571 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2880/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2880/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2880/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12701000 - PreCommit-HIVE-TRUNK-Build MetaStore server should support timeout for long running requests - Key: HIVE-9253 URL: https://issues.apache.org/jira/browse/HIVE-9253 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-9253.1.patch, HIVE-9253.2.patch, HIVE-9253.2.patch, HIVE-9253.3.patch, HIVE-9253.4.patch, HIVE-9253.5.patch, HIVE-9253.patch In the description of HIVE-7195, one issue is that the MetaStore client timeout is quite dumb: the client will time out and the server has no idea the client is gone. The server should support a timeout when the request from the client runs a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
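The server-side behavior described (abandon a request that exceeds a deadline, rather than letting it run on after the client has given up) is a generic pattern that can be sketched with `java.util.concurrent`. This is an illustrative sketch only, not the HIVE-9253 patch; the method name is hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RequestTimeout {
    // Runs `request` but gives up after `timeoutMs`, interrupting the worker
    // so the server does not keep doing work for a client that is gone.
    public static <T> T callWithDeadline(Callable<T> request, long timeoutMs)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(request);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the long-running request
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```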
[jira] [Updated] (HIVE-9796) CBO (Calcite Return Path): Add field nullable check to HiveJoinAddNotNullRule [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-9796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9796: -- Fix Version/s: (was: 1.2.0) cbo-branch CBO (Calcite Return Path): Add field nullable check to HiveJoinAddNotNullRule [CBO branch] -- Key: HIVE-9796 URL: https://issues.apache.org/jira/browse/HIVE-9796 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9785) CBO (Calcite Return Path): Translate Exchange to Hive Op [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9785: -- Fix Version/s: (was: 1.2.0) cbo-branch CBO (Calcite Return Path): Translate Exchange to Hive Op [CBO branch] - Key: HIVE-9785 URL: https://issues.apache.org/jira/browse/HIVE-9785 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-9785.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9516) Enable CBO related tests [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-9516: --- Attachment: HIVE-9516.2-spark.patch Enable CBO related tests [Spark Branch] --- Key: HIVE-9516 URL: https://issues.apache.org/jira/browse/HIVE-9516 Project: Hive Issue Type: Sub-task Components: spark-branch Affects Versions: spark-branch Reporter: Chao Assignee: Chinna Rao Lalam Attachments: HIVE-9516.1-spark.patch, HIVE-9516.2-spark.patch In the Spark branch we enabled CBO but haven't turned on CBO-related unit tests. We should do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9796) CBO (Calcite Return Path): Add field nullable check to HiveJoinAddNotNullRule [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-9796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9796: -- Attachment: HIVE-9796.cbo.patch [~ashutoshc], this patch contains a small fix for {{HiveJoinAddNotNullRule}}, which should check whether the key field is nullable before adding the filter. Thanks CBO (Calcite Return Path): Add field nullable check to HiveJoinAddNotNullRule [CBO branch] -- Key: HIVE-9796 URL: https://issues.apache.org/jira/browse/HIVE-9796 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-9796.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9516) Enable CBO related tests [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-9516: --- Attachment: (was: HIVE-9516.2-spark.patch) Enable CBO related tests [Spark Branch] --- Key: HIVE-9516 URL: https://issues.apache.org/jira/browse/HIVE-9516 Project: Hive Issue Type: Sub-task Components: spark-branch Affects Versions: spark-branch Reporter: Chao Assignee: Chinna Rao Lalam Attachments: HIVE-9516.1-spark.patch In the Spark branch we enabled CBO but haven't turned on CBO-related unit tests. We should do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9785) CBO (Calcite Return Path): Translate Exchange to Hive Op [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9785: -- Affects Version/s: cbo-branch CBO (Calcite Return Path): Translate Exchange to Hive Op [CBO branch] - Key: HIVE-9785 URL: https://issues.apache.org/jira/browse/HIVE-9785 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-9785.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9263) Implement controllable exit code in beeline
[ https://issues.apache.org/jira/browse/HIVE-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reassigned HIVE-9263: - Assignee: Chaoyu Tang Implement controllable exit code in beeline --- Key: HIVE-9263 URL: https://issues.apache.org/jira/browse/HIVE-9263 Project: Hive Issue Type: Improvement Components: Beeline Reporter: Johndee Burks Assignee: Chaoyu Tang Priority: Minor It would be nice if beeline implemented something like SQL*Plus's WHENEVER to control exit codes. This would be useful when performing beeline actions through a shell script. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9774) Print yarn application id to console [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam reassigned HIVE-9774: -- Assignee: Chinna Rao Lalam Print yarn application id to console [Spark Branch] --- Key: HIVE-9774 URL: https://issues.apache.org/jira/browse/HIVE-9774 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Chinna Rao Lalam Oozie would like to use beeline to capture the yarn application id of apps so that if a workflow is cancelled, the job can be cancelled too. When running under MR we print the job id, but under Spark we do not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9516) Enable CBO related tests [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-9516: --- Attachment: HIVE-9516.2-spark.patch Enable CBO related tests [Spark Branch] --- Key: HIVE-9516 URL: https://issues.apache.org/jira/browse/HIVE-9516 Project: Hive Issue Type: Sub-task Components: spark-branch Affects Versions: spark-branch Reporter: Chao Assignee: Chinna Rao Lalam Attachments: HIVE-9516.1-spark.patch, HIVE-9516.2-spark.patch In the Spark branch we enabled CBO but haven't turned on CBO-related unit tests. We should do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9781) Utilize spark.kryo.classesToRegister [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-9781: -- Attachment: HIVE-9781.2.patch Utilize spark.kryo.classesToRegister [Spark Branch] --- Key: HIVE-9781 URL: https://issues.apache.org/jira/browse/HIVE-9781 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Jimmy Xiang Attachments: HIVE-9781.1.patch, HIVE-9781.2.patch I noticed in several thread dumps that it appears Kryo is serializing the class names associated with our keys and values. Kryo supports pre-registering classes so that you don't have to serialize the class name, and Spark supports this via the {{spark.kryo.registrator}} property. We should do this so we don't have to serialize class names. {noformat} Thread 12154: (state = BLOCKED) - java.lang.Object.hashCode() @bci=0 (Compiled frame; information may be imprecise) - com.esotericsoftware.kryo.util.ObjectMap.get(java.lang.Object) @bci=1, line=265 (Compiled frame) - com.esotericsoftware.kryo.util.DefaultClassResolver.getRegistration(java.lang.Class) @bci=18, line=61 (Compiled frame) - com.esotericsoftware.kryo.Kryo.getRegistration(java.lang.Class) @bci=20, line=429 (Compiled frame) - com.esotericsoftware.kryo.util.DefaultClassResolver.readName(com.esotericsoftware.kryo.io.Input) @bci=242, line=148 (Compiled frame) - com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(com.esotericsoftware.kryo.io.Input) @bci=65, line=115 (Compiled frame) - com.esotericsoftware.kryo.Kryo.readClass(com.esotericsoftware.kryo.io.Input) @bci=20, line=610 (Compiled frame) - com.esotericsoftware.kryo.Kryo.readClassAndObject(com.esotericsoftware.kryo.io.Input) @bci=21, line=721 (Compiled frame) - com.twitter.chill.Tuple2Serializer.read(com.esotericsoftware.kryo.Kryo, com.esotericsoftware.kryo.io.Input, java.lang.Class) @bci=6, line=41 (Compiled frame) - com.twitter.chill.Tuple2Serializer.read(com.esotericsoftware.kryo.Kryo, 
com.esotericsoftware.kryo.io.Input, java.lang.Class) @bci=4, line=33 (Compiled frame) - com.esotericsoftware.kryo.Kryo.readClassAndObject(com.esotericsoftware.kryo.io.Input) @bci=126, line=729 (Compiled frame) - org.apache.spark.serializer.KryoDeserializationStream.readObject(scala.reflect.ClassTag) @bci=8, line=142 (Compiled frame) - org.apache.spark.serializer.DeserializationStream$$anon$1.getNext() @bci=10, line=133 (Compiled frame) - org.apache.spark.util.NextIterator.hasNext() @bci=16, line=71 (Compiled frame) - org.apache.spark.util.CompletionIterator.hasNext() @bci=4, line=32 (Compiled frame) - scala.collection.Iterator$$anon$13.hasNext() @bci=4, line=371 (Compiled frame) - org.apache.spark.util.CompletionIterator.hasNext() @bci=4, line=32 (Compiled frame) - org.apache.spark.InterruptibleIterator.hasNext() @bci=22, line=39 (Compiled frame) - scala.collection.Iterator$$anon$11.hasNext() @bci=4, line=327 (Compiled frame) - org.apache.spark.util.collection.ExternalSorter.insertAll(scala.collection.Iterator) @bci=191, line=217 (Compiled frame) - org.apache.spark.shuffle.hash.HashShuffleReader.read() @bci=278, line=61 (Interpreted frame) - org.apache.spark.rdd.ShuffledRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=46, line=92 (Interpreted frame) - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame) - org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame) - org.apache.spark.rdd.MapPartitionsRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=24, line=35 (Interpreted frame) - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame) - org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame) - 
org.apache.spark.rdd.MapPartitionsRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=24, line=35 (Interpreted frame) - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame) - org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame) - org.apache.spark.rdd.UnionRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=22, line=87 (Interpreted frame) - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition,
{noformat}
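For context on the fix the ticket title refers to: Spark exposes class pre-registration either through a custom `spark.kryo.registrator` class or, in recent Spark versions, through the plain `spark.kryo.classesToRegister` configuration property, a comma-separated list of class names. A hedged illustration of the property form (the class names below are examples, not the actual list the HIVE-9781 patch registers):

```properties
# Pre-register frequently serialized key/value classes so Kryo writes a small
# numeric ID instead of the fully-qualified class name for each record.
spark.kryo.classesToRegister=org.apache.hadoop.io.BytesWritable,org.apache.hadoop.hive.ql.io.HiveKey
```

Registering the hot key/value classes is what removes the per-record `DefaultClassResolver.readName` work visible in the thread dump above.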
[jira] [Commented] (HIVE-8626) Extend HDFS super-user checks to dropPartitions
[ https://issues.apache.org/jira/browse/HIVE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338798#comment-14338798 ] Thejas M Nair commented on HIVE-8626: - [~mithun] Can you please rebase this patch ? Extend HDFS super-user checks to dropPartitions --- Key: HIVE-8626 URL: https://issues.apache.org/jira/browse/HIVE-8626 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.12.0, 0.13.1 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-8626.1.patch HIVE-6392 takes care of allowing HDFS super-user accounts to register partitions in tables whose HDFS paths don't explicitly grant write-permissions to the super-user. However, the dropPartitions()/dropTable()/dropDatabase() use-cases don't handle this at all. i.e. An HDFS super-user ({{kal...@dev.grid.myth.net}}) can't drop the very partitions that were added to a table-directory owned by the user ({{mithunr}}). The following error is the result: {quote} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table metadata not deleted since hdfs://mythcluster-nn1.grid.myth.net:8020/user/mithunr/myth.db/myth_table is not writable by kal...@dev.grid.myth.net) {quote} This is the result of redundant checks in {{HiveMetaStore::dropPartitionsAndGetLocations()}}: {code:title=HiveMetaStore.java|borderStyle=solid}
if (!wh.isWritable(partPath.getParent())) {
  throw new MetaException("Table metadata not deleted since the partition " +
      Warehouse.makePartName(partitionKeys, part.getValues()) +
      " has parent location " + partPath.getParent() +
      " which is not writable " + "by " + hiveConf.getUser());
}
{code} This check is already made in StorageBasedAuthorizationProvider. If the argument is that the SBAP isn't guaranteed to be in play, then this shouldn't be checked in HMS either. 
If HDFS permissions need to be checked in addition to, say, ACLs, then perhaps a recursively-composed auth-provider ought to be used. For the moment, I'll get {{Warehouse.isWritable()}} to handle HDFS super-users. But I think {{isWritable()}} checks oughtn't to be in HiveMetaStore. (Perhaps fix this in another JIRA?) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
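The shape of the proposed fix (making the writability check aware of HDFS super-users) can be sketched abstractly: short-circuit before consulting ordinary permission bits when the caller is the super-user or in the super-group. This is only an illustration of the idea; the class, constructor, and method names here are hypothetical, and the real change lands in {{Warehouse.isWritable()}} against the actual Hadoop FileSystem APIs.

```java
import java.util.Set;

public class WritableCheck {
    private final String superUser;
    private final String superGroup;

    public WritableCheck(String superUser, String superGroup) {
        this.superUser = superUser;
        this.superGroup = superGroup;
    }

    // `permissionBitsAllow` stands in for the ordinary rwx evaluation that the
    // existing code performs against the path's owner/group/other bits.
    public boolean isWritable(String user, Set<String> groups, boolean permissionBitsAllow) {
        if (user.equals(superUser) || groups.contains(superGroup)) {
            return true; // HDFS super-users bypass permission checks entirely
        }
        return permissionBitsAllow;
    }
}
```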
[jira] [Resolved] (HIVE-9796) CBO (Calcite Return Path): Add field nullable check to HiveJoinAddNotNullRule [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-9796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-9796. Resolution: Fixed Committed to branch. Thanks, Jesus! CBO (Calcite Return Path): Add field nullable check to HiveJoinAddNotNullRule [CBO branch] -- Key: HIVE-9796 URL: https://issues.apache.org/jira/browse/HIVE-9796 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-9796.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9743) incorrect result set for left outer join when executed with tez versus mapreduce
[ https://issues.apache.org/jira/browse/HIVE-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338839#comment-14338839 ] Hive QA commented on HIVE-9743: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701098/HIVE-9743.03.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7569 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_mapjoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_left_outer_join2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_mapjoin org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_mapjoin {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2883/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2883/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2883/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12701098 - PreCommit-HIVE-TRUNK-Build incorrect result set for left outer join when executed with tez versus mapreduce Key: HIVE-9743 URL: https://issues.apache.org/jira/browse/HIVE-9743 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.14.0 Reporter: N Campbell Assignee: Matt McCline Attachments: HIVE-9743.01.patch, HIVE-9743.02.patch, HIVE-9743.03.patch This query is supposed to return 3 rows and will when run without Tez, but returns 2 rows when run with Tez.
select tjoin1.rnum, tjoin1.c1, tjoin1.c2, tjoin2.c2 as c2j2
from tjoin1 left outer join tjoin2 on ( tjoin1.c1 = tjoin2.c1 and tjoin1.c2 15 )
tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2
1 20 25 null
2 null 50 null
instead of
tjoin1.rnum tjoin1.c1 tjoin1.c2 c2j2
0 10 15 null
1 20 25 null
2 null 50 null
create table if not exists TJOIN1 (RNUM int , C1 int, C2 int) STORED AS orc ;
0|10|15
1|20|25
2|\N|50
create table if not exists TJOIN2 (RNUM int , C1 int, C2 char(2)) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS TEXTFILE ;
0|10|BB
1|15|DD
2|\N|EE
3|10|FF
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-8645) RecordReaderImpl#TimestampTreeReader#nextVector() should call TreeReader#nextVector()
[ https://issues.apache.org/jira/browse/HIVE-8645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HIVE-8645. -- Resolution: Later RecordReaderImpl#TimestampTreeReader#nextVector() should call TreeReader#nextVector() - Key: HIVE-8645 URL: https://issues.apache.org/jira/browse/HIVE-8645 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor {code} Object nextVector(Object previousVector, long batchSize) throws IOException { LongColumnVector result = null; {code} The call to TreeReader.nextVector() is missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9796) CBO (Calcite Return Path): Add field nullable check to HiveJoinAddNotNullRule [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-9796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338804#comment-14338804 ] Ashutosh Chauhan commented on HIVE-9796: It's OK to have this check, but in Hive all columns are nullable and Hive declares so to Calcite. CBO (Calcite Return Path): Add field nullable check to HiveJoinAddNotNullRule [CBO branch] -- Key: HIVE-9796 URL: https://issues.apache.org/jira/browse/HIVE-9796 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-9796.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar
[ https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-6617: -- Attachment: HIVE-6617.20.patch The patch with the new set of keywords passed. I have now attached a patch with a configuration option to see whether we can go back to the original set of keywords. Reduce ambiguity in grammar --- Key: HIVE-6617 URL: https://issues.apache.org/jira/browse/HIVE-6617 Project: Hive Issue Type: Task Reporter: Ashutosh Chauhan Assignee: Pengcheng Xiong Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch, HIVE-6617.16.patch, HIVE-6617.17.patch, HIVE-6617.18.patch, HIVE-6617.19.patch, HIVE-6617.20.patch CLEAR LIBRARY CACHE As of today, ANTLR reports 214 warnings. We need to bring this number down, ideally to 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9509) Restore partition spec validation removed by HIVE-9445
[ https://issues.apache.org/jira/browse/HIVE-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-9509: Fix Version/s: 1.0.1 Restore partition spec validation removed by HIVE-9445 -- Key: HIVE-9509 URL: https://issues.apache.org/jira/browse/HIVE-9509 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 1.2.0, 1.0.1 Attachments: HIVE-9509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9509) Restore partition spec validation removed by HIVE-9445
[ https://issues.apache.org/jira/browse/HIVE-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338877#comment-14338877 ] Thejas M Nair commented on HIVE-9509: - Patch committed to branch-1.0 as well. [~brocknoland] Would you like to include this in branch-1.1? Restore partition spec validation removed by HIVE-9445 -- Key: HIVE-9509 URL: https://issues.apache.org/jira/browse/HIVE-9509 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 1.2.0, 1.0.1 Attachments: HIVE-9509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9445) Revert HIVE-5700 - enforce single date format for partition column storage
[ https://issues.apache.org/jira/browse/HIVE-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-9445: Fix Version/s: 1.0.1 Revert HIVE-5700 - enforce single date format for partition column storage -- Key: HIVE-9445 URL: https://issues.apache.org/jira/browse/HIVE-9445 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0, 0.14.0, 0.13.1, 0.15.0, 0.14.1 Reporter: Brock Noland Assignee: Brock Noland Priority: Blocker Fix For: 1.1.0, 1.0.1 Attachments: HIVE-9445.1.patch, HIVE-9445.1.patch HIVE-5700 has the following issues: * HIVE-8730 - fails mysql upgrades * Does not upgrade all metadata, e.g. {{PARTITIONS.PART_NAME}} See comments in HIVE-5700. * Completely corrupts postgres, see below. With a postgres metastore on 0.12, I executed the following: {noformat} CREATE TABLE HIVE5700_DATE_PARTED (line string) PARTITIONED BY (ddate date); CREATE TABLE HIVE5700_STRING_PARTED (line string) PARTITIONED BY (ddate string); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='NOT_DATE'); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='20150121'); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='20150122'); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='2015-01-23'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='NOT_DATE'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='20150121'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='20150122'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='2015-01-23'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='NOT_DATE'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='20150121'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='20150122'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='2015-01-23'); LOAD 
DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='NOT_DATE'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='20150121'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='20150122'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='2015-01-23'); hive> show partitions HIVE5700_DATE_PARTED; OK ddate=20150121 ddate=20150122 ddate=2015-01-23 ddate=NOT_DATE Time taken: 0.052 seconds, Fetched: 4 row(s) hive> show partitions HIVE5700_STRING_PARTED; OK ddate=20150121 ddate=20150122 ddate=2015-01-23 ddate=NOT_DATE Time taken: 0.051 seconds, Fetched: 4 row(s) {noformat} I then took a dump of the database named {{postgres-pre-upgrade.sql}} and the data in the dump looks good: {noformat} [root@hive5700-1-1 ~]# egrep -A9 '^COPY PARTITIONS|^COPY PARTITION_KEY_VALS' postgres-pre-upgrade.sql COPY PARTITIONS (PART_ID, CREATE_TIME, LAST_ACCESS_TIME, PART_NAME, SD_ID, TBL_ID) FROM stdin; 3 1421943647 0 ddate=NOT_DATE 6 2 4 1421943647 0 ddate=20150121 7 2 5 1421943648 0 ddate=20150122 8 2 6 1421943664 0 ddate=NOT_DATE 9 3 7 1421943664 0 ddate=20150121 10 3 8 1421943665 0 ddate=20150122 11 3 9 1421943694 0 ddate=2015-01-23 12 2 10 1421943695 0 ddate=2015-01-23 13 3 \. -- COPY PARTITION_KEY_VALS (PART_ID, PART_KEY_VAL, INTEGER_IDX) FROM stdin; 3 NOT_DATE 0 4 20150121 0 5 20150122 0 6 NOT_DATE 0 7 20150121 0 8 20150122 0 9 2015-01-23 0 10 2015-01-23 0 \. {noformat} I then upgraded to 0.13 and subsequently upgraded the MS with the following command: {{schematool -dbType postgres -upgradeSchema -verbose}} The file {{postgres-post-upgrade.sql}} is the post-upgrade db dump. As you can see, the data is completely corrupt. 
{noformat} [root@hive5700-1-1 ~]# egrep -A9 '^COPY PARTITIONS|^COPY PARTITION_KEY_VALS' postgres-post-upgrade.sql COPY PARTITIONS (PART_ID, CREATE_TIME, LAST_ACCESS_TIME, PART_NAME, SD_ID, TBL_ID) FROM stdin; 3 1421943647 0 ddate=NOT_DATE 6 2 4 1421943647 0 ddate=20150121 7 2 5 1421943648 0 ddate=20150122 8 2 6 1421943664 0 ddate=NOT_DATE 9 3 7 1421943664 0 ddate=20150121 10 3 8 1421943665 0 ddate=20150122 11
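The underlying hazard — the same calendar day stored under two different string encodings, so that partition values neither compare equal nor sort together unless every row is normalized consistently — can be illustrated with a minimal sketch in plain Python (not metastore code; the normalize helper is hypothetical):

```python
from datetime import datetime

# Two encodings of the same day, as seen in the partition values above.
compact, dashed = "20150121", "2015-01-21"

# As raw strings they neither compare equal nor sort adjacently:
# '-' sorts before '0', so all dashed dates sort before all compact ones.
assert compact != dashed
assert sorted([compact, "2015-01-23", dashed]) == [dashed, "2015-01-23", compact]

def normalize(value: str) -> str:
    """Hypothetical normalizer: map both encodings to one canonical form.
    Normalizing only some rows (or corrupting values mid-upgrade, as the
    postgres dump shows) leaves the metastore internally inconsistent."""
    for fmt in ("%Y%m%d", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return value  # non-date values like NOT_DATE pass through unchanged

assert normalize(compact) == normalize(dashed) == "2015-01-21"
assert normalize("NOT_DATE") == "NOT_DATE"
```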
[jira] [Commented] (HIVE-9432) CBO (Calcite Return Path): Removing QB from ParseContext
[ https://issues.apache.org/jira/browse/HIVE-9432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338974#comment-14338974 ] Ashutosh Chauhan commented on HIVE-9432: +1 CBO (Calcite Return Path): Removing QB from ParseContext Key: HIVE-9432 URL: https://issues.apache.org/jira/browse/HIVE-9432 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-9432.01.patch, HIVE-9432.02.patch, HIVE-9432.03.patch, HIVE-9432.04.patch, HIVE-9432.05.patch, HIVE-9432.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9731) WebHCat MapReduce Streaming Job does not allow StreamXmlRecordReader to be specified
[ https://issues.apache.org/jira/browse/HIVE-9731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9731: - Labels: TODOC1.2 (was: ) WebHCat MapReduce Streaming Job does not allow StreamXmlRecordReader to be specified Key: HIVE-9731 URL: https://issues.apache.org/jira/browse/HIVE-9731 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9731.1.patch, HIVE-9731.2.patch, HIVE-9731.3.patch Hadoop Streaming allows the -inputreader parameter to specify use of StreamXmlRecordReader (example): hadoop jar hadoop-streaming-2.5.1.jar \ -inputreader StreamXmlRecord,begin=BEGIN_STRING,end=END_STRING \ (rest of the command) WebHCat's StreamingDelegator does not include -inputreader as a valid option when submitting jobs to the http://www.myserver.com/templeton/v1/mapreduce/streaming endpoint. If -inputreader is specified and passed to the templeton server (perhaps via a curl request), it will get truncated and not passed as a parameter from TempletonControllerJob to Hadoop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9731) WebHCat MapReduce Streaming Job does not allow StreamXmlRecordReader to be specified
[ https://issues.apache.org/jira/browse/HIVE-9731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339085#comment-14339085 ] Lefty Leverenz commented on HIVE-9731: -- Doc note: Document the -inputreader parameter in the WebHCat mapreduce/streaming wikidoc for release 1.2.0. * [WebHCat Reference -- MapReduce Streaming -- Parameters | https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+MapReduceStream#WebHCatReferenceMapReduceStream-Parameters] * [(optional) MapReduce Streaming -- Examples | https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+MapReduceStream#WebHCatReferenceMapReduceStream-Example] WebHCat MapReduce Streaming Job does not allow StreamXmlRecordReader to be specified Key: HIVE-9731 URL: https://issues.apache.org/jira/browse/HIVE-9731 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9731.1.patch, HIVE-9731.2.patch, HIVE-9731.3.patch Hadoop Streaming allows the -inputreader parameter to specify use of StreamXmlRecordReader (example): hadoop jar hadoop-streaming-2.5.1.jar \ -inputreader StreamXmlRecord,begin=BEGIN_STRING,end=END_STRING \ (rest of the command) WebHCat's StreamingDelegator does not include -inputreader as a valid option when submitting jobs to the http://www.myserver.com/templeton/v1/mapreduce/streaming endpoint. If -inputreader is specified and passed to the templeton server (perhaps via a curl request), it will get truncated and not passed as a parameter from TempletonControllerJob to Hadoop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9797) Need update some spark tests for java 8
[ https://issues.apache.org/jira/browse/HIVE-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338989#comment-14338989 ] Hive QA commented on HIVE-9797: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701112/HIVE-9797.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7568 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_cast_constant {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2884/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2884/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2884/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12701112 - PreCommit-HIVE-TRUNK-Build Need update some spark tests for java 8 --- Key: HIVE-9797 URL: https://issues.apache.org/jira/browse/HIVE-9797 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9797.1.patch The following tests fail on a java 8 environment: TestMiniSparkOnYarnCliDriver.list_bucket_dml_10 TestSparkCliDriver.outer_join_ppr TestSparkCliDriver.vector_cast_constant -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9445) Revert HIVE-5700 - enforce single date format for partition column storage
[ https://issues.apache.org/jira/browse/HIVE-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338891#comment-14338891 ] Thejas M Nair commented on HIVE-9445: - Merged this and HIVE-9509 branch-1.0 as well. Revert HIVE-5700 - enforce single date format for partition column storage -- Key: HIVE-9445 URL: https://issues.apache.org/jira/browse/HIVE-9445 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0, 0.14.0, 0.13.1, 0.15.0, 0.14.1 Reporter: Brock Noland Assignee: Brock Noland Priority: Blocker Fix For: 1.1.0, 1.0.1 Attachments: HIVE-9445.1.patch, HIVE-9445.1.patch HIVE-5700 has the following issues: * HIVE-8730 - fails mysql upgrades * Does not upgrade all metadata, e.g. {{PARTITIONS.PART_NAME}} See comments in HIVE-5700. * Completely corrupts postgres, see below. With a postgres metastore on 0.12, I executed the following: {noformat} CREATE TABLE HIVE5700_DATE_PARTED (line string) PARTITIONED BY (ddate date); CREATE TABLE HIVE5700_STRING_PARTED (line string) PARTITIONED BY (ddate string); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='NOT_DATE'); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='20150121'); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='20150122'); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='2015-01-23'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='NOT_DATE'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='20150121'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='20150122'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='2015-01-23'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='NOT_DATE'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='20150121'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='20150122'); LOAD DATA LOCAL INPATH 
'/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='2015-01-23'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='NOT_DATE'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='20150121'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='20150122'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='2015-01-23'); hive> show partitions HIVE5700_DATE_PARTED; OK ddate=20150121 ddate=20150122 ddate=2015-01-23 ddate=NOT_DATE Time taken: 0.052 seconds, Fetched: 4 row(s) hive> show partitions HIVE5700_STRING_PARTED; OK ddate=20150121 ddate=20150122 ddate=2015-01-23 ddate=NOT_DATE Time taken: 0.051 seconds, Fetched: 4 row(s) {noformat} I then took a dump of the database named {{postgres-pre-upgrade.sql}} and the data in the dump looks good: {noformat} [root@hive5700-1-1 ~]# egrep -A9 '^COPY PARTITIONS|^COPY PARTITION_KEY_VALS' postgres-pre-upgrade.sql COPY PARTITIONS (PART_ID, CREATE_TIME, LAST_ACCESS_TIME, PART_NAME, SD_ID, TBL_ID) FROM stdin; 3 1421943647 0 ddate=NOT_DATE 6 2 4 1421943647 0 ddate=20150121 7 2 5 1421943648 0 ddate=20150122 8 2 6 1421943664 0 ddate=NOT_DATE 9 3 7 1421943664 0 ddate=20150121 10 3 8 1421943665 0 ddate=20150122 11 3 9 1421943694 0 ddate=2015-01-23 12 2 10 1421943695 0 ddate=2015-01-23 13 3 \. -- COPY PARTITION_KEY_VALS (PART_ID, PART_KEY_VAL, INTEGER_IDX) FROM stdin; 3 NOT_DATE 0 4 20150121 0 5 20150122 0 6 NOT_DATE 0 7 20150121 0 8 20150122 0 9 2015-01-23 0 10 2015-01-23 0 \. {noformat} I then upgraded to 0.13 and subsequently upgraded the MS with the following command: {{schematool -dbType postgres -upgradeSchema -verbose}} The file {{postgres-post-upgrade.sql}} is the post-upgrade db dump. As you can see, the data is completely corrupt. 
{noformat} [root@hive5700-1-1 ~]# egrep -A9 '^COPY PARTITIONS|^COPY PARTITION_KEY_VALS' postgres-post-upgrade.sql COPY PARTITIONS (PART_ID, CREATE_TIME, LAST_ACCESS_TIME, PART_NAME, SD_ID, TBL_ID) FROM stdin; 3 1421943647 0 ddate=NOT_DATE 6 2 4 1421943647 0 ddate=20150121 7 2 5 1421943648 0 ddate=20150122 8 2 6 1421943664 0 ddate=NOT_DATE 9 3 7 1421943664 0
[jira] [Updated] (HIVE-9799) LLAP: config not passed to daemon init
[ https://issues.apache.org/jira/browse/HIVE-9799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9799: --- Fix Version/s: llap LLAP: config not passed to daemon init -- Key: HIVE-9799 URL: https://issues.apache.org/jira/browse/HIVE-9799 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Trivial Fix For: llap -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9799) LLAP: config not passed to daemon init
[ https://issues.apache.org/jira/browse/HIVE-9799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-9799. Resolution: Fixed LLAP: config not passed to daemon init -- Key: HIVE-9799 URL: https://issues.apache.org/jira/browse/HIVE-9799 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Trivial Fix For: llap Attachments: HIVE-9799.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9741) Refactor MetaStoreDirectSql constructor by removing DB queries out of critical section
[ https://issues.apache.org/jira/browse/HIVE-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HIVE-9741: Attachment: HIVE-9741.7.patch V7 by adding doDbSpecificInitializationsBeforeQuery in runTestQuery Refactor MetaStoreDirectSql constructor by removing DB queries out of critical section -- Key: HIVE-9741 URL: https://issues.apache.org/jira/browse/HIVE-9741 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.0.0 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Attachments: HIVE-9741.1.patch, HIVE-9741.2.patch, HIVE-9741.3.patch, HIVE-9741.4.patch, HIVE-9741.5.patch, HIVE-9741.6.patch, HIVE-9741.7.patch The MetaStoreDirectSql constructor queries the DB to determine dbType, which leads to too many DB queries and makes the metastore slow, since ObjectStore.setConf may be called frequently. Moreover, ObjectStore.setConf begins/ends with lock acquire/release; if the underlying DB hangs somehow, the lock is never released and all subsequent incoming requests are blocked. Two points: 1. Use the JDBC driver's getProductName to get the dbType info. 2. Since metastore auto-creation is disabled by default, it is better to bypass ensureDbInit() and runTestQuery() in order to avoid DB queries within the critical section of setConf. Here's the stack trace: MetaStoreDirectSql.determineDbType(...) MetaStoreDirectSql.MetaStoreDirectSql(...) ObjectStore.initialize(...) ObjectStore.setConf(…) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9741) Refactor MetaStoreDirectSql constructor by removing DB queries out of critical section
[ https://issues.apache.org/jira/browse/HIVE-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339146#comment-14339146 ] Sergey Shelukhin commented on HIVE-9741: +1 Refactor MetaStoreDirectSql constructor by removing DB queries out of critical section -- Key: HIVE-9741 URL: https://issues.apache.org/jira/browse/HIVE-9741 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.0.0 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Attachments: HIVE-9741.1.patch, HIVE-9741.2.patch, HIVE-9741.3.patch, HIVE-9741.4.patch, HIVE-9741.5.patch, HIVE-9741.6.patch, HIVE-9741.7.patch The MetaStoreDirectSql constructor queries the DB to determine dbType, which leads to too many DB queries and makes the metastore slow, since ObjectStore.setConf may be called frequently. Moreover, ObjectStore.setConf begins/ends with lock acquire/release; if the underlying DB hangs somehow, the lock is never released and all subsequent incoming requests are blocked. Two points: 1. Use the JDBC driver's getProductName to get the dbType info. 2. Since metastore auto-creation is disabled by default, it is better to bypass ensureDbInit() and runTestQuery() in order to avoid DB queries within the critical section of setConf. Here's the stack trace: MetaStoreDirectSql.determineDbType(...) MetaStoreDirectSql.MetaStoreDirectSql(...) ObjectStore.initialize(...) ObjectStore.setConf(…) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
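The shape of the fix — do no blocking DB I/O while holding setConf's lock; probe cheap driver metadata outside the critical section and only publish the result under the lock — can be sketched generically (hypothetical Python; not the actual ObjectStore/MetaStoreDirectSql code):

```python
import threading

class Store:
    """Sketch of the refactor: expensive probing happens outside the lock,
    so a hung backend cannot block every subsequent setConf caller."""

    def __init__(self):
        self._lock = threading.Lock()
        self._db_type = None

    def _probe_db_type(self):
        # Stands in for reading the JDBC driver's product name -- cheap
        # metadata, replacing the old "run a test query under the lock" path.
        return "postgres"

    def set_conf(self):
        db_type = self._probe_db_type()   # outside the critical section
        with self._lock:                  # critical section: publish only
            self._db_type = db_type

store = Store()
store.set_conf()
print(store._db_type)  # postgres
```

If the probe hangs, only the calling thread stalls; the lock stays available to everyone else, which is the failure mode the issue describes.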
[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339177#comment-14339177 ] Xiaobing Zhou commented on HIVE-9582: - [~thiruvel] When will this patch be committed to trunk? HIVE-9642 is expecting this patch to work properly. Thanks. HCatalog should use IMetaStoreClient interface -- Key: HIVE-9582 URL: https://issues.apache.org/jira/browse/HIVE-9582 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.14.0, 0.13.1 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Labels: hcatalog, metastore, rolling_upgrade Fix For: 0.14.1 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, HIVE-9583.1.patch Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy. Hence, during a failure the client retries and possibly succeeds. But HCatalog has long been using HiveMetaStoreClient directly, and hence failures are costly, especially if they occur during the commit stage of a job. It's also not possible to do a rolling upgrade of the MetaStore Server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9642) Hive metastore client retries don't happen consistently for all api calls
[ https://issues.apache.org/jira/browse/HIVE-9642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339176#comment-14339176 ] Xiaobing Zhou commented on HIVE-9642: - Thanks [~thejas]. Let's wait for HIVE-9582 to be committed so that I can base a patch on it; otherwise we duplicate effort. Hive metastore client retries don't happen consistently for all api calls - Key: HIVE-9642 URL: https://issues.apache.org/jira/browse/HIVE-9642 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Attachments: HIVE-9642.1.patch, HIVE-9642.2.patch When org.apache.thrift.transport.TTransportException is thrown for issues like socket timeout, the retry via RetryingMetaStoreClient happens only in certain cases. Retry happens for the getDatabase call but not for getAllDatabases(). The reason is that RetryingMetaStoreClient checks for TTransportException being the cause of InvocationTargetException. But in the case of some calls, such as getAllDatabases in HiveMetastoreClient, all exceptions get wrapped in a MetaException. We should remove this unnecessary wrapping of exceptions for certain functions in HMC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
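The retry behavior being asked for — inspect the exception cause chain rather than only the immediate exception, so a transport failure buried inside a MetaException still triggers a retry — can be sketched as follows (hypothetical Python stand-ins, not RetryingMetaStoreClient itself):

```python
class TransportError(Exception):  # stands in for TTransportException
    pass

class MetaError(Exception):       # stands in for MetaException
    pass

def call_with_retry(fn, retries=3):
    """Retry only when a transport failure appears anywhere in the cause
    chain -- wrapping it in MetaError must not hide it (sketch of the
    behavior HIVE-9642 asks for)."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception as e:
            cause = e
            while cause is not None and not isinstance(cause, TransportError):
                cause = cause.__cause__
            if cause is None or attempt == retries - 1:
                raise  # not transport-related, or out of retries

calls = {"n": 0}
def flaky_get_all_databases():
    calls["n"] += 1
    if calls["n"] < 3:
        # getAllDatabases-style wrapping buries the transport cause.
        raise MetaError("wrapped") from TransportError("socket timeout")
    return ["default"]

print(call_with_retry(flaky_get_all_databases))  # ['default']
```

The same wrapper fails fast on a genuine MetaError with no transport cause, so semantic errors are not retried blindly.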
[jira] [Updated] (HIVE-9799) LLAP: config not passed to daemon init
[ https://issues.apache.org/jira/browse/HIVE-9799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9799: --- Attachment: HIVE-9799.patch [~hagleitn] [~sseth] fyi LLAP: config not passed to daemon init -- Key: HIVE-9799 URL: https://issues.apache.org/jira/browse/HIVE-9799 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Trivial Fix For: llap Attachments: HIVE-9799.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9797) Need update some spark tests for java 8
[ https://issues.apache.org/jira/browse/HIVE-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9797: -- Attachment: HIVE-9797.2.patch Need update some spark tests for java 8 --- Key: HIVE-9797 URL: https://issues.apache.org/jira/browse/HIVE-9797 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9797.1.patch, HIVE-9797.2.patch The following tests fail on a java 8 environment: TestMiniSparkOnYarnCliDriver.list_bucket_dml_10 TestSparkCliDriver.outer_join_ppr TestSparkCliDriver.vector_cast_constant -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9797) Need update some spark tests for java 8
[ https://issues.apache.org/jira/browse/HIVE-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339178#comment-14339178 ] Brock Noland commented on HIVE-9797: +1 Need update some spark tests for java 8 --- Key: HIVE-9797 URL: https://issues.apache.org/jira/browse/HIVE-9797 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9797.1.patch, HIVE-9797.2.patch The following tests fail on a java 8 environment: TestMiniSparkOnYarnCliDriver.list_bucket_dml_10 TestSparkCliDriver.outer_join_ppr TestSparkCliDriver.vector_cast_constant -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9744) Move common arguments validation and value extraction code to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9744: -- Attachment: HIVE-9744.2.patch patch #2 Move common arguments validation and value extraction code to GenericUDF Key: HIVE-9744 URL: https://issues.apache.org/jira/browse/HIVE-9744 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9744.1.patch, HIVE-9744.2.patch most of the UDFs - check if arguments are primitive / complex - check if arguments are particular type or type_group - get converters to read values - check if argument is constant - extract arguments values Probably we should move these common methods to GenericUDF -- This message was sent by Atlassian JIRA (v6.3.4#6332)
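The refactoring idea — hoist the per-UDF argument-count and argument-type boilerplate into shared helpers on the base class — can be sketched like this (hypothetical Python; not the actual Hive GenericUDF API):

```python
class GenericUDF:
    """Sketch of the proposal: shared argument checks live on the base
    class so each UDF's initialize() shrinks to a few helper calls."""

    def check_args_size(self, args, expected):
        if len(args) != expected:
            raise ValueError(
                f"{type(self).__name__} requires {expected} argument(s), "
                f"got {len(args)}")

    def check_arg_type(self, args, i, expected_type):
        if not isinstance(args[i], expected_type):
            raise TypeError(
                f"argument {i + 1} of {type(self).__name__} must be "
                f"{expected_type.__name__}")

class UDFSoundexLike(GenericUDF):  # hypothetical single-string-arg UDF
    def initialize(self, args):
        self.check_args_size(args, 1)      # common check, not per-UDF code
        self.check_arg_type(args, 0, str)  # ditto for the type check

udf = UDFSoundexLike()
udf.initialize(["smith"])  # passes both shared checks
```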
[jira] [Updated] (HIVE-9781) Utilize spark.kryo.classesToRegister [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-9781: -- Attachment: HIVE-9781.3.patch Utilize spark.kryo.classesToRegister [Spark Branch] --- Key: HIVE-9781 URL: https://issues.apache.org/jira/browse/HIVE-9781 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Jimmy Xiang Attachments: HIVE-9781.1.patch, HIVE-9781.2.patch, HIVE-9781.3.patch I noticed in several thread dumps that it appears Kryo is serializing the class names associated with our keys and values. Kryo supports pre-registering classes so that you don't have to serialize the class name, and Spark supports this via the {{spark.kryo.registrator}} property. We should do this so we don't have to serialize class names. {noformat} Thread 12154: (state = BLOCKED) - java.lang.Object.hashCode() @bci=0 (Compiled frame; information may be imprecise) - com.esotericsoftware.kryo.util.ObjectMap.get(java.lang.Object) @bci=1, line=265 (Compiled frame) - com.esotericsoftware.kryo.util.DefaultClassResolver.getRegistration(java.lang.Class) @bci=18, line=61 (Compiled frame) - com.esotericsoftware.kryo.Kryo.getRegistration(java.lang.Class) @bci=20, line=429 (Compiled frame) - com.esotericsoftware.kryo.util.DefaultClassResolver.readName(com.esotericsoftware.kryo.io.Input) @bci=242, line=148 (Compiled frame) - com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(com.esotericsoftware.kryo.io.Input) @bci=65, line=115 (Compiled frame) - com.esotericsoftware.kryo.Kryo.readClass(com.esotericsoftware.kryo.io.Input) @bci=20, line=610 (Compiled frame) - com.esotericsoftware.kryo.Kryo.readClassAndObject(com.esotericsoftware.kryo.io.Input) @bci=21, line=721 (Compiled frame) - com.twitter.chill.Tuple2Serializer.read(com.esotericsoftware.kryo.Kryo, com.esotericsoftware.kryo.io.Input, java.lang.Class) @bci=6, line=41 (Compiled frame) - 
com.twitter.chill.Tuple2Serializer.read(com.esotericsoftware.kryo.Kryo, com.esotericsoftware.kryo.io.Input, java.lang.Class) @bci=4, line=33 (Compiled frame) - com.esotericsoftware.kryo.Kryo.readClassAndObject(com.esotericsoftware.kryo.io.Input) @bci=126, line=729 (Compiled frame) - org.apache.spark.serializer.KryoDeserializationStream.readObject(scala.reflect.ClassTag) @bci=8, line=142 (Compiled frame) - org.apache.spark.serializer.DeserializationStream$$anon$1.getNext() @bci=10, line=133 (Compiled frame) - org.apache.spark.util.NextIterator.hasNext() @bci=16, line=71 (Compiled frame) - org.apache.spark.util.CompletionIterator.hasNext() @bci=4, line=32 (Compiled frame) - scala.collection.Iterator$$anon$13.hasNext() @bci=4, line=371 (Compiled frame) - org.apache.spark.util.CompletionIterator.hasNext() @bci=4, line=32 (Compiled frame) - org.apache.spark.InterruptibleIterator.hasNext() @bci=22, line=39 (Compiled frame) - scala.collection.Iterator$$anon$11.hasNext() @bci=4, line=327 (Compiled frame) - org.apache.spark.util.collection.ExternalSorter.insertAll(scala.collection.Iterator) @bci=191, line=217 (Compiled frame) - org.apache.spark.shuffle.hash.HashShuffleReader.read() @bci=278, line=61 (Interpreted frame) - org.apache.spark.rdd.ShuffledRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=46, line=92 (Interpreted frame) - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame) - org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame) - org.apache.spark.rdd.MapPartitionsRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=24, line=35 (Interpreted frame) - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame) - 
org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame) - org.apache.spark.rdd.MapPartitionsRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=24, line=35 (Interpreted frame) - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame) - org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame) - org.apache.spark.rdd.UnionRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=22, line=87 (Interpreted frame) - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition,
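The thread dump above shows every record read paying for class-name resolution (DefaultClassResolver.readName). Registering the shuffle key/value classes up front lets Kryo write a small integer id instead of the full class name. A minimal, illustrative spark-defaults fragment — the exact class list is an assumption and depends on the key/value types the job actually shuffles:

```
# Illustrative only: pre-register the classes Hive shuffles most often so
# Kryo writes integer ids instead of fully-qualified class name strings.
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.kryo.classesToRegister     org.apache.hadoop.hive.ql.io.HiveKey,org.apache.hadoop.io.BytesWritable
```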
[jira] [Commented] (HIVE-9642) Hive metastore client retries don't happen consistently for all api calls
[ https://issues.apache.org/jira/browse/HIVE-9642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339216#comment-14339216 ] Thejas M Nair commented on HIVE-9642: - Looks like the original change to add retries (HIVE-3400) didn't add a test case. Hive metastore client retries don't happen consistently for all api calls - Key: HIVE-9642 URL: https://issues.apache.org/jira/browse/HIVE-9642 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Attachments: HIVE-9642.1.patch, HIVE-9642.2.patch When org.apache.thrift.transport.TTransportException is thrown for issues like socket timeout, the retry via RetryingMetaStoreClient happens only in certain cases. Retry happens for the getDatabase call but not for getAllDatabases(). The reason is RetryingMetaStoreClient checks for TTransportException being the cause for InvocationTargetException. But in the case of some calls, such as getAllDatabases in HiveMetastoreClient, all exceptions get wrapped in a MetaException. We should remove this unnecessary wrapping of exceptions for certain functions in HMC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
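The inconsistency described above comes down to how deep the retry wrapper looks for the transport failure: checking only the direct cause of an InvocationTargetException misses a TTransportException that a metastore method has re-wrapped in a MetaException. A minimal, self-contained sketch of cause-chain walking — generic JDK exception types stand in for the Thrift/metastore classes, and this is not the actual Hive code:

```java
public class CauseChain {
    /** Returns true if any link in t's cause chain is an instance of target. */
    public static boolean hasCause(Throwable t, Class<? extends Throwable> target) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (target.isInstance(cur)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulate getAllDatabases(): a transport failure wrapped in a
        // checked exception (stand-in for MetaException), wrapped again by reflection.
        Exception transport = new java.net.SocketTimeoutException("read timed out");
        Exception metaWrap = new RuntimeException("MetaException stand-in", transport);
        Exception reflective = new Exception("InvocationTargetException stand-in", metaWrap);
        // A check on only the direct cause would see the wrapper and skip the retry;
        // walking the whole chain finds the retryable transport error.
        System.out.println(hasCause(reflective, java.net.SocketTimeoutException.class)); // prints: true
    }
}
```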
[jira] [Updated] (HIVE-9800) Create scripts to do metastore upgrade tests on Jenkins
[ https://issues.apache.org/jira/browse/HIVE-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9800: -- Attachment: HIVE-9800.1.patch Create scripts to do metastore upgrade tests on Jenkins --- Key: HIVE-9800 URL: https://issues.apache.org/jira/browse/HIVE-9800 Project: Hive Issue Type: Improvement Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9800.1.patch In order to have better-quality code for the Hive Metastore, we need to create some upgrade scripts that can run on Jenkins nightly or every time a patch that makes structural changes on the database is added to the ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9642) Hive metastore client retries don't happen consistently for all api calls
[ https://issues.apache.org/jira/browse/HIVE-9642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339215#comment-14339215 ] Thejas M Nair commented on HIVE-9642: - We should also add a unit test for this, i.e.:
# bring up two metastore servers
# make an api call
# bring down the first server (that's the one it connects to)
# make the api call again
# bring down the 2nd server
# bring up the first server
# make the api call again
See TestHiveMetaStorePartitionSpecs.RunMS on starting a metastore server. Hive metastore client retries don't happen consistently for all api calls - Key: HIVE-9642 URL: https://issues.apache.org/jira/browse/HIVE-9642 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Attachments: HIVE-9642.1.patch, HIVE-9642.2.patch When org.apache.thrift.transport.TTransportException is thrown for issues like socket timeout, the retry via RetryingMetaStoreClient happens only in certain cases. Retry happens for the getDatabase call but not for getAllDatabases(). The reason is RetryingMetaStoreClient checks for TTransportException being the cause for InvocationTargetException. But in the case of some calls, such as getAllDatabases in HiveMetastoreClient, all exceptions get wrapped in a MetaException. We should remove this unnecessary wrapping of exceptions for certain functions in HMC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9800) Create scripts to do metastore upgrade tests on Jenkins
[ https://issues.apache.org/jira/browse/HIVE-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9800: -- Attachment: (was: HIVE-9800.1.patch) Create scripts to do metastore upgrade tests on Jenkins --- Key: HIVE-9800 URL: https://issues.apache.org/jira/browse/HIVE-9800 Project: Hive Issue Type: Improvement Reporter: Sergio Peña Assignee: Sergio Peña In order to have better-quality code for the Hive Metastore, we need to create some upgrade scripts that can run on Jenkins nightly or every time a patch that makes structural changes on the database is added to the ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9800) Create scripts to do metastore upgrade tests on Jenkins
[ https://issues.apache.org/jira/browse/HIVE-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9800: -- Attachment: HIVE-9800.1.patch Create scripts to do metastore upgrade tests on Jenkins --- Key: HIVE-9800 URL: https://issues.apache.org/jira/browse/HIVE-9800 Project: Hive Issue Type: Improvement Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9800.1.patch In order to have better-quality code for the Hive Metastore, we need to create some upgrade scripts that can run on Jenkins nightly or every time a patch that makes structural changes on the database is added to the ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9738) create SOUNDEX udf
[ https://issues.apache.org/jira/browse/HIVE-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9738: -- Attachment: HIVE-9738.2.patch patch #2 create SOUNDEX udf -- Key: HIVE-9738 URL: https://issues.apache.org/jira/browse/HIVE-9738 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Attachments: HIVE-9738.1.patch, HIVE-9738.2.patch Soundex is an encoding used to relate similar names, but can also be used as a general-purpose scheme to find words with similar phonemes. The American Soundex System: the soundex code consists of the first letter of the name followed by three digits. These three digits are determined by dropping the letters a, e, i, o, u, h, w and y and adding three digits from the remaining letters of the name according to the table below. There are only two additional rules. (1) If two or more consecutive letters have the same code, they are coded as one letter. (2) If there is an insufficient number of letters to make the three digits, the remaining digits are set to zero.
Soundex Table:
1 - b, f, p, v
2 - c, g, j, k, q, s, x, z
3 - d, t
4 - l
5 - m, n
6 - r
Examples:
Miller M460
Peterson P362
Peters P362
Auerbach A612
Uhrbach U612
Moskowitz M232
Moskovitz M213
Implementation: http://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/language/Soundex.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
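The two rules above can be sketched directly in Java. This is a simplified implementation following only the rules as stated in the description (it treats all dropped letters uniformly; the Commons Codec Soundex linked in the issue additionally treats H and W as transparent between consonants):

```java
public class SimpleSoundex {
    // Code for each letter A..Z; vowels plus h, w, y map to '0' (dropped).
    //                                    abcdefghijklmnopqrstuvwxyz
    private static final String CODES = "01230120022455012623010202";

    public static String soundex(String name) {
        String s = name.toUpperCase();
        StringBuilder out = new StringBuilder().append(s.charAt(0));
        char prev = CODES.charAt(s.charAt(0) - 'A');
        for (int i = 1; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') continue;
            char code = CODES.charAt(c - 'A');
            // Rule 1: consecutive letters with the same code are coded once.
            if (code != '0' && code != prev) out.append(code);
            prev = code;
        }
        // Rule 2: pad with zeros when fewer than three digits were produced.
        while (out.length() < 4) out.append('0');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("Miller"));    // prints: M460
        System.out.println(soundex("Moskowitz")); // prints: M232
    }
}
```

Walking "Miller" through it: M is kept, i/e are dropped, the two l's code to a single 4, r codes to 6, and a zero pads the result to M460, matching the example table.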
[jira] [Commented] (HIVE-9800) Create scripts to do metastore upgrade tests on Jenkins
[ https://issues.apache.org/jira/browse/HIVE-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339243#comment-14339243 ] Sergio Peña commented on HIVE-9800: --- Could you please review? [~brocknoland] Create scripts to do metastore upgrade tests on Jenkins --- Key: HIVE-9800 URL: https://issues.apache.org/jira/browse/HIVE-9800 Project: Hive Issue Type: Improvement Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9800.1.patch In order to have better-quality code for the Hive Metastore, we need to create some upgrade scripts that can run on Jenkins nightly or every time a patch that makes structural changes on the database is added to the ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8119) Implement Date in ParquetSerde
[ https://issues.apache.org/jira/browse/HIVE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338087#comment-14338087 ] Hive QA commented on HIVE-8119: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12700982/HIVE-8119.2.patch {color:green}SUCCESS:{color} +1 7567 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2877/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2877/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2877/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12700982 - PreCommit-HIVE-TRUNK-Build Implement Date in ParquetSerde -- Key: HIVE-8119 URL: https://issues.apache.org/jira/browse/HIVE-8119 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Dong Chen Attachments: HIVE-8119.1.patch, HIVE-8119.2.patch, HIVE-8119.patch Date type in Parquet is discussed here: http://mail-archives.apache.org/mod_mbox/incubator-parquet-dev/201406.mbox/%3CCAKa9qDkp7xn+H8fNZC7ms3ckd=xr8gdpe7gqgj5o+pybdem...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6617) Reduce ambiguity in grammar
[ https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338173#comment-14338173 ] Hive QA commented on HIVE-6617: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12700992/HIVE-6617.19.patch {color:green}SUCCESS:{color} +1 7710 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2878/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2878/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2878/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12700992 - PreCommit-HIVE-TRUNK-Build Reduce ambiguity in grammar --- Key: HIVE-6617 URL: https://issues.apache.org/jira/browse/HIVE-6617 Project: Hive Issue Type: Task Reporter: Ashutosh Chauhan Assignee: Pengcheng Xiong Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch, HIVE-6617.16.patch, HIVE-6617.17.patch, HIVE-6617.18.patch, HIVE-6617.19.patch CLEAR LIBRARY CACHE As of today, antlr reports 214 warnings. Need to bring down this number, ideally to 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9657) Use new parquet Types API builder to construct data types
[ https://issues.apache.org/jira/browse/HIVE-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338246#comment-14338246 ] Hive QA commented on HIVE-9657: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701003/HIVE-9657.1.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7567 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2879/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2879/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2879/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12701003 - PreCommit-HIVE-TRUNK-Build Use new parquet Types API builder to construct data types - Key: HIVE-9657 URL: https://issues.apache.org/jira/browse/HIVE-9657 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Ferdinand Xu Attachments: HIVE-9657.1.patch, HIVE-9657.patch Parquet is going to remove the constructors from the public API in favor of the builder. We must use the new Types API for primitive types in: {noformat}HiveSchemaConverter.java{noformat}. This is to avoid invalid types, like an INT64 with a DATE annotation. 
An example for a DATE datatype: {noformat} Types.primitive(repetition, INT32).as(DATE).named(name); {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9543) MetaException(message:Metastore contains multiple versions)
[ https://issues.apache.org/jira/browse/HIVE-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338069#comment-14338069 ] Junyong Li commented on HIVE-9543: -- You are right, but this exception always happens. I am sure the metastore already has a version record before this exception happens, and after I delete the duplicate record in the MySQL table manually everything is OK. But the exception will happen again within a few hours. So can you give some other clue? MetaException(message:Metastore contains multiple versions) --- Key: HIVE-9543 URL: https://issues.apache.org/jira/browse/HIVE-9543 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.1 Reporter: Junyong Li When I run the bin/hive command, I get the following exception: {noformat} Logging initialized using configuration in jar:file:/home/hadoop/apache-hive-0.13.1-bin/lib/hive-common-0.13.1.jar!/hive-log4j.properties Exception in thread main java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:62) at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340) ... 7 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410) ... 12 more Caused by: MetaException(message:Metastore contains multiple versions) at org.apache.hadoop.hive.metastore.ObjectStore.getMSchemaVersion(ObjectStore.java:6368) at org.apache.hadoop.hive.metastore.ObjectStore.getMetaStoreSchemaVersion(ObjectStore.java:6330) at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:6289) at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:6277) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108) at com.sun.proxy.$Proxy9.verifySchema(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:476) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397) at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:356) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.init(RetryingHMSHandler.java:54) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59) at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.init(HiveMetaStoreClient.java:171) ... 17 more {noformat} And I have found two records in the metastore table VERSION. After reading the source code, I found the following code may cause the problem: In the
[jira] [Updated] (HIVE-9657) Use new parquet Types API builder to construct data types
[ https://issues.apache.org/jira/browse/HIVE-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-9657: --- Attachment: HIVE-9657.1.patch Use new parquet Types API builder to construct data types - Key: HIVE-9657 URL: https://issues.apache.org/jira/browse/HIVE-9657 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Ferdinand Xu Attachments: HIVE-9657.1.patch, HIVE-9657.patch Parquet is going to remove the constructors from the public API in favor of the builder. We must use the new Types API for primitive types in: {noformat}HiveSchemaConverter.java{noformat}. This is to avoid invalid types, like an INT64 with a DATE annotation. An example for a DATE datatype: {noformat} Types.primitive(repetition, INT32).as(DATE).named(name); {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9664) Hive add jar command should be able to download and add jars from a repository
[ https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338279#comment-14338279 ] Anant Nag commented on HIVE-9664: - [~appodictic], [~leftylev]: Thank you for your input. I'm making use of Grape internally to download resources from the artifactory. Following are the changes that will be made to the commands: Add resource command: Adding resources from hdfs and the local filesystem works as before. Resources can be added from the artifactory using the following syntax:
{code}
add jar ivy://org:module:version?query_string
{code}
group - Which module group the module comes from. Translates directly to a Maven groupId or an Ivy Organization. module - The name of the module to load. Translates directly to a Maven artifactId or an Ivy artifact. version - The version of the module to use. Any version, or * (for latest), or an Ivy range ('[2.2.1,)', meaning 2.2.1 or any greater version) can be used. Various parameters can be passed in the url to configure how and which jars are added from the artifactory. The parameters are in the form of key-value pairs where key is the name of the parameter and the value is the value of the parameter. These key-value pairs should be separated by &. Example:
{code}
add jar ivy://org:module:version?key=value&key1=value1&key2=value2;
{code}
The different parameters that can be passed are: 1) exclude: Takes comma-separated values of the form org:module. Sometimes you will want to exclude transitive dependencies as you might be already using a slightly different but compatible version of some artifact. exclude can be defined in the query string. It takes comma (,) separated values of the form org:module; all these dependencies won't be downloaded from the artifactory. 
Usage:
{code}
add jar ivy://org:module:version?exclude=org1:module1,org2:module2,org3:module3;
{code}
Example:
{code}
add jar ivy://org.apache.pig:pig:0.10.0?exclude=org.apache.hadoop:avro;
{code}
2) transitive: Takes values true or false. Defaults to true. When transitive = true, all the transitive dependencies are downloaded and added to the classpath. Usage:
{code}
add jar ivy://org:module:version?transitive=true&exclude=org1:module1
add jar ivy://org:module:version?transitive=false;
{code}
Example:
{code}
add jar ivy://org.apache.pig:pig:0.10.0?exclude=org.apache.hadoop:avro&transitive=true;
add jar ivy://org.apache.pig:pig:0.10.0?transitive=false
{code}
3) ext: The extension of the file to add. jar by default. Example:
{code}
add jar ivy://org.apache.pig:pig:0.10.0?ext=jar
add jar ivy://com.linkedin.informed-bridge-jira:informed-bridge-jira:0.0.3?classifier=docs-json&ext=tar.gz;
{code}
4) classifier: The Maven classifier to resolve by. Example:
{code}
add jar ivy://com.linkedin.informed-bridge-jira:informed-bridge-jira:0.0.3?classifier=docs-json&ext=tar.gz;
{code}
Delete resource command: Delete resource works as before for the resources added from hdfs and the local file system. Resources added from the artifactory can be deleted using the following command:
{code}
delete jar ivy://org:module:version
{code}
The delete jar command will delete all the transitive dependencies of the jar which was added using the same org:module:version. If two jars share a set of transitive dependencies and one of the jars is deleted using the above syntax, then all the transitive dependencies will be deleted for the jar except the ones which are shared. 
Example:
{code}
add jar ivy://org.apache.pig:pig:0.10.0
add jar ivy://org.apache.pig:pig:0.11.1.15
delete jar ivy://org.apache.pig:pig:0.10.0
{code}
If A is the set containing the transitive dependencies of pig-0.10.0 and B is the set containing the transitive dependencies of pig-0.11.1.15, then after executing the above commands, A-(A intersection B) will be deleted. Now on executing
{code}
delete jar ivy://org.apache.pig:pig:0.11.1.15
{code}
all the remaining dependencies will be deleted. List resource: works as before. Hive add jar command should be able to download and add jars from a repository Key: HIVE-9664 URL: https://issues.apache.org/jira/browse/HIVE-9664 Project: Hive Issue Type: Improvement Reporter: Anant Nag Labels: hive Currently Hive's add jar command takes a local path to the dependency jar. This clutters the local file-system as users may forget to remove this jar later. It would be nice if Hive supported a Gradle-like notation to download the jar from a repository. Example: add jar org:module:version It should also be backward compatible and should take a jar from the local file-system as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9800) Create scripts to do metastore upgrade tests on Jenkins
[ https://issues.apache.org/jira/browse/HIVE-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339388#comment-14339388 ] Brock Noland commented on HIVE-9800: These scripts are going to be dependent on Ubuntu at the moment. I think if we want to go cross-distro we can rewrite them in Python. Create scripts to do metastore upgrade tests on Jenkins --- Key: HIVE-9800 URL: https://issues.apache.org/jira/browse/HIVE-9800 Project: Hive Issue Type: Improvement Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9800.1.patch In order to have better-quality code for the Hive Metastore, we need to create some upgrade scripts that can run on Jenkins nightly or every time a patch that makes structural changes on the database is added to the ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9738) create SOUNDEX udf
[ https://issues.apache.org/jira/browse/HIVE-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339409#comment-14339409 ] Jason Dere commented on HIVE-9738: -- +1 if precommit tests still look good create SOUNDEX udf -- Key: HIVE-9738 URL: https://issues.apache.org/jira/browse/HIVE-9738 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Attachments: HIVE-9738.1.patch, HIVE-9738.2.patch Soundex is an encoding used to relate similar names, but can also be used as a general-purpose scheme to find words with similar phonemes. The American Soundex System: the soundex code consists of the first letter of the name followed by three digits. These three digits are determined by dropping the letters a, e, i, o, u, h, w and y and adding three digits from the remaining letters of the name according to the table below. There are only two additional rules. (1) If two or more consecutive letters have the same code, they are coded as one letter. (2) If there is an insufficient number of letters to make the three digits, the remaining digits are set to zero.
Soundex Table:
1 - b, f, p, v
2 - c, g, j, k, q, s, x, z
3 - d, t
4 - l
5 - m, n
6 - r
Examples:
Miller M460
Peterson P362
Peters P362
Auerbach A612
Uhrbach U612
Moskowitz M232
Moskovitz M213
Implementation: http://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/language/Soundex.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9800) Create scripts to do metastore upgrade tests on Jenkins
[ https://issues.apache.org/jira/browse/HIVE-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9800: -- Attachment: (was: HIVE-9800.1.patch) Create scripts to do metastore upgrade tests on Jenkins --- Key: HIVE-9800 URL: https://issues.apache.org/jira/browse/HIVE-9800 Project: Hive Issue Type: Improvement Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9800.1.patch In order to have better-quality code for the Hive Metastore, we need to create some upgrade scripts that can run on Jenkins nightly or every time a patch that makes structural changes on the database is added to the ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9741) Refactor MetaStoreDirectSql constructor by removing DB queries out of critical section
[ https://issues.apache.org/jira/browse/HIVE-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339426#comment-14339426 ] Hive QA commented on HIVE-9741: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701165/HIVE-9741.7.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7568 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2887/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2887/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2887/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12701165 - PreCommit-HIVE-TRUNK-Build Refactor MetaStoreDirectSql constructor by removing DB queries out of critical section -- Key: HIVE-9741 URL: https://issues.apache.org/jira/browse/HIVE-9741 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.0.0 Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Attachments: HIVE-9741.1.patch, HIVE-9741.2.patch, HIVE-9741.3.patch, HIVE-9741.4.patch, HIVE-9741.5.patch, HIVE-9741.6.patch, HIVE-9741.7.patch The MetaStoreDirectSql constructor is querying the DB to determine dbType, which leads to too many DB queries, making the metastore slow, as ObjectStore.setConf might be called frequently. Moreover, ObjectStore.setConf begins/ends with lock acquire/release; if the underlying DB hangs somehow, the lock is never released and all subsequent incoming requests are blocked. Two points: 1. Use the JDBC driver's getProductName to get the dbType info. 2. Since metastore auto-creation is disabled by default, it'd be better to bypass ensureDbInit() and runTestQuery() in order to avoid DB queries within the critical section of setConf. Here's the stack trace: MetaStoreDirectSql.determineDbType(...) MetaStoreDirectSql.MetaStoreDirectSql(...) ObjectStore.initialize(...) ObjectStore.setConf(…) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
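Point 1 of the proposal (derive dbType from driver metadata instead of probing the database) can be sketched as a pure string mapping over the value of java.sql.DatabaseMetaData.getDatabaseProductName(). The product-name substrings and dbType tokens below are illustrative assumptions, not the actual Hive mapping:

```java
import java.util.Locale;

public class DbTypeResolver {
    /**
     * Maps a JDBC product name (as returned by
     * DatabaseMetaData.getDatabaseProductName()) to a dbType token
     * without issuing any query against the database itself.
     * The token names here are hypothetical.
     */
    public static String dbType(String productName) {
        String p = productName == null ? "" : productName.toLowerCase(Locale.ROOT);
        if (p.contains("mysql")) return "mysql";
        if (p.contains("postgresql")) return "postgres";
        if (p.contains("oracle")) return "oracle";
        if (p.contains("microsoft sql server")) return "mssql";
        if (p.contains("derby")) return "derby";
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(dbType("MySQL"));      // prints: mysql
        System.out.println(dbType("PostgreSQL")); // prints: postgres
    }
}
```

Because the mapping is side-effect free, it can run outside any lock, which is the point of moving it out of the setConf critical section.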
[jira] [Updated] (HIVE-9277) Hybrid Hybrid Grace Hash Join
[ https://issues.apache.org/jira/browse/HIVE-9277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-9277: Attachment: HIVE-9277.04.patch Uploading 4th patch for testing Hybrid Hybrid Grace Hash Join - Key: HIVE-9277 URL: https://issues.apache.org/jira/browse/HIVE-9277 Project: Hive Issue Type: New Feature Components: Physical Optimizer Reporter: Wei Zheng Assignee: Wei Zheng Labels: join Attachments: HIVE-9277.01.patch, HIVE-9277.02.patch, HIVE-9277.03.patch, HIVE-9277.04.patch, High-level design for Hybrid Hybrid Grace Hash Join v1.0.pdf We are proposing an enhanced hash join algorithm called _“hybrid hybrid grace hash join”_. We can benefit from this feature as illustrated below: * The query will not fail even if the estimated memory requirement is slightly wrong * Expensive garbage collection overhead can be avoided when the hash table grows * Join execution using a Map join operator even though the small table doesn't fit in memory, as spilling some data from the build and probe sides will still be cheaper than having to shuffle the large fact table The design was based on Hadoop’s parallel processing capability and the significant amount of memory available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9803) SparkClientImpl should not attempt impersonation in CLI mode [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9803: --- Attachment: HIVE-9803.patch SparkClientImpl should not attempt impersonation in CLI mode [Spark Branch] --- Key: HIVE-9803 URL: https://issues.apache.org/jira/browse/HIVE-9803 Project: Hive Issue Type: Bug Components: Hive Affects Versions: spark-branch Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-9803.patch My bad. In CLI mode we attempt to impersonate ourselves. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9801) LLAP: need counter for cache hit ratio
[ https://issues.apache.org/jira/browse/HIVE-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-9801: --- Assignee: Prasanth Jayachandran LLAP: need counter for cache hit ratio -- Key: HIVE-9801 URL: https://issues.apache.org/jira/browse/HIVE-9801 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Assignee: Prasanth Jayachandran -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9800) Create scripts to do metastore upgrade tests on Jenkins
[ https://issues.apache.org/jira/browse/HIVE-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9800: -- Attachment: (was: HIVE-9800.1.patch) Create scripts to do metastore upgrade tests on Jenkins --- Key: HIVE-9800 URL: https://issues.apache.org/jira/browse/HIVE-9800 Project: Hive Issue Type: Improvement Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9800.1.patch In order to improve code quality in the Hive Metastore, we need to create upgrade scripts that can run on Jenkins nightly, or every time a patch that makes structural changes to the database is added to a ticket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9794) java.lang.NoSuchMethodError occurs during hive query execution which has 'ADD FILE XXXX.jar' sentence
[ https://issues.apache.org/jira/browse/HIVE-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339492#comment-14339492 ] Xuefu Zhang commented on HIVE-9794: --- cc: [~chengxiang li], [~lirui] java.lang.NoSuchMethodError occurs during hive query execution which has 'ADD FILE .jar' sentence - Key: HIVE-9794 URL: https://issues.apache.org/jira/browse/HIVE-9794 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xin Hao We updated our code to the latest revision on the Spark branch (i.e. fd0f638a8d481a9a98b34d3dd08236d6d591812f), rebuilt and deployed Hive in our cluster, and ran the BigBench cases again. Many cases (e.g. Q1, Q2, Q3, Q4, Q8) failed due to a common 'NoSuchMethodError'. The root-cause statement in these queries should be 'ADD FILE .jar'. Detailed error message:
Exception in thread main java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.session.SessionState.add_resources(Lorg/apache/hadoop/hive/ql/session/SessionState$ResourceType;Ljava/util/List;)Ljava/util/List;
at org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:67)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:262)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:419)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:708)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9781) Utilize spark.kryo.classesToRegister [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-9781: -- Attachment: HIVE-9781.4.patch Utilize spark.kryo.classesToRegister [Spark Branch] --- Key: HIVE-9781 URL: https://issues.apache.org/jira/browse/HIVE-9781 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-9781.1.patch, HIVE-9781.2.patch, HIVE-9781.3.patch, HIVE-9781.4.patch I noticed in several thread dumps that it appears Kryo is serializing the class names associated with our keys and values. Kryo supports pre-registering classes so that you don't have to serialize the class name, and Spark supports this via the {{spark.kryo.registrator}} property. We should do this so we don't have to serialize class names. {noformat} Thread 12154: (state = BLOCKED)
- java.lang.Object.hashCode() @bci=0 (Compiled frame; information may be imprecise)
- com.esotericsoftware.kryo.util.ObjectMap.get(java.lang.Object) @bci=1, line=265 (Compiled frame)
- com.esotericsoftware.kryo.util.DefaultClassResolver.getRegistration(java.lang.Class) @bci=18, line=61 (Compiled frame)
- com.esotericsoftware.kryo.Kryo.getRegistration(java.lang.Class) @bci=20, line=429 (Compiled frame)
- com.esotericsoftware.kryo.util.DefaultClassResolver.readName(com.esotericsoftware.kryo.io.Input) @bci=242, line=148 (Compiled frame)
- com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(com.esotericsoftware.kryo.io.Input) @bci=65, line=115 (Compiled frame)
- com.esotericsoftware.kryo.Kryo.readClass(com.esotericsoftware.kryo.io.Input) @bci=20, line=610 (Compiled frame)
- com.esotericsoftware.kryo.Kryo.readClassAndObject(com.esotericsoftware.kryo.io.Input) @bci=21, line=721 (Compiled frame)
- com.twitter.chill.Tuple2Serializer.read(com.esotericsoftware.kryo.Kryo, com.esotericsoftware.kryo.io.Input, java.lang.Class) @bci=6, line=41 (Compiled frame)
- com.twitter.chill.Tuple2Serializer.read(com.esotericsoftware.kryo.Kryo, com.esotericsoftware.kryo.io.Input, java.lang.Class) @bci=4, line=33 (Compiled frame)
- com.esotericsoftware.kryo.Kryo.readClassAndObject(com.esotericsoftware.kryo.io.Input) @bci=126, line=729 (Compiled frame)
- org.apache.spark.serializer.KryoDeserializationStream.readObject(scala.reflect.ClassTag) @bci=8, line=142 (Compiled frame)
- org.apache.spark.serializer.DeserializationStream$$anon$1.getNext() @bci=10, line=133 (Compiled frame)
- org.apache.spark.util.NextIterator.hasNext() @bci=16, line=71 (Compiled frame)
- org.apache.spark.util.CompletionIterator.hasNext() @bci=4, line=32 (Compiled frame)
- scala.collection.Iterator$$anon$13.hasNext() @bci=4, line=371 (Compiled frame)
- org.apache.spark.util.CompletionIterator.hasNext() @bci=4, line=32 (Compiled frame)
- org.apache.spark.InterruptibleIterator.hasNext() @bci=22, line=39 (Compiled frame)
- scala.collection.Iterator$$anon$11.hasNext() @bci=4, line=327 (Compiled frame)
- org.apache.spark.util.collection.ExternalSorter.insertAll(scala.collection.Iterator) @bci=191, line=217 (Compiled frame)
- org.apache.spark.shuffle.hash.HashShuffleReader.read() @bci=278, line=61 (Interpreted frame)
- org.apache.spark.rdd.ShuffledRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=46, line=92 (Interpreted frame)
- org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame)
- org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame)
- org.apache.spark.rdd.MapPartitionsRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=24, line=35 (Interpreted frame)
- org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame)
- org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame)
- org.apache.spark.rdd.MapPartitionsRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=24, line=35 (Interpreted frame)
- org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame)
- org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame)
- org.apache.spark.rdd.UnionRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=22, line=87 (Interpreted frame) -
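The concrete effect of pre-registering classes, as proposed above, is that a registered class travels in the stream as a small integer id instead of its fully qualified name. Here is a toy, self-contained illustration of that size difference; it is not Kryo's actual wire format, and the class and method names are invented for the sketch:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy illustration of why pre-registering classes shrinks serialized output:
// a registered class is identified by a small integer id, an unregistered
// one by its full name. (Kryo's real encoding differs in detail.)
public class ClassRegistrySketch {
    private final Map<Class<?>, Integer> ids = new HashMap<>();

    // Assign the next sequential id, mimicking registration order.
    void register(Class<?> c) {
        ids.put(c, ids.size());
    }

    // Bytes needed to identify the class in the stream.
    int classHeaderSize(Class<?> c) {
        Integer id = ids.get(c);
        if (id != null) {
            return 1; // small ids fit in a single varint byte
        }
        // unregistered: the full class name travels with every object
        return c.getName().getBytes(StandardCharsets.UTF_8).length;
    }
}
```

In Spark itself, the equivalent is listing the classes in the {{spark.kryo.classesToRegister}} configuration property (or supplying a custom registrator via {{spark.kryo.registrator}}), which avoids exactly the per-object class-name reads visible in the thread dump above.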
[jira] [Updated] (HIVE-9793) Remove hard coded paths from cli driver tests
[ https://issues.apache.org/jira/browse/HIVE-9793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9793: --- Attachment: HIVE-9793.patch Took me some time to figure this one out. Remove hard coded paths from cli driver tests - Key: HIVE-9793 URL: https://issues.apache.org/jira/browse/HIVE-9793 Project: Hive Issue Type: Improvement Components: Tests Affects Versions: 1.2.0 Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-9793.patch, HIVE-9793.patch, HIVE-9793.patch At some point a change that generates a hard-coded path into the test files snuck in. Instead, we should use the {{HIVE_ROOT}} directory, as this is better for ptest environments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339480#comment-14339480 ] Thiruvel Thirumoolan commented on HIVE-9582: [~xiaobingo] I am eager to get this committed. I am waiting for review feedback. [~sushanth] Can you take a look? HCatalog should use IMetaStoreClient interface -- Key: HIVE-9582 URL: https://issues.apache.org/jira/browse/HIVE-9582 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.14.0, 0.13.1 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Labels: hcatalog, metastore, rolling_upgrade Fix For: 0.14.1 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, HIVE-9583.1.patch Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy. Hence, during a failure, the client retries and possibly succeeds. But HCatalog has long been using HiveMetaStoreClient directly, and hence failures are costly, especially if they occur during the commit stage of a job. It's also not possible to do a rolling upgrade of the MetaStore server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
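The retry behavior the issue wants HCatalog to inherit can be sketched generically. This is a hypothetical wrapper, not the actual RetryingMetaStoreClient (which wraps every IMetaStoreClient call, reconnecting between attempts); it just shows why a retried call can succeed where a direct one fails:

```java
import java.util.concurrent.Callable;

// Minimal sketch of the retry idea behind RetryingMetaStoreClient.
// Names are hypothetical; the real client also reconnects and backs off
// between attempts.
public class RetrySketch {
    // Run the call up to maxAttempts times (maxAttempts >= 1 assumed),
    // rethrowing the last failure if every attempt fails.
    static <T> T withRetries(Callable<T> call, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e; // transient failure: try again
            }
        }
        throw last;
    }
}
```

A direct HiveMetaStoreClient call fails on the first transient error; under a wrapper like this, a commit-stage metastore call survives a brief outage, which is the cost argument the description makes.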
[jira] [Commented] (HIVE-6617) Reduce ambiguity in grammar
[ https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339311#comment-14339311 ] Hive QA commented on HIVE-6617: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701131/HIVE-6617.20.patch {color:red}ERROR:{color} -1 due to 127 failed/errored test(s), 7568 tests executed *Failed tests:* {noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_non_id
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_revoke_table_priv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_view_sqlstd
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_genericudf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explode_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into_with_schema
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_non_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_outer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_null_cast
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_null_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_num_op_type_conv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_null_check
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_decimal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_write_correct_definition_levels
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_expr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_type_conversions_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_acos
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_add_months
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_asin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_atan
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_case_thrift
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_concat
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_concat_ws
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_cos
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_decode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_elt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_equal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_field
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_find_in_set
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_from_utc_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_greatest
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_if
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_in_file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_instr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_isnull_isnotnull
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_last_day
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_least
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_levenshtein
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_locate
[jira] [Commented] (HIVE-9253) MetaStore server should support timeout for long running requests
[ https://issues.apache.org/jira/browse/HIVE-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339596#comment-14339596 ] Mohit Sabharwal commented on HIVE-9253: --- Thanks, Dong! LGTM. +1 (non-binding) MetaStore server should support timeout for long running requests - Key: HIVE-9253 URL: https://issues.apache.org/jira/browse/HIVE-9253 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-9253.1.patch, HIVE-9253.2.patch, HIVE-9253.2.patch, HIVE-9253.3.patch, HIVE-9253.4.patch, HIVE-9253.5.patch, HIVE-9253.6.patch, HIVE-9253.patch In the description of HIVE-7195, one issue is that the MetaStore client timeout is quite dumb: the client will time out while the server has no idea the client is gone. The server should support a timeout when a request from a client runs for a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9253) MetaStore server should support timeout for long running requests
[ https://issues.apache.org/jira/browse/HIVE-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339610#comment-14339610 ] Brock Noland commented on HIVE-9253: +1 MetaStore server should support timeout for long running requests - Key: HIVE-9253 URL: https://issues.apache.org/jira/browse/HIVE-9253 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-9253.1.patch, HIVE-9253.2.patch, HIVE-9253.2.patch, HIVE-9253.3.patch, HIVE-9253.4.patch, HIVE-9253.5.patch, HIVE-9253.6.patch, HIVE-9253.patch In the description of HIVE-7195, one issue is that the MetaStore client timeout is quite dumb: the client will time out while the server has no idea the client is gone. The server should support a timeout when a request from a client runs for a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
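One common way to implement the server-side timeout requested above is a deadline object that long-running request handlers check periodically, aborting the request once the deadline passes. A deterministic sketch (names and clock handling are hypothetical, not the actual patch):

```java
// Sketch of a server-side deadline for long-running metastore requests.
// Names are hypothetical; time is passed in explicitly so the behavior
// is deterministic rather than tied to the wall clock.
public class DeadlineSketch {
    private final long startMillis;
    private final long timeoutMillis;

    DeadlineSketch(long startMillis, long timeoutMillis) {
        this.startMillis = startMillis;
        this.timeoutMillis = timeoutMillis;
    }

    // A long-running handler calls this periodically, e.g. once per batch
    // of partitions fetched; false means the request should be aborted.
    boolean withinDeadline(long nowMillis) {
        return nowMillis - startMillis <= timeoutMillis;
    }
}
```

Checking a deadline inside the handler, rather than only on the socket, lets the server release resources for requests whose client has already timed out and gone away.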
[jira] [Commented] (HIVE-9791) insert into table throws NPE
[ https://issues.apache.org/jira/browse/HIVE-9791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339635#comment-14339635 ] Alexander Pivovarov commented on HIVE-9791: --- probably need to fix something in HiveParser.g insert into table throws NPE Key: HIVE-9791 URL: https://issues.apache.org/jira/browse/HIVE-9791 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Alexander Pivovarov To reproduce the NPE, run the following:
{code}
create table a as select 'A' letter;
OK
insert into table a select 'B' letter;
FAILED: NullPointerException null

-- works fine if we add a from clause to the select statement
insert into table a select 'B' letter from dual;
OK
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9744) Move common arguments validation and value extraction code to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14339628#comment-14339628 ] Hive QA commented on HIVE-9744: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12701215/HIVE-9744.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7570 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2889/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2889/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2889/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12701215 - PreCommit-HIVE-TRUNK-Build Move common arguments validation and value extraction code to GenericUDF Key: HIVE-9744 URL: https://issues.apache.org/jira/browse/HIVE-9744 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9744.1.patch, HIVE-9744.2.patch, HIVE-9744.3.patch Most of the UDFs:
- check if arguments are primitive / complex
- check if arguments are a particular type or type_group
- get converters to read values
- check if an argument is constant
- extract argument values
Probably we should move these common methods to GenericUDF. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
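The kind of shared helper the issue proposes consolidating might look like the following. The method name and signature here are illustrative, not necessarily what the patch adds, and the sketch uses plain Java rather than Hive's ObjectInspector types:

```java
// Sketch of a shared argument-validation helper of the sort the issue
// proposes moving into GenericUDF. Names are illustrative only.
public class UdfArgHelperSketch {
    // Every UDF repeats an argument-count check like this in initialize();
    // centralizing it gives consistent error messages across UDFs.
    static void checkArgsSize(String udfName, Object[] args, int min, int max) {
        if (args.length < min || args.length > max) {
            throw new IllegalArgumentException(udfName + " requires between "
                    + min + " and " + max + " arguments, got " + args.length);
        }
    }
}
```

With helpers like this in the base class, each UDF's initialize() shrinks to a few declarative calls instead of repeating the same type and count checks.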