[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487782#comment-14487782 ] Jason Dere commented on HIVE-9917: -- HIVE-3454 changed the output of a lot of tests, because the int-to-timestamp conversion behavior was changed. So this Jira is supposed to make that conversion behavior configurable, with the default config being to retain the old behavior (before HIVE-3454). If that is the case, shouldn't we see all of the test output change back to how it looked before HIVE-3454? After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have the correct behavior for converting int to timestamp. Since customers have been relying on the incorrect behavior for so long, it is better to make it configurable, so that one release defaults to the old/inconsistent behavior and the next release defaults to the new/consistent behavior. After that, the old behavior can be deprecated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
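A minimal sketch of the config-gated conversion the description asks for, assuming the old behavior read the integer as epoch milliseconds and the new behavior reads it as epoch seconds; the class and parameter names below are illustrative, not taken from the attached patch:

```java
// Illustrative sketch only (not the attached HIVE-9917 patch): a conversion
// gated by a boolean config, where the old behavior interprets the int as
// epoch milliseconds and the new behavior interprets it as epoch seconds.
import java.sql.Timestamp;

public class IntToTimestampSketch {
    public static Timestamp convert(long value, boolean intInSeconds) {
        // new/consistent behavior: value is seconds since epoch
        // old/inconsistent behavior: value is milliseconds since epoch
        long millis = intInSeconds ? value * 1000L : value;
        return new Timestamp(millis);
    }
}
```

In Hive the branch would hang off a HiveConf flag read at UDF/ObjectInspector setup time; the flag name is not specified here.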
[jira] [Commented] (HIVE-10209) FetchTask with VC may fail because ExecMapper.done is true
[ https://issues.apache.org/jira/browse/HIVE-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487874#comment-14487874 ] Chao commented on HIVE-10209: - Hi [~jxiang], [~szehon]. Can any of you take a look? Thanks. FetchTask with VC may fail because ExecMapper.done is true -- Key: HIVE-10209 URL: https://issues.apache.org/jira/browse/HIVE-10209 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.1.0 Reporter: Chao Assignee: Chao Attachments: HIVE-10209.1-spark.patch, HIVE-10209.2-spark.patch ExecMapper.done is a static variable, and may cause issues in the following example:
{code}
set hive.fetch.task.conversion=minimal;
select * from src where key 10 limit 1;
set hive.fetch.task.conversion=more;
select *, BLOCK__OFFSET_INSIDE__FILE from src where key 10;
{code}
The second select won't return any result if running in local mode. The issue is that the first select query will be converted to a MapRedTask with only a mapper, and when the task is done, because of the limit operator, ExecMapper.done will be set to true. Then, when the second select query begins to execute, it will call {{FetchOperator::getRecordReader()}}, and since we have a virtual column here, an instance of {{HiveRecordReader}} will be returned. The problem is that {{HiveRecordReader::doNext()}} checks ExecMapper.done; since the value is true, it quits immediately. In short, I think making ExecMapper.done static is a bad idea. The first query should in no way affect the second one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
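The hazard described above can be reduced to a few lines: a static "done" flag set by one task leaks into the next task run in the same JVM. The classes below are stand-ins, not Hive's ExecMapper or HiveRecordReader:

```java
// Minimal illustration of the static-flag hazard: the first run hits its limit
// and sets the shared flag, so the second run never reads a single row.
public class StaticFlagHazard {
    static class Mapper {
        static boolean done = false;   // shared across ALL mapper runs in the JVM
        int rowsRead = 0;
        void run(int limit) {
            while (!done) {
                rowsRead++;
                if (rowsRead >= limit) done = true;  // limit reached: flag set
            }
        }
    }
    public static void main(String[] args) {
        Mapper first = new Mapper();
        first.run(1);                   // first query hits its limit, sets done
        Mapper second = new Mapper();
        second.run(10);                 // second query: loop is never entered
        System.out.println(second.rowsRead);
    }
}
```

Making the flag an instance field (or part of a per-task context object) removes the leak, which is the change the last paragraph of the description argues for.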
[jira] [Updated] (HIVE-9645) Constant folding case NULL equality
[ https://issues.apache.org/jira/browse/HIVE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-9645: --- Attachment: HIVE-9645.5.patch another minor update. Constant folding case NULL equality --- Key: HIVE-9645 URL: https://issues.apache.org/jira/browse/HIVE-9645 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.1.0 Reporter: Gopal V Assignee: Ashutosh Chauhan Attachments: HIVE-9645.1.patch, HIVE-9645.2.patch, HIVE-9645.3.patch, HIVE-9645.4.patch, HIVE-9645.5.patch, HIVE-9645.patch The Hive logical optimizer does not follow the Null scan codepath when encountering a NULL = 1; NULL = 1 is not evaluated as false in the constant propagation implementation.
{code}
hive> explain select count(1) from store_sales where null=1;
...
TableScan
  alias: store_sales
  filterExpr: (null = 1) (type: boolean)
  Statistics: Num rows: 550076554 Data size: 49570324480 Basic stats: COMPLETE Column stats: COMPLETE
  Filter Operator
    predicate: (null = 1) (type: boolean)
    Statistics: Num rows: 275038277 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
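Under SQL three-valued logic, NULL = 1 evaluates to NULL, and a NULL predicate rejects every row, so the filter can be folded away and the Null scan codepath taken. A toy illustration of that fold (not Hive's actual ConstantPropagate code):

```java
// Three-valued-logic fold: equality with a NULL operand yields NULL, and both
// NULL and FALSE predicates reject a row, so "WHERE null = 1" can be folded
// into an always-false filter (i.e. the Null scan path).
public class NullFold {
    // returns Boolean.TRUE / Boolean.FALSE, or null representing SQL NULL
    public static Boolean foldEquals(Object left, Object right) {
        if (left == null || right == null) return null;  // NULL = x  ->  NULL
        return left.equals(right);
    }
    public static boolean predicatePasses(Boolean folded) {
        return folded != null && folded;  // NULL and FALSE both filter the row out
    }
}
```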
[jira] [Commented] (HIVE-8858) Visualize generated Spark plan [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487736#comment-14487736 ] Chinna Rao Lalam commented on HIVE-8858: The failed test case nonmr_fetch.q is not related to this patch; this test is not enabled for Spark. RB Request : https://reviews.apache.org/r/33024/ Visualize generated Spark plan [Spark Branch] - Key: HIVE-8858 URL: https://issues.apache.org/jira/browse/HIVE-8858 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, HIVE-8858.2-spark.patch, HIVE-8858.3-spark.patch, HIVE-8858.4-spark.patch The Spark plan generated by SparkPlanGenerator contains info which isn't available in Hive's explain plan, such as RDD caching. Also, the graph is slightly different from the original SparkWork. Thus, it would be nice to visualize the plan as is done for SparkWork. Preferably, the visualization can happen as part of Hive explain extended. If not feasible, we can at least log this at info level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6108) Introduce Cryptographic hash UDFs
[ https://issues.apache.org/jira/browse/HIVE-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487603#comment-14487603 ] Alan Gates commented on HIVE-6108: -- Ok, so this review is way overdue, but better late than never. A few issues:
# You should consider using GenericUDF rather than UDF; see the comments in those classes on why.
# When errors are encountered, the code should write to the log rather than System.out. See many places in the Hive code for examples.
# We cannot have any System.exit calls in the code. Some of this code runs in servers that cannot exit. Exceptions should be thrown instead.
Introduce Cryptographic hash UDFs - Key: HIVE-6108 URL: https://issues.apache.org/jira/browse/HIVE-6108 Project: Hive Issue Type: New Feature Components: UDF Reporter: Kostiantyn Kudriavtsev Assignee: Kostiantyn Kudriavtsev Priority: Minor Attachments: Hive-6108.patch Introduce new UDFs implementing the cryptographic hash algorithms MD5 and SHA-256, which are already available in Java: MD5(string) calculates an MD5 checksum for the string and returns the HEX representation; SHA256(string) calculates a SHA-256 checksum for the string and returns the HEX representation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
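For reference, the hashing core such a UDF needs fits in a few lines of JDK code: MessageDigest is required by the Java platform to support both MD5 and SHA-256. In line with the review comments above, this sketch throws an exception rather than calling System.exit; it is an illustration, not the attached patch:

```java
// Hashing core for an MD5/SHA-256 UDF: digest the UTF-8 bytes of the input and
// hex-encode the result. Failures surface as exceptions, never System.exit.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashUdfCore {
    public static String hashHex(String algorithm, String input) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));  // lowercase hex, zero-padded
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        }
    }
}
```

In Hive, a GenericUDF wrapper would handle argument ObjectInspectors and wrap failures in HiveException; that boilerplate is omitted here.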
[jira] [Updated] (HIVE-10268) Merge cbo branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-10268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10268: Attachment: HIVE-10268.1.patch Patch updated to latest trunk. Merge cbo branch into trunk --- Key: HIVE-10268 URL: https://issues.apache.org/jira/browse/HIVE-10268 Project: Hive Issue Type: Task Components: CBO Affects Versions: cbo-branch Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10268.1.patch, HIVE-10268.patch Merge patch generated on basis of diffs of trunk with cbo-branch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)
[ https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488081#comment-14488081 ] Laljo John Pullokkaran commented on HIVE-10190: --- Submit your patch, I am happy to review it. CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE) - Key: HIVE-10190 URL: https://issues.apache.org/jira/browse/HIVE-10190 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Pengcheng Xiong Priority: Trivial Labels: perfomance Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch
{code}
public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
  String astTree = ast.toStringTree();
  // if any of following tokens are present in AST, bail out
  String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
  for (String token : tokens) {
    if (astTree.contains(token)) {
      return false;
    }
  }
  return true;
}
{code}
This is an issue for a SQL query which is bigger in AST form than in text (~700kb). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10265) Hive CLI crashes on != inequality
[ https://issues.apache.org/jira/browse/HIVE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487893#comment-14487893 ] Szehon Ho commented on HIVE-10265: -- I forgot to say, the test failures don't look related (cannot reproduce). Hive CLI crashes on != inequality - Key: HIVE-10265 URL: https://issues.apache.org/jira/browse/HIVE-10265 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10265.patch It seems != is a supported inequality operator according to: [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators]. However, HiveCLI crashes if we try a query:
{noformat}
hive> select * from src where key != '10';
[ERROR] Could not expand event
java.lang.IllegalArgumentException: != '10';: event not found
	at jline.console.ConsoleReader.expandEvents(ConsoleReader.java:779)
	at jline.console.ConsoleReader.finishBuffer(ConsoleReader.java:631)
	at jline.console.ConsoleReader.accept(ConsoleReader.java:2019)
	at jline.console.ConsoleReader.readLine(ConsoleReader.java:2666)
	at jline.console.ConsoleReader.readLine(ConsoleReader.java:2269)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:730)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}
Beeline is also based on jline and does not crash. Current Hive is on jline-2.12.
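The root cause is jline's bash-style history expansion: "!" starts an event designator, and "!= '10';" names no history entry, so expansion throws before the query ever reaches Hive. Below is a JDK-only sketch of that expansion logic; it mimics, rather than reproduces, jline's ConsoleReader.expandEvents:

```java
// Toy model of bash-style "!text" history expansion: on "!", the rest of the
// line is looked up as a history-entry prefix, and a miss throws -- which is
// exactly what "!=" triggers in a query like: select ... where key != '10';
import java.util.List;

public class EventExpansion {
    public static String expand(String line, List<String> history) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '!' && i + 1 < line.length()) {
                String ref = line.substring(i + 1);
                // "!text" must match the start of some history entry
                for (String h : history) {
                    if (h.startsWith(ref)) { out.append(h); ref = null; break; }
                }
                if (ref != null) {
                    throw new IllegalArgumentException("!" + ref + ": event not found");
                }
                return out.toString();
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

jline 2.x exposes ConsoleReader.setExpandEvents(false) to turn this expansion off entirely, which is one plausible fix direction for the CLI.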
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)
[ https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487266#comment-14487266 ] Reuben commented on HIVE-10190: --- Right, which is why in my code, I was using {{Set<T>.contains}} rather than {{String.contains}}. Also, the patch as is won't work because:
{code}
for (Node child : current.getChildren()) {
  fringe.add((ASTNode) child);
}
{code}
will fail if current doesn't have any children. Lastly, this is my first patch, so uh... if I could maybe get some help on how to commit it... that would be cool too : ). CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE) - Key: HIVE-10190 URL: https://issues.apache.org/jira/browse/HIVE-10190 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Pengcheng Xiong Priority: Trivial Labels: perfomance Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch
{code}
public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
  String astTree = ast.toStringTree();
  // if any of following tokens are present in AST, bail out
  String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
  for (String token : tokens) {
    if (astTree.contains(token)) {
      return false;
    }
  }
  return true;
}
{code}
This is an issue for a SQL query which is bigger in AST form than in text (~700kb). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
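Putting the comments on this issue together, the check can be written as a breadth-first walk that tests each node's token against a Set and guards against leaf nodes with no children, instead of a substring match over toStringTree(). The Node class below is a stand-in for Hive's ASTNode, so this is a sketch of the approach rather than the attached patch:

```java
// Breadth-first token scan over an AST: per-node Set.contains instead of a
// substring search on the flattened tree, with a null guard for leaf nodes.
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Set;

public class AstTokenCheck {
    public static class Node {
        final String token;
        final List<Node> children;   // may be null for leaves
        public Node(String token, List<Node> children) {
            this.token = token;
            this.children = children;
        }
    }

    public static boolean containsAny(Node root, Set<String> bannedTokens) {
        Deque<Node> fringe = new ArrayDeque<>();  // O(1) poll, unlike ArrayList.remove(0)
        fringe.add(root);
        while (!fringe.isEmpty()) {
            Node current = fringe.poll();
            if (bannedTokens.contains(current.token)) return true;
            if (current.children != null) {       // guard: leaf nodes have no children
                fringe.addAll(current.children);
            }
        }
        return false;
    }
}
```

ArrayDeque also addresses the later comment about ArrayList.remove being O(N): a deque gives constant-time removal from the head.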
[jira] [Commented] (HIVE-9709) Hive should support replaying cookie from JDBC driver for beeline
[ https://issues.apache.org/jira/browse/HIVE-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487944#comment-14487944 ] Vaibhav Gumashta commented on HIVE-9709: [~hsubramaniyan] It will be good to add some documentation in release notes for the new JDBC url params that have been added as part of this. Once [~leftylev] reviews, we can add it to the wiki. Hive should support replaying cookie from JDBC driver for beeline - Key: HIVE-9709 URL: https://issues.apache.org/jira/browse/HIVE-9709 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-9709.1.patch, HIVE-9709.2.patch, HIVE-9709.3.patch, HIVE-9709.4.patch, HIVE-9709.5.patch Consider the following scenario: Beeline Knox HS2. Where Knox is going to LDAP for authentication. To avoid re-authentication, Knox supports using a Cookie to identity a request. However the Beeline JDBC client does not send back the cookie Knox sent and this leads to Knox having to re-create LDAP authentication request on every connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns
[ https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487289#comment-14487289 ] Hive QA commented on HIVE-9580: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12723937/HIVE-9580.patch {color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 8666 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_mapjoin1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_varchar_mapjoin1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_decimal org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_char_mapjoin1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_varchar_mapjoin1 org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_minimr_broken_pipe org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_decimal org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore org.apache.hive.jdbc.TestSSL.testSSLFetchHttp org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3343/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3343/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3343/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 25 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12723937 - PreCommit-HIVE-TRUNK-Build Server returns incorrect result from JOIN ON VARCHAR columns Key: HIVE-9580 URL: https://issues.apache.org/jira/browse/HIVE-9580 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mike Assignee: Aihua Xu Attachments: HIVE-9580.patch The database erroneously returns rows when joining two tables which each contain a VARCHAR column and the join's ON condition uses the equality operator on the VARCHAR columns. The following JDBC method exhibits the problem:
static void joinIssue() throws SQLException {
  String sql;
  int rowsAffected;
  ResultSet rs;
  Statement stmt = con.createStatement();
  String table1_Name = "blahtab1";
  String table1A_Name = "blahtab1A";
  String table1B_Name = "blahtab1B";
  String table2_Name = "blahtab2";
[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)
[ https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487276#comment-14487276 ] Reuben commented on HIVE-10190: --- One other thing: it looks like {{ArrayList<T>.remove}} has a runtime of O(N) (http://infotechgems.blogspot.com/2011/11/java-collections-performance-time.html). If we don't want to use a {{Queue<T>}}, maybe a {{LinkedList<T>}} instead? CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE) - Key: HIVE-10190 URL: https://issues.apache.org/jira/browse/HIVE-10190 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Pengcheng Xiong Priority: Trivial Labels: perfomance Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch
{code}
public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
  String astTree = ast.toStringTree();
  // if any of following tokens are present in AST, bail out
  String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
  for (String token : tokens) {
    if (astTree.contains(token)) {
      return false;
    }
  }
  return true;
}
{code}
This is an issue for a SQL query which is bigger in AST form than in text (~700kb). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10266) Boolean expression True and True returns False
[ https://issues.apache.org/jira/browse/HIVE-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ckran updated HIVE-10266: - Description: A Hive query with a Boolean expression with day and month calculations that each evaluate to TRUE with use of AND evaluates to FALSE.
create table datest (cntr int, date date) row format delimited fields terminated by ',' stored as textfile;
insert into table datest values (1,'2015-04-8');
select ((DAY('2015-05-25') - DAY(DATE)) < 25), ((MONTH('2015-05-25') - MONTH(DATE)) = 1), ((DAY('2015-05-25') - DAY(DATE)) < 25) AND ((MONTH('2015-05-25') - MONTH(DATE)) = 1) from datest
Returns values True | True | False
was: A Hive query with a Boolean expression with day and month calculations that each evaluate to TRUE with use of AND evaluates to FALSE.
create table datest (cntr int, date date) row format delimited fields terminated by ',' stored as textfile;
insert into datest values (1,'2015-04-8');
select ((DAY('2015-05-25') - DAY(DATE)) < 25), ((MONTH('2015-05-25') - MONTH(DATE)) = 1), ((DAY('2015-05-25') - DAY(DATE)) < 25) AND ((MONTH('2015-05-25') - MONTH(DATE)) = 1) from datest
Returns values True | True | False
Boolean expression True and True returns False -- Key: HIVE-10266 URL: https://issues.apache.org/jira/browse/HIVE-10266 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.14.0 Reporter: ckran Fix For: 0.13.0 A Hive query with a Boolean expression with day and month calculations that each evaluate to TRUE with use of AND evaluates to FALSE.
create table datest (cntr int, date date) row format delimited fields terminated by ',' stored as textfile;
insert into table datest values (1,'2015-04-8');
select ((DAY('2015-05-25') - DAY(DATE)) < 25), ((MONTH('2015-05-25') - MONTH(DATE)) = 1), ((DAY('2015-05-25') - DAY(DATE)) < 25) AND ((MONTH('2015-05-25') - MONTH(DATE)) = 1) from datest
Returns values True | True | False -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10226) Column stats for Date columns not supported
[ https://issues.apache.org/jira/browse/HIVE-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487396#comment-14487396 ] Hive QA commented on HIVE-10226: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12723972/HIVE-10226.4.patch {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8666 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3344/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3344/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3344/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12723972 - PreCommit-HIVE-TRUNK-Build Column stats for Date columns not supported --- Key: HIVE-10226 URL: https://issues.apache.org/jira/browse/HIVE-10226 Project: Hive Issue Type: Bug Components: Statistics Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-10226.1.patch, HIVE-10226.2.patch, HIVE-10226.3.patch, HIVE-10226.4.patch {noformat} hive explain analyze table revenues compute statistics for columns; 2015-03-30 23:47:45,133 ERROR [main()]: ql.Driver (SessionState.java:printError(951)) - FAILED: UDFArgumentTypeException Only integer/long/timestamp/float/double/string/binary/boolean/decimal type argument is accepted but date is passed. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10222) Upgrade Calcite dependency to newest version
[ https://issues.apache.org/jira/browse/HIVE-10222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487143#comment-14487143 ] Hive QA commented on HIVE-10222: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12724076/HIVE-10222.01.patch {color:red}ERROR:{color} -1 due to 31 failed/errored test(s), 8665 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join29 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_14 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_avro_joins_native org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_tez1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input17 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join13 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_order org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_13 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_18 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_tez_joins_explain org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union29 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_0 org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3342/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3342/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3342/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing 
org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 31 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12724076 - PreCommit-HIVE-TRUNK-Build Upgrade Calcite dependency to newest version Key: HIVE-10222 URL: https://issues.apache.org/jira/browse/HIVE-10222 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10222.01.patch, HIVE-10222.patch Upgrade Calcite version to 1.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10206) Improve Alter Table to not initialize Serde unnecessarily
[ https://issues.apache.org/jira/browse/HIVE-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487549#comment-14487549 ] Hive QA commented on HIVE-10206: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12723985/HIVE-10206.2.patch {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 8665 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore org.apache.hive.jdbc.TestSSL.testSSLFetchHttp {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3345/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3345/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3345/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12723985 - PreCommit-HIVE-TRUNK-Build Improve Alter Table to not initialize Serde unnecessarily - Key: HIVE-10206 URL: https://issues.apache.org/jira/browse/HIVE-10206 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Priority: Minor Attachments: HIVE-10206.2.patch, HIVE-10206.2.patch, HIVE-10206.patch Create an avro table with an external avsc file like: {noformat} CREATE TABLE test(...) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ( 'avro.schema.url'='file:///Users/szehon/Temp/test.avsc', 'kite.compression.type'='snappy', 'transient_lastDdlTime'='1427996456') {noformat} Delete test.avsc file. 
Try to modify the table properties:
{noformat}
alter table test set tblproperties ('avro.schema.url'='file:///Users/szehon/Temp/test2.avsc');
{noformat}
This will throw an AvroSerdeException:
{noformat}
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:119)
at org.apache.hadoop.hive.serde2.avro.AvroSerDe.determineSchemaOrReturnErrorSchema(AvroSerDe.java:163)
at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:101)
at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:78)
at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:520)
at
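The improvement can be sketched as a simple guard: initialize the serde only for alter operations that actually need the table schema, so a tblproperties-only change never has to resolve avro.schema.url. The operation names and sets below are hypothetical stand-ins, not Hive's actual API:

```java
import java.util.EnumSet;
import java.util.Set;

public class AlterTableSerdeGuard {
  // Hypothetical alter-operation kinds; only some need the deserializer.
  public enum AlterOp { SET_TBLPROPERTIES, SET_LOCATION, ADD_COLUMNS, REPLACE_COLUMNS }

  // Operations that genuinely need the serde initialized (e.g. to read column types).
  public static final Set<AlterOp> NEEDS_SERDE =
      EnumSet.of(AlterOp.ADD_COLUMNS, AlterOp.REPLACE_COLUMNS);

  public static boolean needsSerdeInit(AlterOp op) {
    return NEEDS_SERDE.contains(op);
  }

  public static void main(String[] args) {
    // A tblproperties-only alter should not trigger serde init, so a
    // dangling avro.schema.url cannot make it fail.
    System.out.println(needsSerdeInit(AlterOp.SET_TBLPROPERTIES)); // false
    System.out.println(needsSerdeInit(AlterOp.REPLACE_COLUMNS));   // true
  }
}
```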
[jira] [Updated] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-9917: --- Attachment: (was: HIVE-9917.patch) After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have the correct behavior for converting int to timestamp. Since customers have relied on the incorrect behavior for so long, it is better to make it configurable: one release will default to the old/inconsistent behavior, the next release will default to the new/consistent behavior, and then the old behavior will be deprecated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8297) Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format
[ https://issues.apache.org/jira/browse/HIVE-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487382#comment-14487382 ] Aihua Xu commented on HIVE-8297: [~hongyu.bi] What kind of data do you expect to see in the table? Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format Key: HIVE-8297 URL: https://issues.apache.org/jira/browse/HIVE-8297 Project: Hive Issue Type: Bug Components: CLI, JDBC Affects Versions: 0.13.0 Environment: Linux Reporter: Doug Sedlak For the case SELECT * FROM [table], JDBC directly reads the table's backing data, versus cranking up an MR job and creating a result set. Where the table format is RCFile or ORC, incorrect results are delivered by the direct JDBC read for TIMESTAMP columns. If you force a result set, correct data is returned. To reproduce using beeline:
1) Create this file as follows in HDFS:
$ cat > /tmp/ts.txt
2014-09-28 00:00:00
2014-09-29 00:00:00
2014-09-30 00:00:00
ctrl-D
$ hadoop fs -copyFromLocal /tmp/ts.txt /tmp/ts.txt
2) In beeline, load the above HDFS data into a TEXTFILE table and verify it is ok:
$ beeline
!connect jdbc:hive2://host:port/db hive pass org.apache.hive.jdbc.HiveDriver
drop table `TIMESTAMP_TEXT`;
CREATE TABLE `TIMESTAMP_TEXT` (`ts` TIMESTAMP) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE;
LOAD DATA INPATH '/tmp/ts.txt' OVERWRITE INTO TABLE `TIMESTAMP_TEXT`;
select * from `TIMESTAMP_TEXT`;
3) In beeline, create and load an RCFile table from the TEXTFILE table:
drop table `TIMESTAMP_RCFILE`;
CREATE TABLE `TIMESTAMP_RCFILE` (`ts` TIMESTAMP) stored as rcfile;
INSERT INTO TABLE `TIMESTAMP_RCFILE` SELECT * FROM `TIMESTAMP_TEXT`;
4) Demonstrate the incorrect direct JDBC read versus a good read by inducing result set creation:
SELECT * FROM `TIMESTAMP_RCFILE`;
+------------------------+
| timestamp_rcfile.ts    |
+------------------------+
| 2014-09-30 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
+------------------------+
SELECT * FROM `TIMESTAMP_RCFILE` where ts is not NULL;
+------------------------+
| timestamp_rcfile.ts    |
+------------------------+
| 2014-09-28 00:00:00.0  |
| 2014-09-29 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
+------------------------+
Note 1: The incorrect conduct demonstrated above also replicates with a standalone Java/JDBC program. Note 2: It is not known whether this is an issue with any other data types, or which releases are affected, but it occurs in Hive 13. Direct JDBC reads of TEXTFILE and SEQUENCEFILE work fine. As above, for RCFile and ORC wrong results are delivered; no other file types were tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-10239: - Attachment: (was: HIVE-10239.DONOTCOMMIT.patch) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL Key: HIVE-10239 URL: https://issues.apache.org/jira/browse/HIVE-10239 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Attachments: HIVE-10239.patch Need to create DB-implementation specific scripts to use the framework introduced in HIVE-9800 to have any metastore schema changes tested across all supported databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-10239: - Attachment: HIVE-10239-donotcommit.patch Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL Key: HIVE-10239 URL: https://issues.apache.org/jira/browse/HIVE-10239 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.patch Need to create DB-implementation specific scripts to use the framework introduced in HIVE-9800 to have any metastore schema changes tested across all supported databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-9917: --- Attachment: HIVE-9917.patch After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have the correct behavior for converting int to timestamp. Since customers have relied on the incorrect behavior for so long, it is better to make it configurable: one release will default to the old/inconsistent behavior, the next release will default to the new/consistent behavior, and then the old behavior will be deprecated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10246) [CBO] Table alias should be stored with Scan object, instead of Table object
[ https://issues.apache.org/jira/browse/HIVE-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487863#comment-14487863 ] Laljo John Pullokkaran commented on HIVE-10246: --- +1 [CBO] Table alias should be stored with Scan object, instead of Table object Key: HIVE-10246 URL: https://issues.apache.org/jira/browse/HIVE-10246 Project: Hive Issue Type: Improvement Components: CBO, Diagnosability, Query Planning Affects Versions: cbo-branch Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10246.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487848#comment-14487848 ] Aihua Xu commented on HIVE-9917: For those test cases, I changed them to set the configuration to true so that we are testing the new behavior (since that is what we will keep later), and I added new test cases to cover the old behavior. That's why the test baselines didn't change in this changelist. After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have the correct behavior for converting int to timestamp. Since customers have relied on the incorrect behavior for so long, it is better to make it configurable: one release will default to the old/inconsistent behavior, the next release will default to the new/consistent behavior, and then the old behavior will be deprecated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
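For illustration, the two behaviors the proposed flag would switch between can be sketched as follows. This is a hedged reading of HIVE-3454 (the old code treated an integer as epoch milliseconds while the new, consistent code treats it as epoch seconds, matching floating-point values); the flag and method names are illustrative, not Hive's actual configuration:

```java
public class IntToTimestampConversion {
  /** Old, inconsistent behavior: an integer is taken as epoch milliseconds. */
  public static long oldBehaviorToEpochMillis(long intValue) {
    return intValue;
  }

  /** New, consistent behavior: an integer is taken as epoch seconds,
      the same way floating-point values were already interpreted. */
  public static long newBehaviorToEpochMillis(long intValue) {
    return intValue * 1000L;
  }

  // The hypothetical config toggles between the two interpretations.
  public static long toEpochMillis(long intValue, boolean useNewBehavior) {
    return useNewBehavior ? newBehaviorToEpochMillis(intValue)
                          : oldBehaviorToEpochMillis(intValue);
  }

  public static void main(String[] args) {
    // cast(1 as timestamp): old reading ~ 1970-01-01 00:00:00.001,
    // new reading ~ 1970-01-01 00:00:01.
    System.out.println(toEpochMillis(1L, false)); // 1
    System.out.println(toEpochMillis(1L, true));  // 1000
  }
}
```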
[jira] [Commented] (HIVE-3378) UDF to obtain the numeric day of an year from date or timestamp in HIVE.
[ https://issues.apache.org/jira/browse/HIVE-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488228#comment-14488228 ] Swarnim Kulkarni commented on HIVE-3378: [~apivovarov] Left a few comments on the review. In my opinion, the implementation feels a little complicated. First of all, maybe I am missing something, but why do we need to extend a GenericUDF and not a simple UDF? Can't we have an implementation similar to dayofmonth [1]? [1] https://github.com/cloudera/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDayOfMonth.java UDF to obtain the numeric day of an year from date or timestamp in HIVE. Key: HIVE-3378 URL: https://issues.apache.org/jira/browse/HIVE-3378 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.8.1, 0.9.0 Reporter: Deepti Antony Assignee: Alexander Pivovarov Attachments: HIVE-3378.02.patch, HIVE-3378.02.patch, HIVE-3378.1.patch.txt Current Hive releases lack a function which returns the numeric day of the year for a given date or timestamp. The function DAYOFYEAR(date) would return the numeric day of the year from a date/timestamp, which would be useful in HiveQL. DAYOFYEAR can be used to compare data with respect to the number of days up to the given date. It can be used in different domains. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10016) Remove duplicated Hive table schema parsing in DataWritableReadSupport
[ https://issues.apache.org/jira/browse/HIVE-10016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486835#comment-14486835 ] Dong Chen commented on HIVE-10016: -- The failed test is not related. The patch is rebased to trunk and is ready to go. Remove duplicated Hive table schema parsing in DataWritableReadSupport -- Key: HIVE-10016 URL: https://issues.apache.org/jira/browse/HIVE-10016 Project: Hive Issue Type: Sub-task Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-10016-parquet.patch, HIVE-10016.1-parquet.patch, HIVE-10016.patch In {{DataWritableReadSupport.init()}}, the table schema is created and its string form is set in conf. When the {{ParquetRecordReaderWrapper}} is constructed, the schema is fetched from conf and parsed several times. We could remove this repeated schema parsing and improve the speed of getRecordReader a bit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
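The pattern behind the patch is parse-once-and-cache: keep the parsed form of the schema string instead of re-parsing it from conf on every record-reader construction. A minimal sketch with stand-in types (not Hive's actual classes; the parse is modeled as a string split):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SchemaCache {
  private final Map<String, List<String>> parsed = new HashMap<>();
  public int parseCount = 0; // exposed only so the demo can observe the caching

  // Stand-in for parsing the schema string stored in conf into column names.
  public List<String> getColumns(String schemaString) {
    return parsed.computeIfAbsent(schemaString, s -> {
      parseCount++; // the expensive parse happens at most once per schema
      return Arrays.asList(s.split(","));
    });
  }

  public static void main(String[] args) {
    SchemaCache cache = new SchemaCache();
    // Multiple record-reader constructions reuse a single parse.
    cache.getColumns("id,name,ts");
    cache.getColumns("id,name,ts");
    System.out.println(cache.parseCount); // 1
  }
}
```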
[jira] [Assigned] (HIVE-4625) HS2 should not attempt to get delegation token from metastore if using embedded metastore
[ https://issues.apache.org/jira/browse/HIVE-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan reassigned HIVE-4625: --- Assignee: Hari Sankar Sivarama Subramaniyan (was: Abdelrahman Shettia) HS2 should not attempt to get delegation token from metastore if using embedded metastore - Key: HIVE-4625 URL: https://issues.apache.org/jira/browse/HIVE-4625 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Hari Sankar Sivarama Subramaniyan In kerberos secure mode, with doas enabled, HiveServer2 tries to get a delegation token from the metastore even if the metastore is being used in embedded mode. To avoid failure in that case, it uses a catch block for the thrown UnsupportedOperationException that does nothing. But this leads to an error being logged by lower levels and can mislead users into thinking that there is a problem. It should check whether delegation token mode is supported with the current configuration before calling the function. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10189) Create a micro benchmark tool for vectorization to evaluate the performance gain after SIMD optimization
[ https://issues.apache.org/jira/browse/HIVE-10189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486916#comment-14486916 ] Hive QA commented on HIVE-10189: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12723880/HIVE-10189.2.patch {color:red}ERROR:{color} -1 due to 48 failed/errored test(s), 8665 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketizedhiveinputformat org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_empty_dir_in_table org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_external_table_with_space_in_location_path org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap_auto org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_bucketed_table org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_merge org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_leftsemijoin_mr 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_quotedid_smb org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1
[jira] [Resolved] (HIVE-8297) Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format
[ https://issues.apache.org/jira/browse/HIVE-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu resolved HIVE-8297. Resolution: Duplicate Fix Version/s: 0.14.0 I traced into this issue; it has been fixed in a later version under HIVE-7399. The issue is that all rows refer to the same timestamp object, so when that object is updated, all the timestamps are affected, since they point to the same instance. Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format Key: HIVE-8297 URL: https://issues.apache.org/jira/browse/HIVE-8297 Project: Hive Issue Type: Bug Components: CLI, JDBC Affects Versions: 0.13.0 Environment: Linux Reporter: Doug Sedlak Assignee: Aihua Xu Fix For: 0.14.0 For the case SELECT * FROM [table], JDBC directly reads the table's backing data, versus cranking up an MR job and creating a result set. Where the table format is RCFile or ORC, incorrect results are delivered by the direct JDBC read for TIMESTAMP columns. If you force a result set, correct data is returned. To reproduce using beeline:
1) Create this file as follows in HDFS:
$ cat > /tmp/ts.txt
2014-09-28 00:00:00
2014-09-29 00:00:00
2014-09-30 00:00:00
ctrl-D
$ hadoop fs -copyFromLocal /tmp/ts.txt /tmp/ts.txt
2) In beeline, load the above HDFS data into a TEXTFILE table and verify it is ok:
$ beeline
!connect jdbc:hive2://host:port/db hive pass org.apache.hive.jdbc.HiveDriver
drop table `TIMESTAMP_TEXT`;
CREATE TABLE `TIMESTAMP_TEXT` (`ts` TIMESTAMP) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE;
LOAD DATA INPATH '/tmp/ts.txt' OVERWRITE INTO TABLE `TIMESTAMP_TEXT`;
select * from `TIMESTAMP_TEXT`;
3) In beeline, create and load an RCFile table from the TEXTFILE table:
drop table `TIMESTAMP_RCFILE`;
CREATE TABLE `TIMESTAMP_RCFILE` (`ts` TIMESTAMP) stored as rcfile;
INSERT INTO TABLE `TIMESTAMP_RCFILE` SELECT * FROM `TIMESTAMP_TEXT`;
4) Demonstrate the incorrect direct JDBC read versus a good read by inducing result set creation:
SELECT * FROM `TIMESTAMP_RCFILE`;
+------------------------+
| timestamp_rcfile.ts    |
+------------------------+
| 2014-09-30 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
+------------------------+
SELECT * FROM `TIMESTAMP_RCFILE` where ts is not NULL;
+------------------------+
| timestamp_rcfile.ts    |
+------------------------+
| 2014-09-28 00:00:00.0  |
| 2014-09-29 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
+------------------------+
Note 1: The incorrect conduct demonstrated above also replicates with a standalone Java/JDBC program. Note 2: It is not known whether this is an issue with any other data types, or which releases are affected, but it occurs in Hive 13. Direct JDBC reads of TEXTFILE and SEQUENCEFILE work fine. As above, for RCFile and ORC wrong results are delivered; no other file types were tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
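The resolution comment attributes the wrong results to every row pointing at one mutable timestamp object. That failure mode can be reproduced with plain java.sql.Timestamp in a few lines; this is a simplified reconstruction of the bug's shape, not the actual RCFile reader code:

```java
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.List;

public class TimestampAliasing {
  // Buggy pattern: one Timestamp instance is reused for every row, so every
  // row in the result list ends up showing the last value read.
  public static List<Timestamp> readBuggy(long[] epochMillis) {
    List<Timestamp> rows = new ArrayList<>();
    Timestamp shared = new Timestamp(0); // single object reused across rows
    for (long t : epochMillis) {
      shared.setTime(t);
      rows.add(shared); // every row points at the same instance
    }
    return rows;
  }

  // Fixed pattern: allocate (or copy) a fresh object per row.
  public static List<Timestamp> readFixed(long[] epochMillis) {
    List<Timestamp> rows = new ArrayList<>();
    for (long t : epochMillis) {
      rows.add(new Timestamp(t));
    }
    return rows;
  }

  public static void main(String[] args) {
    long[] ts = {100L, 200L, 300L};
    // Buggy: all three rows show the last timestamp, like the repeated
    // "2014-09-30" values in the report above.
    System.out.println(readBuggy(ts));
    System.out.println(readFixed(ts));
  }
}
```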
[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)
[ https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488076#comment-14488076 ] Laljo John Pullokkaran commented on HIVE-10190: --- [~Reuben] No need for apologies; I haven't started looking at this. CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE) - Key: HIVE-10190 URL: https://issues.apache.org/jira/browse/HIVE-10190 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Pengcheng Xiong Priority: Trivial Labels: perfomance Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch
{code}
public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
  String astTree = ast.toStringTree();
  // if any of the following tokens are present in the AST, bail out
  String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
  for (String token : tokens) {
    if (astTree.contains(token)) {
      return false;
    }
  }
  return true;
}
{code}
This is an issue for a SQL query which is bigger in AST form than in text (~700kb). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
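A cheaper check than serializing the whole AST and scanning the resulting ~700 KB string is to walk the tree and compare token types directly, bailing out at the first unsupported node. A sketch over a minimal stand-in node type (not Hive's ASTNode):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;

public class AstTokenCheck {
  // Minimal stand-in for ASTNode: an int token type plus children.
  public static class Node {
    final int type;
    final List<Node> children = new ArrayList<>();
    public Node(int type) { this.type = type; }
    public Node add(Node child) { children.add(child); return this; }
  }

  // Returns false if any node carries an unsupported token type;
  // no string is ever materialized, and the walk short-circuits.
  public static boolean validate(Node root, Set<Integer> unsupported) {
    Deque<Node> stack = new ArrayDeque<>();
    stack.push(root);
    while (!stack.isEmpty()) {
      Node n = stack.pop();
      if (unsupported.contains(n.type)) {
        return false; // bail out on the first hit
      }
      for (Node c : n.children) stack.push(c);
    }
    return true;
  }

  public static void main(String[] args) {
    final int TOK_QUERY = 1, TOK_TABLESPLITSAMPLE = 2;
    Node ok = new Node(TOK_QUERY);
    Node bad = new Node(TOK_QUERY).add(new Node(TOK_TABLESPLITSAMPLE));
    System.out.println(validate(ok, Set.of(TOK_TABLESPLITSAMPLE)));  // true
    System.out.println(validate(bad, Set.of(TOK_TABLESPLITSAMPLE))); // false
  }
}
```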
[jira] [Updated] (HIVE-9709) Hive should support replaying cookie from JDBC driver for beeline
[ https://issues.apache.org/jira/browse/HIVE-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9709: --- Affects Version/s: 1.2.0 Hive should support replaying cookie from JDBC driver for beeline - Key: HIVE-9709 URL: https://issues.apache.org/jira/browse/HIVE-9709 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-9709.1.patch, HIVE-9709.2.patch, HIVE-9709.3.patch, HIVE-9709.4.patch, HIVE-9709.5.patch Consider the following scenario: Beeline > Knox > HS2, where Knox is going to LDAP for authentication. To avoid re-authentication, Knox supports using a cookie to identify a request. However, the Beeline JDBC client does not send back the cookie Knox sent, and this leads to Knox having to re-create the LDAP authentication request on every connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9709) Hive should support replaying cookie from JDBC driver for beeline
[ https://issues.apache.org/jira/browse/HIVE-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9709: --- Fix Version/s: 1.2.0 Hive should support replaying cookie from JDBC driver for beeline - Key: HIVE-9709 URL: https://issues.apache.org/jira/browse/HIVE-9709 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-9709.1.patch, HIVE-9709.2.patch, HIVE-9709.3.patch, HIVE-9709.4.patch, HIVE-9709.5.patch Consider the following scenario: Beeline > Knox > HS2, where Knox is going to LDAP for authentication. To avoid re-authentication, Knox supports using a cookie to identify a request. However, the Beeline JDBC client does not send back the cookie Knox sent, and this leads to Knox having to re-create the LDAP authentication request on every connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9709) Hive should support replaying cookie from JDBC driver for beeline
[ https://issues.apache.org/jira/browse/HIVE-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9709: --- Component/s: JDBC HiveServer2 Hive should support replaying cookie from JDBC driver for beeline - Key: HIVE-9709 URL: https://issues.apache.org/jira/browse/HIVE-9709 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-9709.1.patch, HIVE-9709.2.patch, HIVE-9709.3.patch, HIVE-9709.4.patch, HIVE-9709.5.patch Consider the following scenario: Beeline > Knox > HS2, where Knox is going to LDAP for authentication. To avoid re-authentication, Knox supports using a cookie to identify a request. However, the Beeline JDBC client does not send back the cookie Knox sent, and this leads to Knox having to re-create the LDAP authentication request on every connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns
[ https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-9580: --- Attachment: (was: HIVE-9580.patch) Server returns incorrect result from JOIN ON VARCHAR columns Key: HIVE-9580 URL: https://issues.apache.org/jira/browse/HIVE-9580 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mike Assignee: Aihua Xu Attachments: HIVE-9580.patch The database erroneously returns rows when joining two tables which each contain a VARCHAR column and the join's ON condition uses the equality operator on the VARCHAR columns. The following JDBC method exhibits the problem:
static void joinIssue() throws SQLException {
  String sql;
  int rowsAffected;
  ResultSet rs;
  Statement stmt = con.createStatement();
  String table1_Name = "blahtab1";
  String table1A_Name = "blahtab1A";
  String table1B_Name = "blahtab1B";
  String table2_Name = "blahtab2";
  try {
    sql = "drop table " + table1_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1_Name + " (" + " VCHARCOL VARCHAR(10)" + " ,INTEGERCOL INT" + " )";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("create table error: " + se.getMessage());
  }
  sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
  System.out.println("\nsql=" + sql);
  stmt.executeUpdate(sql);
  System.out.println("===");
  try {
    sql = "drop table " + table1A_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1A_Name + " (" + " VCHARCOL VARCHAR(10)" + " )";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("create table error: " + se.getMessage());
  }
  sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
  System.out.println("\nsql=" + sql);
  stmt.executeUpdate(sql);
  System.out.println("===");
  try {
    sql = "drop table " + table1B_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1B_Name + " (" + " VCHARCOL VARCHAR(11)" + " ,INTEGERCOL INT" + " )";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("create table error: " + se.getMessage());
  }
  sql = "insert into " + table1B_Name + " values ('jklmnopqrs', 99)";
  System.out.println("\nsql=" + sql);
  stmt.executeUpdate(sql);
[jira] [Updated] (HIVE-8858) Visualize generated Spark plan [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-8858: --- Attachment: HIVE-8858.3-spark.patch Visualize generated Spark plan [Spark Branch] - Key: HIVE-8858 URL: https://issues.apache.org/jira/browse/HIVE-8858 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, HIVE-8858.2-spark.patch, HIVE-8858.3-spark.patch The spark plan generated by SparkPlanGenerator contains info which isn't available in Hive's explain plan, such as RDD caching. Also, the graph is slightly different from the original SparkWork. Thus, it would be nice to visualize the plan as is done for SparkWork. Preferably, the visualization can happen as part of Hive explain extended. If not feasible, we can at least log this at info level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10209) FetchTask with VC may fail because ExecMapper.done is true
[ https://issues.apache.org/jira/browse/HIVE-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487895#comment-14487895 ] Szehon Ho commented on HIVE-10209: -- If I understand, this is for MR local mode. +1 FetchTask with VC may fail because ExecMapper.done is true -- Key: HIVE-10209 URL: https://issues.apache.org/jira/browse/HIVE-10209 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.1.0 Reporter: Chao Assignee: Chao Attachments: HIVE-10209.1-spark.patch, HIVE-10209.2-spark.patch ExecMapper.done is a static variable and may cause issues, as in the following example:
{code}
set hive.fetch.task.conversion=minimal;
select * from src where key > 10 limit 1;
set hive.fetch.task.conversion=more;
select *, BLOCK__OFFSET_INSIDE__FILE from src where key > 10;
{code}
The second select won't return any results when running in local mode. The issue is that the first select query is converted to a MapRedTask with only a mapper, and when that task is done, ExecMapper.done is set to true because of the limit operator. Then, when the second select query begins to execute, it calls {{FetchOperator::getRecordReader()}}, and since we have a virtual column here, an instance of {{HiveRecordReader}} is returned. The problem is that {{HiveRecordReader::doNext()}} checks ExecMapper.done; since the value is true, it quits immediately. In short, I think making ExecMapper.done static is a bad idea. The first query should in no way affect the second one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
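The failure mode is a general one: a static mutable flag is shared by every mapper instance in the JVM, so state left behind by a finished query leaks into the next. A stripped-down illustration of why instance state fixes it (not the real ExecMapper):

```java
public class StaticFlagLeak {
  public static class StaticMapper {
    static boolean done = false; // shared by ALL instances in the JVM
    public int process(int limit) {
      int rows = 0;
      while (!done && rows < 3) rows++;   // pretend the input has 3 rows
      if (rows >= limit) done = true;      // limit reached: mark done
      return rows;
    }
  }

  public static class InstanceMapper {
    boolean done = false; // per-task state: no cross-query leakage
    public int process(int limit) {
      int rows = 0;
      while (!done && rows < 3) rows++;
      if (rows >= limit) done = true;
      return rows;
    }
  }

  public static void main(String[] args) {
    // Query 1 hits its limit and sets the static flag...
    new StaticMapper().process(1);
    // ...so query 2's fresh mapper quits immediately, returning 0 rows,
    // just like the empty result of the second select above.
    System.out.println(new StaticMapper().process(3));   // 0
    new InstanceMapper().process(1);
    System.out.println(new InstanceMapper().process(3)); // 3
  }
}
```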
[jira] [Commented] (HIVE-10265) Hive CLI crashes on != inequality
[ https://issues.apache.org/jira/browse/HIVE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487869#comment-14487869 ] Jimmy Xiang commented on HIVE-10265: +1 Hive CLI crashes on != inequality - Key: HIVE-10265 URL: https://issues.apache.org/jira/browse/HIVE-10265 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10265.patch It seems != is a supported inequality operator according to: [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators]. However, HiveCLI crashes if we try a query: {noformat} hive> select * from src where key != '10'; [ERROR] Could not expand event java.lang.IllegalArgumentException: != '10';: event not found at jline.console.ConsoleReader.expandEvents(ConsoleReader.java:779) at jline.console.ConsoleReader.finishBuffer(ConsoleReader.java:631) at jline.console.ConsoleReader.accept(ConsoleReader.java:2019) at jline.console.ConsoleReader.readLine(ConsoleReader.java:2666) at jline.console.ConsoleReader.readLine(ConsoleReader.java:2269) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:730) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} Beeline is also based on jline and does not crash. Current Hive is on jline-2.12. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
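The stack trace points at jline's shell-style history expansion: a bang followed by a non-blank character is treated as a history event reference, so `!=` in the query is parsed as event "!=" and rejected. The sketch below is a simplified model of that expansion rule, not jline's actual `ConsoleReader.expandEvents` code:

```java
// Simplified model of shell-style history expansion (NOT jline's code):
// '!' followed by a non-space character is treated as a history event
// reference, which is why "key != '10'" blows up at the console layer
// before Hive's parser ever sees the statement.
public class HistoryExpansionSketch {
    static String expand(String line) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '!' && i + 1 < line.length() && line.charAt(i + 1) != ' ') {
                // Mirrors the "event not found" failure mode from the report.
                throw new IllegalArgumentException(line.substring(i) + ": event not found");
            }
            out.append(c);
        }
        return out.toString();
    }
    public static void main(String[] args) {
        try {
            expand("select * from src where key != '10';");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

One plausible fix direction (assuming jline 2.x is in use, as the report states) is to disable this feature via `ConsoleReader.setExpandEvents(false)` rather than re-implementing expansion.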
[jira] [Updated] (HIVE-3299) Create UDF DAYNAME(date)
[ https://issues.apache.org/jira/browse/HIVE-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-3299: -- Attachment: HIVE-3299.5.patch patch #5 - fixed show_functions.q.out Create UDF DAYNAME(date) - Key: HIVE-3299 URL: https://issues.apache.org/jira/browse/HIVE-3299 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.9.0 Reporter: Namitha Babychan Assignee: Alexander Pivovarov Labels: patch Attachments: HIVE-3299.1.patch.txt, HIVE-3299.2.patch, HIVE-3299.3.patch, HIVE-3299.4.patch, HIVE-3299.5.patch, HIVE-3299.patch.txt, Hive-3299_Testcase.doc, udf_dayname.q, udf_dayname.q.out dayname(date/timestamp/string) Returns the name of the weekday for date. The language used for the name is English. select dayname('2015-04-08'); Wednesday -- This message was sent by Atlassian JIRA (v6.3.4#6332)
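The documented behavior (English weekday name for a date) can be illustrated with a standalone sketch built on `SimpleDateFormat`'s `EEEE` pattern. This is an illustration of the contract, not the GenericUDFDayName implementation itself:

```java
import java.text.SimpleDateFormat;
import java.util.Locale;

public class DayNameSketch {
    // dayname('2015-04-08') -> "Wednesday": parse as a date, then format
    // the weekday name with the English locale, per the description above.
    static String dayName(String date) throws Exception {
        java.util.Date d = new SimpleDateFormat("yyyy-MM-dd").parse(date);
        return new SimpleDateFormat("EEEE", Locale.ENGLISH).format(d);
    }
    public static void main(String[] args) throws Exception {
        System.out.println(dayName("2015-04-08")); // Wednesday, matching the example
    }
}
```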
[jira] [Updated] (HIVE-10273) Union with partition tables which have no data fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-10273: -- Attachment: HIVE-10273.1.patch Union with partition tables which have no data fails with NPE - Key: HIVE-10273 URL: https://issues.apache.org/jira/browse/HIVE-10273 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10273.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3194) implement the JDBC canonical/ISO-SQL 2011 scalar functions
[ https://issues.apache.org/jira/browse/HIVE-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487987#comment-14487987 ] Hive QA commented on HIVE-3194: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12600271/Hive-3194.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3348/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3348/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3348/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-3348/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d 
apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'ql/src/test/results/clientpositive/show_functions.q.out' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/scheduler/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/thirdparty itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-jmh/target itests/hive-unit/target itests/custom-serde/target itests/util/target itests/qtest-spark/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target accumulo-handler/target hwi/target common/target common/src/gen spark-client/target contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target ql/src/test/results/clientpositive/udf_dayname.q.out ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFDayName.java ql/src/test/queries/clientpositive/udf_dayname.q ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDayName.java + svn update Upom.xml Ujdbc/src/java/org/apache/hive/jdbc/HiveConnection.java Ujdbc/src/java/org/apache/hive/jdbc/HttpBasicAuthInterceptor.java Ujdbc/src/java/org/apache/hive/jdbc/Utils.java Ujdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java Ubin/beeline U itests/hive-unit/src/test/java/org/apache/hive/service/cli/thrift/TestThriftHttpCLIService.java 
Uitests/hive-unit/src/test/java/org/apache/hive/jdbc/TestSSL.java Fetching external item into 'hcatalog/src/test/e2e/harness' Updated external to revision 1672451. Updated to revision 1672451. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12600271 - PreCommit-HIVE-TRUNK-Build implement the JDBC canonical/ISO-SQL 2011 scalar functions -- Key: HIVE-3194 URL: https://issues.apache.org/jira/browse/HIVE-3194
[jira] [Commented] (HIVE-10269) HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor
[ https://issues.apache.org/jira/browse/HIVE-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487784#comment-14487784 ] Vaibhav Gumashta commented on HIVE-10269: - +1 HiveMetaStore.java:[6089,29] cannot find symbol class JvmPauseMonitor - Key: HIVE-10269 URL: https://issues.apache.org/jira/browse/HIVE-10269 Project: Hive Issue Type: Bug Components: Metastore Reporter: Gabor Liptak Assignee: Ferdinand Xu Attachments: HIVE-10269.patch Compiling trunk fails when building based on instructions in https://cwiki.apache.org/confluence/display/Hive/HowToContribute $ git status On branch trunk Your branch is up-to-date with 'origin/trunk'. nothing to commit, working directory clean $ mvn clean install -DskipTests -Phadoop-1 ...[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-metastore: Compilation failure: Compilation failure: [ERROR] /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6089,29] cannot find symbol [ERROR] symbol: class JvmPauseMonitor [ERROR] location: package org.apache.hadoop.util [ERROR] /tmp/hive/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[6090,35] cannot find symbol [ERROR] symbol: class JvmPauseMonitor [ERROR] location: package org.apache.hadoop.util [ERROR] - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn goals -rf :hive-metastore -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-10239: - Attachment: (was: HIVE-10239-DONOTCOMMIT.patch) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL Key: HIVE-10239 URL: https://issues.apache.org/jira/browse/HIVE-10239 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Attachments: HIVE-10239.DONOTCOMMIT.patch, HIVE-10239.patch Need to create DB-implementation specific scripts to use the framework introduced in HIVE-9800 to have any metastore schema changes tested across all supported databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10148) update of bucking column should not be allowed
[ https://issues.apache.org/jira/browse/HIVE-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10148: -- Attachment: HIVE-10148.5.patch update of bucking column should not be allowed -- Key: HIVE-10148 URL: https://issues.apache.org/jira/browse/HIVE-10148 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 1.1.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-10148.2.patch, HIVE-10148.3.patch, HIVE-10148.4.patch, HIVE-10148.5.patch, HIVE-10148.patch update tbl set a = 5; should raise an error if 'a' is a bucketing column. Such operation is not supported but currently not checked for. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10265) Hive CLI crashes on != inequality
[ https://issues.apache.org/jira/browse/HIVE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487852#comment-14487852 ] Szehon Ho commented on HIVE-10265: -- [~csun], [~jxiang] can one of you guys take a look? Hive CLI crashes on != inequality - Key: HIVE-10265 URL: https://issues.apache.org/jira/browse/HIVE-10265 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10265.patch It seems != is a supported inequality operator according to: [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators]. However, HiveCLI crashes if we try a query: {noformat} hive select * from src where key != '10'; [ERROR] Could not expand event java.lang.IllegalArgumentException: != '10';: event not found at jline.console.ConsoleReader.expandEvents(ConsoleReader.java:779) at jline.console.ConsoleReader.finishBuffer(ConsoleReader.java:631) at jline.console.ConsoleReader.accept(ConsoleReader.java:2019) at jline.console.ConsoleReader.readLine(ConsoleReader.java:2666) at jline.console.ConsoleReader.readLine(ConsoleReader.java:2269) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:730) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} Beeline is also based on jline and does not crash. Current Hive is on jline-2.12. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10276) Implement date_format(timestamp, fmt) UDF
[ https://issues.apache.org/jira/browse/HIVE-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-10276: --- Attachment: HIVE-10276.01.patch patch #01 Implement date_format(timestamp, fmt) UDF - Key: HIVE-10276 URL: https://issues.apache.org/jira/browse/HIVE-10276 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Attachments: HIVE-10276.01.patch date_format(date/timestamp/string, fmt) converts a date/timestamp/string to a value of String in the format specified by the java date format fmt. Supported formats listed here: https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
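Since the description says the UDF applies a Java `SimpleDateFormat` pattern to a date/timestamp/string, the intended behavior can be sketched standalone. This is a hedged illustration of the contract, not the HIVE-10276 patch; the input-parsing pattern (`yyyy-MM-dd`) is an assumption for the example:

```java
import java.text.SimpleDateFormat;
import java.util.Locale;

public class DateFormatSketch {
    // Illustrative equivalent of date_format(date, fmt): parse the input,
    // then re-format it with the caller-supplied SimpleDateFormat pattern.
    static String dateFormat(String date, String fmt) throws Exception {
        java.util.Date d = new SimpleDateFormat("yyyy-MM-dd").parse(date);
        return new SimpleDateFormat(fmt, Locale.ENGLISH).format(d);
    }
    public static void main(String[] args) throws Exception {
        System.out.println(dateFormat("2015-04-08", "MMM dd, yyyy")); // Apr 08, 2015
    }
}
```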
[jira] [Commented] (HIVE-10252) Make PPD work for Parquet in row group level
[ https://issues.apache.org/jira/browse/HIVE-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487973#comment-14487973 ] Szehon Ho commented on HIVE-10252: -- New strategy makes sense to me. I have a question, though: we already call ParquetInputFormat.setFilterPredicate() earlier. Is that the strategy that you mention doesn't work, and maybe we can get rid of it? Or is it doing something else? Sorry as I am not too familiar with that code, thanks. Make PPD work for Parquet in row group level Key: HIVE-10252 URL: https://issues.apache.org/jira/browse/HIVE-10252 Project: Hive Issue Type: Sub-task Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-10252.patch In Hive, predicate pushdown figures out the search condition in HQL, serializes it, and pushes it to the file format. ORC can use the predicate to filter stripes. Similarly, Parquet should use the statistics saved in each row group to filter out non-matching row groups. But it does not work. In {{ParquetRecordReaderWrapper}}, it gets splits with all row groups (client side), and pushes the filter to Parquet for further processing (Parquet side). But in {{ParquetRecordReader.initializeInternalReader()}}, if the splits have already been selected by the client side, it will not apply the filter again. We should make the behavior consistent in Hive. Maybe we could get the splits, filter them, and then pass them to Parquet. This means using the client-side strategy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10251) HIVE-9664 makes hive depend on ivysettings.xml
[ https://issues.apache.org/jira/browse/HIVE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487765#comment-14487765 ] Sushanth Sowmyan commented on HIVE-10251: - [~thejas] : Yup, that's exactly what I was suggesting - since we're the ones doing the dependency resolution, we can look for ivysettings.xml in conf dir and then classpath, and if not found, fall back to hive-ivysettings-default.xml, which we can package in our jars. HIVE-9664 makes hive depend on ivysettings.xml -- Key: HIVE-10251 URL: https://issues.apache.org/jira/browse/HIVE-10251 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Anant Nag Labels: patch Attachments: HIVE-10251.1.patch, HIVE-10251.2.patch, HIVE-10251.simple.patch HIVE-9664 makes hive depend on the existence of ivysettings.xml, and if it is not present, it makes hive NPE when instantiating a CLISessionState. {noformat} java.lang.NullPointerException at org.apache.hadoop.hive.ql.session.DependencyResolver.init(DependencyResolver.java:61) at org.apache.hadoop.hive.ql.session.SessionState.init(SessionState.java:343) at org.apache.hadoop.hive.ql.session.SessionState.init(SessionState.java:334) at org.apache.hadoop.hive.cli.CliSessionState.init(CliSessionState.java:60) {noformat} This happens because of the following bit: {noformat} // If HIVE_HOME is not defined or file is not found in HIVE_HOME/conf then load default ivysettings.xml from class loader if (ivysettingsPath == null || !(new File(ivysettingsPath).exists())) { ivysettingsPath = ClassLoader.getSystemResource("ivysettings.xml").getFile(); _console.printInfo("ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR, " + ivysettingsPath + " will be used"); } {noformat} This makes it so that an attempt to instantiate CliSessionState without an ivysettings.xml file will cause hive to fail with an NPE. 
Hive should not have a hard dependency on an ivysettings.xml being present, and this feature should gracefully fail in that case instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
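The NPE comes from calling `.getFile()` on the result of `ClassLoader.getSystemResource(...)`, which returns null when the resource is absent. A minimal null-safe lookup along the lines discussed in the comment (conf dir, then classpath, then graceful fallback) could look like this; the method name and fallback behavior are illustrative, not the committed patch:

```java
import java.io.File;

public class IvySettingsLookup {
    // Sketch of the graceful-fallback idea: try the configured path, then
    // the classpath, and return null (disabling the feature) instead of
    // NPE-ing when neither exists.
    static String findIvySettings(String configuredPath) {
        if (configuredPath != null && new File(configuredPath).exists()) {
            return configuredPath;
        }
        java.net.URL onClasspath = ClassLoader.getSystemResource("ivysettings.xml");
        // Guard the null before calling .getFile() -- the missing check
        // is exactly what made CliSessionState instantiation blow up.
        return onClasspath == null ? null : onClasspath.getFile();
    }
    public static void main(String[] args) {
        // With no conf file and no classpath resource, we get null, not an NPE.
        System.out.println(findIvySettings("/nonexistent/ivysettings.xml"));
    }
}
```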
[jira] [Commented] (HIVE-10209) FetchTask with VC may fail because ExecMapper.done is true
[ https://issues.apache.org/jira/browse/HIVE-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488018#comment-14488018 ] Jimmy Xiang commented on HIVE-10209: +1 FetchTask with VC may fail because ExecMapper.done is true -- Key: HIVE-10209 URL: https://issues.apache.org/jira/browse/HIVE-10209 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.1.0 Reporter: Chao Assignee: Chao Attachments: HIVE-10209.1-spark.patch, HIVE-10209.2-spark.patch ExecMapper.done is a static variable, and may cause issues in the following example: {code} set hive.fetch.task.conversion=minimal; select * from src where key < 10 limit 1; set hive.fetch.task.conversion=more; select *, BLOCK__OFFSET_INSIDE__FILE from src where key < 10; {code} The second select won't return any results if running in local mode. The issue is that the first select query will be converted to a MapRedTask with only a mapper, and when the task is done, because of the limit operator, ExecMapper.done will be set to true. Then, when the second select query begins to execute, it will call {{FetchOperator::getRecordReader()}}, and since here we have a virtual column, an instance of {{HiveRecordReader}} will be returned. The problem is that {{HiveRecordReader::doNext()}} will check ExecMapper.done. In this case, since the value is true, it will quit immediately. In short, I think making ExecMapper.done static is a bad idea. The first query should in no way affect the second one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10191) ORC: Cleanup writer per-row synchronization
[ https://issues.apache.org/jira/browse/HIVE-10191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487028#comment-14487028 ] Hive QA commented on HIVE-10191: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12723873/HIVE-10191.2.patch {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8665 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3341/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3341/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3341/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12723873 - PreCommit-HIVE-TRUNK-Build ORC: Cleanup writer per-row synchronization --- Key: HIVE-10191 URL: https://issues.apache.org/jira/browse/HIVE-10191 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-10191.1.patch, HIVE-10191.2.patch ORC writers were originally meant to be thread-safe, but in the present day implementation each ORC writer is entirely share-nothing which converts most of the synchronized blocks in ORC as entirely uncontested locks. These uncontested locks prevent the JVM from inlining/optimizing these methods, while adding no extra thread-safety to the ORC writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10252) Make PPD work for Parquet in row group level
[ https://issues.apache.org/jira/browse/HIVE-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486974#comment-14486974 ] Dong Chen commented on HIVE-10252: -- Hi, [~spena], [~szehon], could you please help to review this? Thanks! Make PPD work for Parquet in row group level Key: HIVE-10252 URL: https://issues.apache.org/jira/browse/HIVE-10252 Project: Hive Issue Type: Sub-task Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-10252.patch In Hive, predicate pushdown figures out the search condition in HQL, serialize it, and push to file format. ORC could use the predicate to filter stripes. Similarly, Parquet should use the statics saved in row group to filter not match row group. But it does not work. In {{ParquetRecordReaderWrapper}}, it get splits with all row groups (client side), and push the filter to Parquet for further processing (parquet side). But in {{ParquetRecordReader.initializeInternalReader()}}, if the splits have already been selected by client side, it will not handle filter again. We should make the behavior consistent in Hive. Maybe we could get splits, filter them, and then pass to parquet. This means using client side strategy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10251) HIVE-9664 makes hive depend on ivysettings.xml
[ https://issues.apache.org/jira/browse/HIVE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anant Nag updated HIVE-10251: - Attachment: HIVE-10251.2.patch Latest patch after addressing comments in the rb. HIVE-9664 makes hive depend on ivysettings.xml -- Key: HIVE-10251 URL: https://issues.apache.org/jira/browse/HIVE-10251 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Anant Nag Labels: patch Attachments: HIVE-10251.1.patch, HIVE-10251.2.patch, HIVE-10251.simple.patch HIVE-9664 makes hive depend on the existence of ivysettings.xml, and if it is not present, it makes hive NPE when instantiating a CLISessionState. {noformat} java.lang.NullPointerException at org.apache.hadoop.hive.ql.session.DependencyResolver.init(DependencyResolver.java:61) at org.apache.hadoop.hive.ql.session.SessionState.init(SessionState.java:343) at org.apache.hadoop.hive.ql.session.SessionState.init(SessionState.java:334) at org.apache.hadoop.hive.cli.CliSessionState.init(CliSessionState.java:60) {noformat} This happens because of the following bit: {noformat} // If HIVE_HOME is not defined or file is not found in HIVE_HOME/conf then load default ivysettings.xml from class loader if (ivysettingsPath == null || !(new File(ivysettingsPath).exists())) { ivysettingsPath = ClassLoader.getSystemResource(ivysettings.xml).getFile(); _console.printInfo(ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR, + ivysettingsPath + will be used); } {noformat} This makes it so that an attempt to instantiate CliSessionState without an ivysettings.xml file will cause hive to fail with an NPE. Hive should not have a hard dependency on a ivysettings,xml being present, and this feature should gracefully fail in that case instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10252) Make PPD work for Parquet in row group level
[ https://issues.apache.org/jira/browse/HIVE-10252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486792#comment-14486792 ] Hive QA commented on HIVE-10252: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12723834/HIVE-10252.patch {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 8666 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore org.apache.hive.spark.client.TestSparkClient.testJobSubmission {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3338/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3338/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3338/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12723834 - PreCommit-HIVE-TRUNK-Build Make PPD work for Parquet in row group level Key: HIVE-10252 URL: https://issues.apache.org/jira/browse/HIVE-10252 Project: Hive Issue Type: Sub-task Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-10252.patch In Hive, predicate pushdown figures out the search condition in HQL, serialize it, and push to file format. ORC could use the predicate to filter stripes. Similarly, Parquet should use the statics saved in row group to filter not match row group. But it does not work. In {{ParquetRecordReaderWrapper}}, it get splits with all row groups (client side), and push the filter to Parquet for further processing (parquet side). But in {{ParquetRecordReader.initializeInternalReader()}}, if the splits have already been selected by client side, it will not handle filter again. We should make the behavior consistent in Hive. 
Maybe we could get splits, filter them, and then pass to parquet. This means using client side strategy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
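The client-side strategy described above boils down to comparing a predicate against each row group's min/max statistics and skipping groups that cannot match. The sketch below illustrates that pruning decision for a `col < bound` predicate in isolation; the `Stats` class and method names are illustrative, not Parquet's or Hive's API:

```java
public class RowGroupPruneSketch {
    // Illustrative per-row-group statistics, standing in for the metadata
    // Parquet stores in each row group's column chunk.
    static class Stats {
        final long min, max;
        Stats(long min, long max) { this.min = min; this.max = max; }
    }

    // A row group whose [min, max] range cannot satisfy "col < bound"
    // can be skipped without reading any of its rows.
    static boolean mightMatchLessThan(Stats s, long bound) {
        return s.min < bound;
    }

    public static void main(String[] args) {
        Stats g1 = new Stats(0, 9);     // may contain rows with col < 10
        Stats g2 = new Stats(100, 200); // cannot: its minimum is already >= 10
        System.out.println(mightMatchLessThan(g1, 10)); // true  -> read group
        System.out.println(mightMatchLessThan(g2, 10)); // false -> skip group
    }
}
```

Note the test is conservative by design: statistics can only prove a group *cannot* match; a `true` result still requires reading and filtering the rows.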
[jira] [Commented] (HIVE-8297) Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format
[ https://issues.apache.org/jira/browse/HIVE-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487455#comment-14487455 ] Aihua Xu commented on HIVE-8297: I saw the issue on 0.13. Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format Key: HIVE-8297 URL: https://issues.apache.org/jira/browse/HIVE-8297 Project: Hive Issue Type: Bug Components: CLI, JDBC Affects Versions: 0.13.0 Environment: Linux Reporter: Doug Sedlak Assignee: Aihua Xu For the case SELECT * FROM [table], JDBC reads the table's backing data directly, rather than starting a MapReduce job and creating a result set. Where the table format is RCFile or ORC, the JDBC direct read delivers incorrect results for TIMESTAMP columns. If you force a result set, correct data is returned. To reproduce using beeline:
1) Create this file in HDFS:
$ cat /tmp/ts.txt
2014-09-28 00:00:00
2014-09-29 00:00:00
2014-09-30 00:00:00
ctrl-D
$ hadoop fs -copyFromLocal /tmp/ts.txt /tmp/ts.txt
2) In beeline, load the above HDFS data into a TEXTFILE table and verify it is correct:
$ beeline
!connect jdbc:hive2://host:port/db hive pass org.apache.hive.jdbc.HiveDriver
drop table `TIMESTAMP_TEXT`;
CREATE TABLE `TIMESTAMP_TEXT` (`ts` TIMESTAMP) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE;
LOAD DATA INPATH '/tmp/ts.txt' OVERWRITE INTO TABLE `TIMESTAMP_TEXT`;
select * from `TIMESTAMP_TEXT`;
3) In beeline, create and load an RCFile table from the TEXTFILE table:
drop table `TIMESTAMP_RCFILE`;
CREATE TABLE `TIMESTAMP_RCFILE` (`ts` TIMESTAMP) stored as rcfile;
INSERT INTO TABLE `TIMESTAMP_RCFILE` SELECT * FROM `TIMESTAMP_TEXT`;
4) Demonstrate the incorrect direct JDBC read versus a good read by inducing result set creation:
SELECT * FROM `TIMESTAMP_RCFILE`;
+------------------------+
| timestamp_rcfile.ts    |
+------------------------+
| 2014-09-30 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
+------------------------+
SELECT * FROM `TIMESTAMP_RCFILE` where ts is not NULL;
+------------------------+
| timestamp_rcfile.ts    |
+------------------------+
| 2014-09-28 00:00:00.0  |
| 2014-09-29 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
+------------------------+
Note 1: The incorrect behavior demonstrated above also replicates with a standalone Java/JDBC program. Note 2: It is not known whether other data types are affected, nor which releases, but this occurs in Hive 13. Direct JDBC reads of TEXTFILE and SEQUENCEFILE work fine; as above, RCFile and ORC deliver wrong results. No other file types were tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
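The symptom reported above (every row showing the last row's timestamp on the direct read, but correct values once a result set is forced) is characteristic of a reader reusing one mutable record object across rows without copying the value out. The sketch below is a hypothetical illustration of that pitfall, not Hive's actual RCFile/ORC reader code; `TimestampHolder` and both method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a mutable holder reused across rows, as columnar
// readers often do for efficiency. Storing the holder itself (instead of
// a copy of its value) makes every "row" show the last value read.
public class ReusePitfall {
    static class TimestampHolder {
        String value;                         // mutable payload, reused per row
    }

    // Buggy consumer: keeps references to the single reused holder.
    public static List<String> readWithoutCopy(String[] rows) {
        TimestampHolder holder = new TimestampHolder();
        List<TimestampHolder> kept = new ArrayList<>();
        for (String row : rows) {
            holder.value = row;               // reader overwrites in place
            kept.add(holder);                 // bug: same object every time
        }
        List<String> out = new ArrayList<>();
        for (TimestampHolder h : kept) out.add(h.value);
        return out;
    }

    // Fixed consumer: copies the value out of the holder on each row.
    public static List<String> readWithCopy(String[] rows) {
        TimestampHolder holder = new TimestampHolder();
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            holder.value = row;
            out.add(holder.value);            // String reference taken now, immutable
        }
        return out;
    }
}
```

With the three timestamps from the reproduction, the buggy path returns the last value three times, matching the report's first SELECT output.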
[jira] [Updated] (HIVE-10285) Incorrect endFunction call in HiveMetaStore
[ https://issues.apache.org/jira/browse/HIVE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-10285: --- Component/s: Metastore Incorrect endFunction call in HiveMetaStore --- Key: HIVE-10285 URL: https://issues.apache.org/jira/browse/HIVE-10285 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0 Reporter: Nezih Yigitbasi Priority: Minor Attachments: HIVE-10285.patch The HiveMetaStore.get_function() method ends with an incorrect call to the endFunction() method. Instead of: {code} endFunction("get_database", func != null, ex); {code} it should call: {code} endFunction("get_function", func != null, ex); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
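The practical effect of ending under the wrong name is that per-function bookkeeping (counters, timers, audit entries) gets credited to the wrong operation. A minimal hypothetical sketch of such start/end accounting — not HiveMetaStore's real implementation — showing how the mismatch skews the counts:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical start/end bookkeeping: ending under the wrong name (as
// get_function did with "get_database") credits the call to the wrong
// counter, so metrics for get_function never move.
public class FunctionMetrics {
    private final Map<String, Integer> completed = new HashMap<>();

    public void startFunction(String name) {
        // in a real metastore this would begin timing, audit logging, etc.
    }

    public void endFunction(String name, boolean success) {
        completed.merge(name, 1, Integer::sum);   // count completions per name
    }

    public int completedCount(String name) {
        return completed.getOrDefault(name, 0);
    }
}
```

Calling `startFunction("get_function")` but `endFunction("get_database", ...)` leaves the `get_function` count at zero while `get_database` is over-counted.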
[jira] [Commented] (HIVE-10263) CBO (Calcite Return Path): Aggregate checking input for bucketing should be conditional
[ https://issues.apache.org/jira/browse/HIVE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488259#comment-14488259 ] Laljo John Pullokkaran commented on HIVE-10263: --- Checked in to CBO branch. CBO (Calcite Return Path): Aggregate checking input for bucketing should be conditional --- Key: HIVE-10263 URL: https://issues.apache.org/jira/browse/HIVE-10263 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Laljo John Pullokkaran Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-10263.01.cbo.patch, HIVE-10263.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10284) enable container reuse for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-10284: -- Attachment: HIVE-10284.1.patch enable container reuse for grace hash join --- Key: HIVE-10284 URL: https://issues.apache.org/jira/browse/HIVE-10284 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-10284.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487917#comment-14487917 ] Jason Dere commented on HIVE-9917: -- Ah ok. Actually, I would prefer to have the output back to the way it was pre-HIVE-3454, to confirm that between HIVE-3454 and this Jira, we haven't broken anything in the existing tests. And then have the new tests toggle hive.int.timestamp.conversion.in.second to true/false so we can see the timestamp conversion behavior change. After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have the correct behavior when converting int to timestamp. Since customers have relied on the incorrect behavior for so long, it is better to make it configurable so that one release defaults to the old/inconsistent behavior and the next release defaults to the new/consistent behavior. After that we will deprecate the old behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
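The toggle under discussion can be pictured as a single branch in the conversion path. This is a hedged sketch only — it assumes, as the HIVE-3454 discussion implies, that the old behavior read the integer as epoch milliseconds and the new behavior reads it as epoch seconds; `IntToTimestamp` and `convert` are invented names, not Hive's actual code path:

```java
import java.sql.Timestamp;

// Sketch of a configurable int-to-timestamp conversion: the flag plays
// the role of hive.int.timestamp.conversion.in.second. Assumption: old
// behavior = epoch milliseconds, new behavior = epoch seconds.
public class IntToTimestamp {
    public static Timestamp convert(long value, boolean inSeconds) {
        long millis = inSeconds ? value * 1000L : value;
        return new Timestamp(millis);
    }
}
```

With the flag off, `convert(1, false)` is 1 ms after the epoch; with it on, the same input means 1 second after the epoch — exactly the kind of output change the q-file tests would need to toggle.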
[jira] [Updated] (HIVE-8858) Visualize generated Spark plan [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-8858: --- Attachment: HIVE-8858.4-spark.patch Visualize generated Spark plan [Spark Branch] - Key: HIVE-8858 URL: https://issues.apache.org/jira/browse/HIVE-8858 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, HIVE-8858.2-spark.patch, HIVE-8858.3-spark.patch, HIVE-8858.4-spark.patch The spark plan generated by SparkPlanGenerator contains info which isn't available in Hive's explain plan, such as RDD caching. Also, the graph is slightly different from the original SparkWork. Thus, it would be nice to visualize the plan as is done for SparkWork. Preferably, the visualization can happen as part of Hive explain extended. If not feasible, we can at least log this at the info level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10226) Column stats for Date columns not supported
[ https://issues.apache.org/jira/browse/HIVE-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487503#comment-14487503 ] Swarnim Kulkarni commented on HIVE-10226: - +1. Looks great. Thanks for making the changes. Column stats for Date columns not supported --- Key: HIVE-10226 URL: https://issues.apache.org/jira/browse/HIVE-10226 Project: Hive Issue Type: Bug Components: Statistics Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-10226.1.patch, HIVE-10226.2.patch, HIVE-10226.3.patch, HIVE-10226.4.patch {noformat} hive> explain analyze table revenues compute statistics for columns; 2015-03-30 23:47:45,133 ERROR [main()]: ql.Driver (SessionState.java:printError(951)) - FAILED: UDFArgumentTypeException Only integer/long/timestamp/float/double/string/binary/boolean/decimal type argument is accepted but date is passed. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)
[ https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488144#comment-14488144 ] Reuben commented on HIVE-10190: --- Okay, I think it's up. I really appreciate the help! CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE) - Key: HIVE-10190 URL: https://issues.apache.org/jira/browse/HIVE-10190 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Pengcheng Xiong Priority: Trivial Labels: perfomance Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch, HIVE-10190.02.patch {code}
public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
  String astTree = ast.toStringTree();
  // if any of following tokens are present in AST, bail out
  String[] tokens = { "TOK_CHARSETLITERAL", "TOK_TABLESPLITSAMPLE" };
  for (String token : tokens) {
    if (astTree.contains(token)) {
      return false;
    }
  }
  return true;
}
{code} This is an issue for a SQL query whose AST form (~700 KB) is bigger than its text form. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
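The cost in the snippet above is materializing the whole stringified AST (hundreds of kilobytes for the query in question) just to do a substring search. A tree walk that compares tokens directly avoids the intermediate string and exits on the first hit. The sketch below uses an invented `Node` type standing in for `ASTNode`; it illustrates the approach, not the actual Hive patch:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical AST validation by walking the tree instead of searching
// the toStringTree() output: no giant intermediate string, early exit.
public class AstCheck {
    public static class Node {
        final String token;
        final List<Node> children;
        public Node(String token, Node... children) {
            this.token = token;
            this.children = Arrays.asList(children);
        }
    }

    static final Set<String> UNSUPPORTED =
        new HashSet<>(Arrays.asList("TOK_CHARSETLITERAL", "TOK_TABLESPLITSAMPLE"));

    // Iterative DFS; returns false as soon as an unsupported token is seen.
    public static boolean validate(Node root) {
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (UNSUPPORTED.contains(n.token)) return false;
            for (Node c : n.children) stack.push(c);
        }
        return true;
    }
}
```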
[jira] [Commented] (HIVE-10222) Upgrade Calcite dependency to newest version
[ https://issues.apache.org/jira/browse/HIVE-10222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487641#comment-14487641 ] Ashutosh Chauhan commented on HIVE-10222: - Seems like test {{index_auto_mult_tables.q}} failed because it generated a different plan, which led to wrong results. Upgrade Calcite dependency to newest version Key: HIVE-10222 URL: https://issues.apache.org/jira/browse/HIVE-10222 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10222.01.patch, HIVE-10222.patch Upgrade Calcite version to 1.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10285) Incorrect endFunction call in HiveMetaStore
[ https://issues.apache.org/jira/browse/HIVE-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-10285: --- Attachment: HIVE-10285.patch Incorrect endFunction call in HiveMetaStore --- Key: HIVE-10285 URL: https://issues.apache.org/jira/browse/HIVE-10285 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Nezih Yigitbasi Priority: Minor Attachments: HIVE-10285.patch The HiveMetaStore.get_function() method ends with an incorrect call to the endFunction() method. Instead of: {code} endFunction("get_database", func != null, ex); {code} it should call: {code} endFunction("get_function", func != null, ex); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.
[ https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9710: --- Affects Version/s: 1.2.0 HiveServer2 should support cookie based authentication, when using HTTP transport. -- Key: HIVE-9710 URL: https://issues.apache.org/jira/browse/HIVE-9710 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch HiveServer2 should generate cookies and validate the client cookie send to it so that it need not perform User/Password or a Kerberos based authentication on each HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.
[ https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9710: --- Fix Version/s: 1.2.0 HiveServer2 should support cookie based authentication, when using HTTP transport. -- Key: HIVE-9710 URL: https://issues.apache.org/jira/browse/HIVE-9710 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch HiveServer2 should generate cookies and validate the client cookie send to it so that it need not perform User/Password or a Kerberos based authentication on each HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.
[ https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9710: --- Component/s: HiveServer2 HiveServer2 should support cookie based authentication, when using HTTP transport. -- Key: HIVE-9710 URL: https://issues.apache.org/jira/browse/HIVE-9710 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch HiveServer2 should generate cookies and validate the client cookie send to it so that it need not perform User/Password or a Kerberos based authentication on each HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.
[ https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9710: --- Labels: TODOC1.2 (was: ) HiveServer2 should support cookie based authentication, when using HTTP transport. -- Key: HIVE-9710 URL: https://issues.apache.org/jira/browse/HIVE-9710 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch HiveServer2 should generate cookies and validate the client cookie send to it so that it need not perform User/Password or a Kerberos based authentication on each HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.
[ https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-9710: --- Issue Type: Improvement (was: Bug) HiveServer2 should support cookie based authentication, when using HTTP transport. -- Key: HIVE-9710 URL: https://issues.apache.org/jira/browse/HIVE-9710 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch HiveServer2 should generate cookies and validate the client cookie send to it so that it need not perform User/Password or a Kerberos based authentication on each HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10277) Unable to process Comment line '--' in HIVE-1.1.0
[ https://issues.apache.org/jira/browse/HIVE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaveen Raajan updated HIVE-10277: - Description: I tried to use comment line (*--*) in HIVE-1.1.0 grunt shell like:
hive> --this is comment line
hive> show tables;
I got error like {quote} NoViableAltException(-1@[]) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1020) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) FAILED: ParseException line 2:0 cannot recognize input near '<EOF>' '<EOF>' '<EOF>' {quote} was: I tried to use comment line (*--*) in HIVE-1.1.0 command shell like:
hive> --this is comment line
hive> show tables;
I got error like {quote} NoViableAltException(-1@[]) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1020) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) FAILED: ParseException line 2:0 cannot recognize input near '<EOF>' '<EOF>' '<EOF>' {quote} Unable to process Comment line '--' in HIVE-1.1.0 - Key: HIVE-10277 URL: https://issues.apache.org/jira/browse/HIVE-10277 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.0.0 Reporter: Kaveen Raajan Priority: Minor Labels: hive I tried to use comment line (*--*) in HIVE-1.1.0 grunt shell like:
hive> --this is comment line
hive> show tables;
I got error like {quote} NoViableAltException(-1@[]) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1020) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) at
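The parse error above suggests the comment text reaches the parser instead of being stripped first. A hedged sketch of the kind of preprocessing a CLI needs — a hypothetical helper, not CliDriver's actual code — that drops `--` comments outside of quoted strings before handing the line to the parser:

```java
// Hypothetical comment stripper: removes everything from an unquoted
// "--" to the end of the line, leaving quoted "--" sequences intact.
public class CommentStrip {
    public static String strip(String line) {
        boolean inSingle = false, inDouble = false;
        for (int i = 0; i < line.length() - 1; i++) {
            char c = line.charAt(i);
            if (c == '\'' && !inDouble) inSingle = !inSingle;
            else if (c == '"' && !inSingle) inDouble = !inDouble;
            else if (c == '-' && line.charAt(i + 1) == '-' && !inSingle && !inDouble) {
                return line.substring(0, i);   // drop the comment tail
            }
        }
        return line;
    }
}
```

Applied to the reproduction, the comment-only line becomes empty and is skipped, so `show tables;` parses cleanly.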
[jira] [Commented] (HIVE-10251) HIVE-9664 makes hive depend on ivysettings.xml
[ https://issues.apache.org/jira/browse/HIVE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487939#comment-14487939 ] Thejas M Nair commented on HIVE-10251: -- bq. Yup, that's exactly what I was suggesting Sorry, I missed that last line of your earlier comment! HIVE-9664 makes hive depend on ivysettings.xml -- Key: HIVE-10251 URL: https://issues.apache.org/jira/browse/HIVE-10251 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Anant Nag Labels: patch Attachments: HIVE-10251.1.patch, HIVE-10251.2.patch, HIVE-10251.simple.patch HIVE-9664 makes hive depend on the existence of ivysettings.xml, and if it is not present, it makes hive NPE when instantiating a CliSessionState. {noformat}
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.session.DependencyResolver.<init>(DependencyResolver.java:61)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:343)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:334)
at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
{noformat} This happens because of the following bit: {noformat}
// If HIVE_HOME is not defined or file is not found in HIVE_HOME/conf then load default ivysettings.xml from class loader
if (ivysettingsPath == null || !(new File(ivysettingsPath).exists())) {
  ivysettingsPath = ClassLoader.getSystemResource("ivysettings.xml").getFile();
  _console.printInfo("ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR, " + ivysettingsPath + " will be used");
}
{noformat} This makes it so that an attempt to instantiate CliSessionState without an ivysettings.xml file will cause hive to fail with an NPE. Hive should not have a hard dependency on an ivysettings.xml being present, and this feature should fail gracefully in that case instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
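The NPE comes from calling `.getFile()` on the `null` that `ClassLoader.getSystemResource` returns when the resource is absent. A minimal sketch of the graceful-degradation idea — a hypothetical helper shape, not the committed patch — that treats a missing settings file as "feature unavailable" instead of crashing:

```java
import java.io.File;
import java.net.URL;

// Hypothetical null-safe lookup: prefer the configured path, fall back
// to the classpath resource, and return null (feature disabled) rather
// than NPE when neither exists.
public class IvySettingsLookup {
    /** Returns the resolved path, or null when no settings file is available. */
    public static String resolve(String configuredPath, String resourceName) {
        if (configuredPath != null && new File(configuredPath).exists()) {
            return configuredPath;
        }
        URL fallback = ClassLoader.getSystemResource(resourceName);
        return fallback == null ? null : fallback.getFile();  // no NPE on null
    }
}
```

Callers then check for `null` and disable dependency resolution with a warning, instead of failing during `CliSessionState` construction.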
[jira] [Commented] (HIVE-10265) Hive CLI crashes on != inequality
[ https://issues.apache.org/jira/browse/HIVE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487856#comment-14487856 ] Chao commented on HIVE-10265: - +1 Hive CLI crashes on != inequality - Key: HIVE-10265 URL: https://issues.apache.org/jira/browse/HIVE-10265 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10265.patch It seems != is a supported inequality operator according to: [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators]. However, HiveCLI crashes if we try a query: {noformat} hive> select * from src where key != '10'; [ERROR] Could not expand event java.lang.IllegalArgumentException: != '10';: event not found at jline.console.ConsoleReader.expandEvents(ConsoleReader.java:779) at jline.console.ConsoleReader.finishBuffer(ConsoleReader.java:631) at jline.console.ConsoleReader.accept(ConsoleReader.java:2019) at jline.console.ConsoleReader.readLine(ConsoleReader.java:2666) at jline.console.ConsoleReader.readLine(ConsoleReader.java:2269) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:730) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {noformat} Beeline is also based on jline and does not crash. Current Hive is on jline-2.12. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
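The stack trace points at jline2's history expansion, which interprets an unescaped `!` as an event reference before the line ever reaches Hive's parser; one remedy is to turn the feature off via `ConsoleReader#setExpandEvents(false)`. The sketch below is a minimal stand-in for that failing step, purely illustrative of the behavior — it is not jline's implementation:

```java
// Stand-in for history expansion: with expansion enabled, an unescaped
// "!" followed by a non-space character raises "event not found", which
// is what the user sees for "!=" in the WHERE clause. Bypassing the
// expansion step (the fix) leaves the query untouched.
public class HistoryExpansion {
    public static String expand(String line, boolean expandEvents) {
        if (!expandEvents) return line;                    // the usual fix
        int i = line.indexOf('!');
        if (i >= 0 && i + 1 < line.length()
                && !Character.isWhitespace(line.charAt(i + 1))) {
            throw new IllegalArgumentException(
                line.substring(i) + ": event not found");
        }
        return line;
    }
}
```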
[jira] [Commented] (HIVE-9558) [Parquet] support HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486839#comment-14486839 ] Dong Chen commented on HIVE-9558: - Thanks for your review and comments! [~spena] The failed tests seem unrelated. I verified the last one locally and it passed. [Parquet] support HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable in vectorized mode --- Key: HIVE-9558 URL: https://issues.apache.org/jira/browse/HIVE-9558 Project: Hive Issue Type: Sub-task Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-9558.1.patch, HIVE-9558.2.patch, HIVE-9558.patch When using Parquet in vectorized mode, {{VectorColumnAssignFactory.buildAssigners(..)}} does not handle HiveDecimalWritable, HiveCharWritable, HiveVarcharWritable. We need to fix this and add a test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10077) Use new ParquetInputSplit constructor API
[ https://issues.apache.org/jira/browse/HIVE-10077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486838#comment-14486838 ] Ferdinand Xu commented on HIVE-10077: - The failed case is caused by Spark. Spark is still using the rc3 version, and the new ParquetInputSplit API is not supported in that version. We will wait for Spark to update its parquet-column module version. Use new ParquetInputSplit constructor API - Key: HIVE-10077 URL: https://issues.apache.org/jira/browse/HIVE-10077 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-10077.1.patch, HIVE-10077.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8858) Visualize generated Spark plan [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488092#comment-14488092 ] Hive QA commented on HIVE-8858: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12724274/HIVE-8858.4-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8710 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/824/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/824/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-824/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12724274 - PreCommit-HIVE-SPARK-Build Visualize generated Spark plan [Spark Branch] - Key: HIVE-8858 URL: https://issues.apache.org/jira/browse/HIVE-8858 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, HIVE-8858.2-spark.patch, HIVE-8858.3-spark.patch, HIVE-8858.4-spark.patch The spark plan generated by SparkPlanGenerator contains info which isn't available in Hive's explain plan, such as RDD caching. Also, the graph is slightly different from the original SparkWork. Thus, it would be nice to visualize the plan as is done for SparkWork.
Preferably, the visualization can happen as part of Hive explain extended. If not feasible, we can at least log this at the info level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10276) Implement date_format(timestamp, fmt) UDF
[ https://issues.apache.org/jira/browse/HIVE-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488031#comment-14488031 ] Alexander Pivovarov commented on HIVE-10276: correction {code} weekofyear(date) // can be replaced with cast(date_format(date, 'w') as int) {code} Implement date_format(timestamp, fmt) UDF - Key: HIVE-10276 URL: https://issues.apache.org/jira/browse/HIVE-10276 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Attachments: HIVE-10276.01.patch date_format(date/timestamp/string, fmt) converts a date/timestamp/string to a value of String in the format specified by the java date format fmt. Supported formats are listed here: https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
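The correction above relies on SimpleDateFormat's `w` (week-of-year) pattern letter, which the issue's linked documentation describes. A small sketch checking that, under the default locale and calendar settings, formatting with `w` agrees with `Calendar.WEEK_OF_YEAR` — the class and method names are invented for the example:

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

// Sketch: cast(date_format(date, 'w') as int) vs weekofyear(date),
// expressed with plain JDK classes. Week numbering depends on the
// default locale's first-day-of-week and minimal-days settings, which
// both paths share here.
public class WeekOfYear {
    public static int viaDateFormat(Date d) {
        return Integer.parseInt(new SimpleDateFormat("w").format(d));
    }

    public static int viaCalendar(Date d) {
        Calendar c = Calendar.getInstance();
        c.setTime(d);
        return c.get(Calendar.WEEK_OF_YEAR);
    }
}
```

Note that Hive's `weekofyear` uses ISO week numbering, so the SQL-level replacement only matches when the session's week settings agree; that caveat is worth keeping in mind when applying the correction.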
[jira] [Updated] (HIVE-8858) Visualize generated Spark plan [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-8858: --- Attachment: HIVE-8858.2-spark.patch Visualize generated Spark plan [Spark Branch] - Key: HIVE-8858 URL: https://issues.apache.org/jira/browse/HIVE-8858 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, HIVE-8858.2-spark.patch The spark plan generated by SparkPlanGenerator contains info which isn't available in Hive's explain plan, such as RDD caching. Also, the graph is slightly different from the original SparkWork. Thus, it would be nice to visualize the plan as is done for SparkWork. Preferably, the visualization can happen as part of Hive explain extended. If not feasible, we can at least log this at the info level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-10239: - Attachment: HIVE-10239.0.patch Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL Key: HIVE-10239 URL: https://issues.apache.org/jira/browse/HIVE-10239 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, HIVE-10239.patch Need to create DB-implementation specific scripts to use the framework introduced in HIVE-9800 to have any metastore schema changes tested across all supported databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10251) HIVE-9664 makes hive depend on ivysettings.xml
[ https://issues.apache.org/jira/browse/HIVE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487750#comment-14487750 ] Thejas M Nair commented on HIVE-10251: -- (Forgive me if this idea is stupid, because of my limited knowledge of ivy.) Is it possible to do something like hadoop's *-default.xml vs *-site.xml? I.e., package something like ivysettings-hivedefault.xml as the default file in the jar and then use that only if there is no ivysettings.xml in the path? (or ivysettings-hivesite.xml). Would that address the concerns [~sushanth]? HIVE-9664 makes hive depend on ivysettings.xml -- Key: HIVE-10251 URL: https://issues.apache.org/jira/browse/HIVE-10251 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Anant Nag Labels: patch Attachments: HIVE-10251.1.patch, HIVE-10251.2.patch, HIVE-10251.simple.patch HIVE-9664 makes hive depend on the existence of ivysettings.xml, and if it is not present, it makes hive NPE when instantiating a CliSessionState.
{noformat}
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.session.DependencyResolver.<init>(DependencyResolver.java:61)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:343)
at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:334)
at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:60)
{noformat} This happens because of the following bit: {noformat}
// If HIVE_HOME is not defined or file is not found in HIVE_HOME/conf then load default ivysettings.xml from class loader
if (ivysettingsPath == null || !(new File(ivysettingsPath).exists())) {
  ivysettingsPath = ClassLoader.getSystemResource("ivysettings.xml").getFile();
  _console.printInfo("ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR, " + ivysettingsPath + " will be used");
}
{noformat} This makes it so that an attempt to instantiate CliSessionState without an ivysettings.xml file will cause hive to fail with an NPE. Hive should not have a hard dependency on an ivysettings.xml being present, and this feature should fail gracefully in that case instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10273) Union with partition tables which have no data fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-10273: -- Description: As shown in the test case in the patch below, when we have partitioned tables which have no data, we fail with an NPE with the following stack trace: {code} NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateMapWork(Vectorizer.java:357) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:321) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:307) at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111) at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194) at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139) at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:847) at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:468) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:223) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) {code} Union with partition tables which have no data fails with NPE - Key: HIVE-10273 URL: https://issues.apache.org/jira/browse/HIVE-10273 Project: Hive Issue Type: Bug Components: 
Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10273.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10266) Boolean expression True and True returns False
[ https://issues.apache.org/jira/browse/HIVE-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487657#comment-14487657 ] ckran commented on HIVE-10266: -- Thanks. That workaround worked. Boolean expression True and True returns False -- Key: HIVE-10266 URL: https://issues.apache.org/jira/browse/HIVE-10266 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.14.0 Reporter: ckran Fix For: 0.13.0 A Hive query with a Boolean expression in which the day and month calculations each evaluate to TRUE returns FALSE when they are combined with AND. create table datest (cntr int, date date ) row format delimited fields terminated by ',' stored as textfile ; insert into table datest values (1,'2015-04-8') ; select ((DAY('2015-05-25') - DAY(DATE)) < 25), ((MONTH('2015-05-25') - MONTH(DATE)) = 1) , ((DAY('2015-05-25') - DAY(DATE)) < 25) AND ((MONTH('2015-05-25') - MONTH(DATE)) = 1) from datest Returns values True | True | False -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10246) [CBO] Table alias should be stored with Scan object, instead of Table object
[ https://issues.apache.org/jira/browse/HIVE-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-10246. - Resolution: Fixed Fix Version/s: cbo-branch Committed to branch [CBO] Table alias should be stored with Scan object, instead of Table object Key: HIVE-10246 URL: https://issues.apache.org/jira/browse/HIVE-10246 Project: Hive Issue Type: Improvement Components: CBO, Diagnosability, Query Planning Affects Versions: cbo-branch Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: cbo-branch Attachments: HIVE-10246.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8858) Visualize generated Spark plan [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487853#comment-14487853 ] Hive QA commented on HIVE-8858: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12724262/HIVE-8858.3-spark.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8710 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/823/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/823/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-823/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12724262 - PreCommit-HIVE-SPARK-Build Visualize generated Spark plan [Spark Branch] - Key: HIVE-8858 URL: https://issues.apache.org/jira/browse/HIVE-8858 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, HIVE-8858.2-spark.patch, HIVE-8858.3-spark.patch, HIVE-8858.4-spark.patch The spark plan generated by SparkPlanGenerator contains info which isn't available in Hive's explain plan, such as RDD caching. Also, the graph is slightly different from the original SparkWork. Thus, it would be nice to visualize the plan as is done for SparkWork. Preferably, the visualization can happen as part of Hive explain extended. 
If not feasible, we at least can log this at info level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10271) remove hive.server2.thrift.http.min/max.worker.threads properties
[ https://issues.apache.org/jira/browse/HIVE-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488141#comment-14488141 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-10271: -- [~leftylev] Thanks for pointing this out. I looked at the code base and these parameters were used until HIVE-7935 was checked in. So we should mark them as deprecated since 0.14.0 rather than removing them altogether. Thanks Hari remove hive.server2.thrift.http.min/max.worker.threads properties - Key: HIVE-10271 URL: https://issues.apache.org/jira/browse/HIVE-10271 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10271.1.patch PROBLEM: Those properties are not used even when hiveserver2 is in http mode. The properties actually used are hive.server2.thrift.min/max.worker.threads. Remove these 2 properties as they are causing confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9647) Discrepancy in cardinality estimates between partitioned and un-partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488189#comment-14488189 ] Ashutosh Chauhan commented on HIVE-9647: +1 Discrepancy in cardinality estimates between partitioned and un-partitioned tables --- Key: HIVE-9647 URL: https://issues.apache.org/jira/browse/HIVE-9647 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Pengcheng Xiong Fix For: 1.2.0 Attachments: HIVE-9647.01.patch, HIVE-9647.02.patch, HIVE-9647.03.patch High-level summary HiveRelMdSelectivity.computeInnerJoinSelectivity relies on the per-column number of distinct values (NDV) to estimate join selectivity. The way statistics are aggregated for partitioned tables results in a discrepancy in the number of distinct values, which results in different plans between partitioned and un-partitioned schemas. The table below summarizes the NDVs in computeInnerJoinSelectivity which are used to estimate selectivity of joins.
||Column||Partitioned count distincts||Un-partitioned count distincts||
|sr_customer_sk|71,245|1,415,625|
|sr_item_sk|38,846|62,562|
|sr_ticket_number|71,245|34,931,085|
|ss_customer_sk|88,476|1,415,625|
|ss_item_sk|38,846|62,562|
|ss_ticket_number|100,756|56,256,175|
The discrepancy arises because NDV calculation for a partitioned table assumes that the NDV range is contained within each partition, and is calculated as "select max(NUM_DISTINCTS) from PART_COL_STATS". This is problematic for columns like ticket number which are naturally increasing with the partitioned date column ss_sold_date_sk. 
Suggestions Use Hyper Log Log as suggested by Gopal, there is an HLL implementation for HBASE co-porccessors which we can use as a reference here Using the global stats from TAB_COL_STATS and the per partition stats from PART_COL_STATS extrapolate the NDV for the qualified partitions as in : Max ( (NUM_DISTINCTS from TAB_COL_STATS) x (Number of qualified partitions) / (Number of Partitions), max(NUM_DISTINCTS) from PART_COL_STATS)) More details While doing TPC-DS Partitioned vs. Un-Partitioned runs I noticed that many of the plans are different, then I dumped the CBO logical plan and I found that join estimates are drastically different Unpartitioned schema : {code} 2015-02-10 11:33:27,624 DEBUG [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:apply(12624)) - Plan After Join Reordering: HiveProjectRel(store_sales_quantitycount=[$0], store_sales_quantityave=[$1], store_sales_quantitystdev=[$2], store_sales_quantitycov=[/($2, $1)], as_store_returns_quantitycount=[$3], as_store_returns_quantityave=[$4], as_store_returns_quantitystdev=[$5], store_returns_quantitycov=[/($5, $4)]): rowcount = 1.0, cumulative cost = {6.056835407771381E8 rows, 0.0 cpu, 0.0 io}, id = 2956 HiveAggregateRel(group=[{}], agg#0=[count($0)], agg#1=[avg($0)], agg#2=[stddev_samp($0)], agg#3=[count($1)], agg#4=[avg($1)], agg#5=[stddev_samp($1)]): rowcount = 1.0, cumulative cost = {6.056835407771381E8 rows, 0.0 cpu, 0.0 io}, id = 2954 HiveProjectRel($f0=[$4], $f1=[$8]): rowcount = 40.05611776795562, cumulative cost = {6.056835407771381E8 rows, 0.0 cpu, 0.0 io}, id = 2952 HiveProjectRel(ss_sold_date_sk=[$0], ss_item_sk=[$1], ss_customer_sk=[$2], ss_ticket_number=[$3], ss_quantity=[$4], sr_item_sk=[$5], sr_customer_sk=[$6], sr_ticket_number=[$7], sr_return_quantity=[$8], d_date_sk=[$9], d_quarter_name=[$10]): rowcount = 40.05611776795562, cumulative cost = {6.056835407771381E8 rows, 0.0 cpu, 0.0 io}, id = 2982 HiveJoinRel(condition=[=($9, $0)], joinType=[inner]): rowcount = 40.05611776795562, 
cumulative cost = {6.056835407771381E8 rows, 0.0 cpu, 0.0 io}, id = 2980 HiveJoinRel(condition=[AND(AND(=($2, $6), =($1, $5)), =($3, $7))], joinType=[inner]): rowcount = 28880.460910696, cumulative cost = {6.05654559E8 rows, 0.0 cpu, 0.0 io}, id = 2964 HiveProjectRel(ss_sold_date_sk=[$0], ss_item_sk=[$2], ss_customer_sk=[$3], ss_ticket_number=[$9], ss_quantity=[$10]): rowcount = 5.50076554E8, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2920 HiveTableScanRel(table=[[tpcds_bin_orc_200.store_sales]]): rowcount = 5.50076554E8, cumulative cost = {0}, id = 2822 HiveProjectRel(sr_item_sk=[$2], sr_customer_sk=[$3], sr_ticket_number=[$9], sr_return_quantity=[$10]): rowcount = 5.5578005E7, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 2923 HiveTableScanRel(table=[[tpcds_bin_orc_200.store_returns]]): rowcount = 5.5578005E7, cumulative
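The extrapolation suggested above reduces to simple arithmetic. A minimal sketch, with illustrative method and variable names (only the NDV figures come from the table above; the 50-of-200 qualified-partition split is a made-up example):

```java
// Sketch of the suggested NDV extrapolation for qualified partitions.
// Assumption: names are illustrative; the actual HIVE-9647 patches may differ.
public class NdvExtrapolation {

    // tableNdv:        NUM_DISTINCTS from TAB_COL_STATS (global stats)
    // maxPartitionNdv: max(NUM_DISTINCTS) from PART_COL_STATS
    static long extrapolateNdv(long tableNdv, long maxPartitionNdv,
                               long qualifiedPartitions, long totalPartitions) {
        long scaled = tableNdv * qualifiedPartitions / totalPartitions;
        return Math.max(scaled, maxPartitionNdv);
    }

    public static void main(String[] args) {
        // sr_ticket_number from the table above: global NDV 34,931,085,
        // per-partition max 71,245; suppose 50 of 200 partitions qualify.
        System.out.println(extrapolateNdv(34931085L, 71245L, 50, 200));
    }
}
```

For naturally increasing columns such as ticket numbers, this keeps the estimate proportional to the global NDV instead of collapsing to the per-partition maximum.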
[jira] [Commented] (HIVE-10005) remove some unnecessary branches from the inner loop
[ https://issues.apache.org/jira/browse/HIVE-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487829#comment-14487829 ] Vikram Dixit K commented on HIVE-10005: --- It looks to me like this eliminates checking the done flag in every operator but still there is going to be a check for every row in the MapOperator/ExecReducer. Also, it looks like in the tez code path, the done flag check does not exist in the ReduceRecordProcessor. We could be forwarding rows even after an operator has said done. remove some unnecessary branches from the inner loop Key: HIVE-10005 URL: https://issues.apache.org/jira/browse/HIVE-10005 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-10005.1.patch, HIVE-10005.2.patch, HIVE-10005.3.patch, HIVE-10005.4.patch Operator.forward is doing too much. There's no reason to do the done checking per row and update it inline. It's much more efficient to just do that when the event that completes an operator happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
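The change described above, retiring an operator at the event that completes it rather than testing a done flag on every row, can be sketched with stand-in classes. These are not Hive's Operator/ExecMapper types; note that the source still performs one per-row check, which is exactly Vikram's point about MapOperator/ExecReducer:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-ins, not Hive's Operator/ExecMapper API: the per-row
// done-flag test inside every operator is replaced by a one-shot completion
// event that unhooks the finished pipeline from its source.
public class DoneFlagSketch {
    interface Sink { void row(int r); }

    static class Limit implements Sink {
        final int limit;
        final List<Integer> out = new ArrayList<>();
        Runnable onDone;                 // fired once, when the limit is hit
        int seen = 0;

        Limit(int limit) { this.limit = limit; }

        @Override public void row(int r) {
            out.add(r);
            if (++seen == limit && onDone != null) {
                onDone.run();            // completion event, not a per-row flag
            }
        }
    }

    static class Source {
        Sink child;
        void run(int[] rows) {
            for (int r : rows) {
                if (child == null) break; // the one remaining per-row check,
                child.row(r);             // at the source (MapOperator analog)
            }
        }
    }

    public static void main(String[] args) {
        Source src = new Source();
        Limit lim = new Limit(2);
        src.child = lim;
        lim.onDone = () -> src.child = null; // event unhooks the pipeline
        src.run(new int[]{1, 2, 3, 4});
        System.out.println(lim.out);         // only the first two rows forwarded
    }
}
```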
[jira] [Commented] (HIVE-10263) CBO (Calcite Return Path): Aggregate checking input for bucketing should be conditional
[ https://issues.apache.org/jira/browse/HIVE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487828#comment-14487828 ] Laljo John Pullokkaran commented on HIVE-10263: --- [~jcamachorodriguez] Caching the result in HiveAggregate.bucketedInput seems dangerous (it is effectively caching). The input could change from bucketed to unbucketed; for example, the input was a bucket join which then got changed to SMJ. CBO (Calcite Return Path): Aggregate checking input for bucketing should be conditional --- Key: HIVE-10263 URL: https://issues.apache.org/jira/browse/HIVE-10263 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Laljo John Pullokkaran Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-10263.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10274) Send context and description to tez via dag info
[ https://issues.apache.org/jira/browse/HIVE-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488016#comment-14488016 ] Vikram Dixit K commented on HIVE-10274: --- +1 LGTM Send context and description to tez via dag info Key: HIVE-10274 URL: https://issues.apache.org/jira/browse/HIVE-10274 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-10274.1.patch tez has a way to specify context and description (which is shown in the ui) for each dag. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10274) Send context and description to tez via dag info
[ https://issues.apache.org/jira/browse/HIVE-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14488044#comment-14488044 ] Hitesh Shah commented on HIVE-10274: Looks fine. Maybe put LOG.debug("DagInfo: " + dagInfo); within an if (LOG.isDebugEnabled()) check? Send context and description to tez via dag info Key: HIVE-10274 URL: https://issues.apache.org/jira/browse/HIVE-10274 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-10274.1.patch tez has a way to specify context and description (which is shown in the ui) for each dag. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
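The guard Hitesh suggests avoids building the log string (and rendering dagInfo) when debug logging is off. A self-contained sketch, using java.util.logging's Level.FINE as a stand-in for the isDebugEnabled() check in Hive's actual logging framework:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of guarding a debug statement so the message string (and the
// possibly expensive dagInfo rendering) is never built when debug is off.
// java.util.logging stands in here so the example is self-contained.
public class DebugGuard {
    static final Logger LOG = Logger.getLogger("DebugGuard");

    // Returns how many times the dagInfo value was rendered to a String.
    static int logDagInfo(Object dagInfo) {
        AtomicInteger renders = new AtomicInteger();
        Object wrapped = new Object() {
            @Override public String toString() {
                renders.incrementAndGet();       // expensive rendering
                return String.valueOf(dagInfo);
            }
        };
        if (LOG.isLoggable(Level.FINE)) {        // cf. LOG.isDebugEnabled()
            LOG.fine("DagInfo: " + wrapped);     // only built when enabled
        }
        return renders.get();
    }

    public static void main(String[] args) {
        // With the default INFO level, FINE (debug) is disabled, so the
        // concatenation and toString() never run.
        System.out.println(logDagInfo("dag context/description")); // prints 0
    }
}
```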
[jira] [Commented] (HIVE-10265) Hive CLI crashes on != inequality
[ https://issues.apache.org/jira/browse/HIVE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487692#comment-14487692 ] Hive QA commented on HIVE-10265: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12723993/HIVE-10265.patch {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 8665 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.ql.exec.tez.TestDynamicPartitionPruner.testSingleSourceMultipleFiltersOrdering1 org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore org.apache.hive.jdbc.TestSSL.testSSLFetchHttp {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3346/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3346/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3346/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12723993 - PreCommit-HIVE-TRUNK-Build Hive CLI crashes on != inequality - Key: HIVE-10265 URL: https://issues.apache.org/jira/browse/HIVE-10265 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10265.patch It seems != is a supported inequality operator according to: [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators]. 
However, HiveCLI crashes if we try a query: {noformat} hive> select * from src where key != '10'; [ERROR] Could not expand event java.lang.IllegalArgumentException: != '10';: event not found at jline.console.ConsoleReader.expandEvents(ConsoleReader.java:779) at jline.console.ConsoleReader.finishBuffer(ConsoleReader.java:631) at jline.console.ConsoleReader.accept(ConsoleReader.java:2019) at jline.console.ConsoleReader.readLine(ConsoleReader.java:2666) at jline.console.ConsoleReader.readLine(ConsoleReader.java:2269) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:730) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at
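The stack trace shows jline2's csh-style history ("event") expansion treating the "!" in "!=" as a history reference. A minimal sketch of one workaround, pre-escaping the trigger character before the reader interprets it; disabling expansion outright (jline2's ConsoleReader exposes setExpandEvents(false), assuming the bundled jline version has it) would be the more direct fix:

```java
// Sketch only: jline2's ConsoleReader.expandEvents() treats an unescaped "!"
// as a csh-style history reference, which is what "event not found" means.
// Escaping the character keeps it literal; the hypothetical helper below is
// not part of Hive or jline.
public class BangEscape {
    static String escapeHistoryTriggers(String line) {
        // "\\!" in Java source is a backslash followed by "!", which jline
        // passes through as a literal "!"
        return line.replace("!", "\\!");
    }

    public static void main(String[] args) {
        System.out.println(escapeHistoryTriggers("select * from src where key != '10';"));
    }
}
```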
[jira] [Resolved] (HIVE-10241) ACID: drop table doesn't acquire any locks
[ https://issues.apache.org/jira/browse/HIVE-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman resolved HIVE-10241. --- Resolution: Cannot Reproduce I can clearly see the right locks now ACID: drop table doesn't acquire any locks -- Key: HIVE-10241 URL: https://issues.apache.org/jira/browse/HIVE-10241 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman With Hive configured to use DbTxnManager, in DbTxnManager.acquireLocks() both plan.getInputs() and plan.getOutputs() are empty when "drop table foo" is executed, and thus no locks are acquired. We should be acquiring X (exclusive) locks to make sure any readers of this table don't get data wiped out while a read is in progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10251) HIVE-9664 makes hive depend on ivysettings.xml
[ https://issues.apache.org/jira/browse/HIVE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487722#comment-14487722 ] Sushanth Sowmyan commented on HIVE-10251: - [~cwsteinbach]: I had a discussion with [~arpitgupta] yesterday, who seemed to advocate the same idea of shipping ivysettings.xml inside the jar, since we already do things like shipping hadoop-default.xml inside hadoop jars, for example. On one hand, I like that idea, since it simplifies build release, and makes it so that this feature is controlled by our build, rather than whatever build/packaging scheme is used on top of hive, and whether or not the conf dir is appropriately set up. As in, if the conf dir were to be set up, we'd prefer that, but otherwise, we'd use ours. This has a very solid benefit for us. On the flip side, this feels similar to namespace pollution for me - users of our jars will now have an ivysettings.xml in their classpath because of us, and if they don't have similar dependency resolution semantics where they prefer an externally provided ivysettings.xml over the one in the classpath, they have an issue. Also, if they use an ivysettings.xml-in-their-classpath mode of operation, then the order of jar imports becomes important. Ultimately, my distaste for this approach is not strong enough to offset the benefits, I think, and a case could still be made that a jar user could go ahead and shade our jar, or override as need be. We could also take a middling approach where the included xml file is called something like hive-ivysettings-default.xml, and that's the name we depend on; then there's no confusion or pollution. [~nntnag17]: One way to unit test this would be as follows : a) Have a protected method in DependencyResolver that returns ClassLoader.getSystemResource(String), and use that method across DependencyResolver across the board, instead of directly calling ClassLoader.getSystemResource(...). 
b) In a unit test, then, you can extend DependencyResolver to DummyDependencyResolver, overriding that method so that it always returns null for resolving ivysettings.xml or whatever, and then you can show that you can still instantiate it, and resolve using it, although it would then return nothing, but without throwing exceptions. You could also use a DependencyResolver as-is, to show that it appropriately can resolve and fetch ivy urls when required. HIVE-9664 makes hive depend on ivysettings.xml -- Key: HIVE-10251 URL: https://issues.apache.org/jira/browse/HIVE-10251 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Anant Nag Labels: patch Attachments: HIVE-10251.1.patch, HIVE-10251.2.patch, HIVE-10251.simple.patch HIVE-9664 makes hive depend on the existence of ivysettings.xml, and if it is not present, it makes hive NPE when instantiating a CLISessionState. {noformat} java.lang.NullPointerException at org.apache.hadoop.hive.ql.session.DependencyResolver.init(DependencyResolver.java:61) at org.apache.hadoop.hive.ql.session.SessionState.init(SessionState.java:343) at org.apache.hadoop.hive.ql.session.SessionState.init(SessionState.java:334) at org.apache.hadoop.hive.cli.CliSessionState.init(CliSessionState.java:60) {noformat} This happens because of the following bit: {noformat}
// If HIVE_HOME is not defined or file is not found in HIVE_HOME/conf then load default ivysettings.xml from class loader
if (ivysettingsPath == null || !(new File(ivysettingsPath).exists())) {
  ivysettingsPath = ClassLoader.getSystemResource("ivysettings.xml").getFile();
  _console.printInfo("ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR, " + ivysettingsPath + " will be used");
}
{noformat} This makes it so that an attempt to instantiate CliSessionState without an ivysettings.xml file will cause hive to fail with an NPE. 
Hive should not have a hard dependency on an ivysettings.xml being present, and this feature should fail gracefully in that case instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
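The graceful fallback the description asks for amounts to returning null (and disabling the feature) instead of calling getFile() on a null URL. A sketch under that assumption; the method name and shape are illustrative, not Hive's DependencyResolver API:

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

// Sketch of a null-safe version of the lookup quoted above: prefer the
// configured path, fall back to the classpath, and return null (feature
// disabled) rather than NPE when neither source has ivysettings.xml.
public class IvySettingsLookup {
    static String resolveIvySettings(String configuredPath, ClassLoader cl) {
        if (configuredPath != null && new File(configuredPath).exists()) {
            return configuredPath;
        }
        URL fallback = cl.getResource("ivysettings.xml");
        return fallback == null ? null : fallback.getFile();
    }

    public static void main(String[] args) throws Exception {
        // A classloader with no URLs and no parent can never find the
        // resource, demonstrating the graceful null instead of an NPE.
        try (URLClassLoader empty = new URLClassLoader(new URL[0], null)) {
            System.out.println(resolveIvySettings(null, empty)); // prints null
        }
    }
}
```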
[jira] [Updated] (HIVE-10279) LLAP: Allow the runtime to check whether a task can run to completion
[ https://issues.apache.org/jira/browse/HIVE-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-10279: -- Fix Version/s: llap LLAP: Allow the runtime to check whether a task can run to completion - Key: HIVE-10279 URL: https://issues.apache.org/jira/browse/HIVE-10279 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Fix For: llap As part of pre-empting running tasks and deciding which tasks can run, allow the runtime to check whether a queued or running task has all its sources complete and can run through to completion, without waiting for sources to finish. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10277) Unable to process Comment line '--' in HIVE-1.1.0
[ https://issues.apache.org/jira/browse/HIVE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486995#comment-14486995 ] Kaveen Raajan commented on HIVE-10277: -- A related issue was found at https://issues.apache.org/jira/browse/HIVE-2259 But the patch on the above link is only applicable to ProcessFile(), not to ProcessCmd() at org.apache.hadoop.hive.cli.CliDriver.java Unable to process Comment line '--' in HIVE-1.1.0 - Key: HIVE-10277 URL: https://issues.apache.org/jira/browse/HIVE-10277 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.0.0 Reporter: Kaveen Raajan Priority: Minor Labels: hive I tried to use a comment line (*--*) in the HIVE-1.1.0 shell like:
hive> --this is comment line
hive> show tables;
I got an error like {quote} NoViableAltException(-1@[]) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1020) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) FAILED: ParseException line 2:0 cannot recognize input near 'EOF' 'EOF' 'EOF' {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
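The missing filtering in ProcessCmd() amounts to dropping full-line "--" comments before the parser sees them. A self-contained sketch of that filtering (the method name is illustrative, not the actual CliDriver code):

```java
// Sketch: strip whole-line "--" comments from a command string before it
// reaches the parser, mirroring what the HIVE-2259 patch does for processFile()
// but which, per the comment above, is missing from processCmd().
public class CommentFilter {
    static String stripCommentLines(String cmd) {
        StringBuilder kept = new StringBuilder();
        for (String line : cmd.split("\n")) {
            if (!line.trim().startsWith("--")) {   // drop full-line comments
                if (kept.length() > 0) kept.append('\n');
                kept.append(line);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // The reporter's session: the comment line vanishes, the query stays.
        System.out.println(stripCommentLines("--this is comment line\nshow tables;"));
    }
}
```

Note this only handles comments that occupy a whole line; a trailing "-- comment" after SQL text would need quote-aware scanning to avoid mangling string literals.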
[jira] [Updated] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)
[ https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reuben updated HIVE-10190: -- Attachment: HIVE-10190.02.patch CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE) - Key: HIVE-10190 URL: https://issues.apache.org/jira/browse/HIVE-10190 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Pengcheng Xiong Priority: Trivial Labels: perfomance Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch, HIVE-10190.02.patch {code}
public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
  String astTree = ast.toStringTree();
  // if any of following tokens are present in AST, bail out
  String[] tokens = { "TOK_CHARSETLITERAL", "TOK_TABLESPLITSAMPLE" };
  for (String token : tokens) {
    if (astTree.contains(token)) {
      return false;
    }
  }
  return true;
}
{code} This is an issue for a SQL query which is bigger in AST form than in text (~700kb). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
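For a ~700kb AST, toStringTree() materializes the entire tree as one string on every validation. A sketch of the alternative: compare node tokens directly during a tree walk, with early exit and no giant string. Node here is a stand-in for Hive's ASTNode, not the real class; as a side benefit, token comparison also cannot false-positive on a token name embedded inside a string literal:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: scan AST node tokens iteratively instead of serializing the whole
// tree to a string and substring-searching it.
public class TokenScan {
    static class Node {
        final String token;
        final List<Node> children = new ArrayList<>();
        Node(String token) { this.token = token; }
    }

    static boolean containsToken(Node root, Set<String> banned) {
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (banned.contains(n.token)) return true; // early exit, no big string
            n.children.forEach(stack::push);
        }
        return false;
    }

    public static void main(String[] args) {
        Node root = new Node("TOK_QUERY");
        root.children.add(new Node("TOK_TABLESPLITSAMPLE"));
        Set<String> banned = new HashSet<>(Arrays.asList(
            "TOK_CHARSETLITERAL", "TOK_TABLESPLITSAMPLE"));
        System.out.println(containsToken(root, banned)); // prints true
    }
}
```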
[jira] [Updated] (HIVE-10263) CBO (Calcite Return Path): Aggregate checking input for bucketing should be conditional
[ https://issues.apache.org/jira/browse/HIVE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10263: --- Attachment: HIVE-10263.01.cbo.patch [~jpullokkaran], you are absolutely right, they should not be cached. I modified the patch. Thanks CBO (Calcite Return Path): Aggregate checking input for bucketing should be conditional --- Key: HIVE-10263 URL: https://issues.apache.org/jira/browse/HIVE-10263 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Laljo John Pullokkaran Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-10263.01.cbo.patch, HIVE-10263.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-8297) Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format
[ https://issues.apache.org/jira/browse/HIVE-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reassigned HIVE-8297: -- Assignee: Aihua Xu

Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC format

Key: HIVE-8297
URL: https://issues.apache.org/jira/browse/HIVE-8297
Project: Hive
Issue Type: Bug
Components: CLI, JDBC
Affects Versions: 0.13.0
Environment: Linux
Reporter: Doug Sedlak
Assignee: Aihua Xu

For the case SELECT * FROM [table], JDBC reads the table's backing data directly instead of starting an MR job and creating a result set. When the table format is RCFile or ORC, this direct JDBC read delivers incorrect results for TIMESTAMP columns. If you force a result set, correct data is returned. To reproduce using beeline:

1) Create this file in HDFS:
$ cat /tmp/ts.txt
2014-09-28 00:00:00
2014-09-29 00:00:00
2014-09-30 00:00:00
ctrl-D
$ hadoop fs -copyFromLocal /tmp/ts.txt /tmp/ts.txt

2) In beeline, load the above HDFS data into a TEXTFILE table and verify it looks OK:
$ beeline
!connect jdbc:hive2://host:port/db hive pass org.apache.hive.jdbc.HiveDriver
drop table `TIMESTAMP_TEXT`;
CREATE TABLE `TIMESTAMP_TEXT` (`ts` TIMESTAMP)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012'
STORED AS TEXTFILE;
LOAD DATA INPATH '/tmp/ts.txt' OVERWRITE INTO TABLE `TIMESTAMP_TEXT`;
select * from `TIMESTAMP_TEXT`;

3) In beeline, create an RCFile table and load it from the TEXTFILE table:
drop table `TIMESTAMP_RCFILE`;
CREATE TABLE `TIMESTAMP_RCFILE` (`ts` TIMESTAMP) stored as rcfile;
INSERT INTO TABLE `TIMESTAMP_RCFILE` SELECT * FROM `TIMESTAMP_TEXT`;

4) Demonstrate the incorrect direct JDBC read versus a good read obtained by inducing result set creation:
SELECT * FROM `TIMESTAMP_RCFILE`;
+------------------------+
|  timestamp_rcfile.ts   |
+------------------------+
| 2014-09-30 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
+------------------------+
SELECT * FROM `TIMESTAMP_RCFILE` where ts is not NULL;
+------------------------+
|  timestamp_rcfile.ts   |
+------------------------+
| 2014-09-28 00:00:00.0  |
| 2014-09-29 00:00:00.0  |
| 2014-09-30 00:00:00.0  |
+------------------------+

Note 1: The incorrect behavior demonstrated above also replicates with a standalone Java/JDBC program.
Note 2: It is not known whether any other data types are affected, or which releases, but this occurs in Hive 0.13. Direct JDBC reads of TEXTFILE and SEQUENCEFILE work fine. As shown above, RCFile and ORC deliver wrong results; no other file types were tested.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
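Note 1 above says the wrong results reproduce from a standalone Java/JDBC program. A symptom where every row echoes the last row's timestamp is characteristic of a reader reusing one mutable object across rows without a defensive copy. This report does not establish that as the actual cause, so the following is only a hypothetical Java sketch of that bug class; all names and values are illustrative, not Hive code:

```java
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
    // Simulates a reader that reuses one mutable Timestamp across rows
    // (a common lazy-deserialization optimization) and hands out the
    // same reference each time instead of a copy.
    static List<Timestamp> readWithReuse(long[] millis) {
        List<Timestamp> out = new ArrayList<>();
        Timestamp reused = new Timestamp(0L);
        for (long m : millis) {
            reused.setTime(m);  // mutate in place, no defensive copy
            out.add(reused);    // every element is the same object
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative epoch millis for 2014-09-28/29/30 00:00:00 UTC
        long[] rows = {1411862400000L, 1411948800000L, 1412035200000L};
        List<Timestamp> vals = readWithReuse(rows);
        // All three entries now report the last row's value
        System.out.println(vals.get(0) + " / " + vals.get(1) + " / " + vals.get(2));
    }
}
```

Forcing a result set (e.g. the `where ts is not NULL` query above) would hide such a bug, because each value gets materialized into the result set before the shared object is mutated again.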
[jira] [Commented] (HIVE-3299) Create UDF DAYNAME(date)
[ https://issues.apache.org/jira/browse/HIVE-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487980#comment-14487980 ] Hive QA commented on HIVE-3299: ---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12724134/HIVE-3299.3.patch

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 8660 tests executed

*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
TestSparkClient - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_functions
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3347/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3347/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3347/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12724134 - PreCommit-HIVE-TRUNK-Build

Create UDF DAYNAME(date)

Key: HIVE-3299
URL: https://issues.apache.org/jira/browse/HIVE-3299
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.9.0
Reporter: Namitha Babychan
Assignee: Alexander Pivovarov
Labels: patch
Attachments: HIVE-3299.1.patch.txt, HIVE-3299.2.patch, HIVE-3299.3.patch, HIVE-3299.4.patch, HIVE-3299.patch.txt, Hive-3299_Testcase.doc, udf_dayname.q, udf_dayname.q.out

dayname(date/timestamp/string)
Returns the name of the weekday for date. The language used for the name is English.
select dayname('2015-04-08');
Wednesday
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
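The UDF contract described above (the English weekday name for a date) can be sketched in plain Java with java.time. This is only an illustration of the expected behavior, not the patch's actual GenericUDF implementation:

```java
import java.time.LocalDate;
import java.time.format.TextStyle;
import java.util.Locale;

public class DaynameDemo {
    // English weekday name for a yyyy-MM-dd date string,
    // matching the DAYNAME contract described in the issue.
    static String dayname(String date) {
        return LocalDate.parse(date)
                .getDayOfWeek()
                .getDisplayName(TextStyle.FULL, Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(dayname("2015-04-08"));  // Wednesday
    }
}
```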
[jira] [Commented] (HIVE-8858) Visualize generated Spark plan [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487414#comment-14487414 ] Hive QA commented on HIVE-8858: ---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12724210/HIVE-8858.2-spark.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8710 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/822/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/822/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-822/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12724210 - PreCommit-HIVE-SPARK-Build

Visualize generated Spark plan [Spark Branch]

Key: HIVE-8858
URL: https://issues.apache.org/jira/browse/HIVE-8858
Project: Hive
Issue Type: Sub-task
Components: Spark
Reporter: Xuefu Zhang
Assignee: Chinna Rao Lalam
Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, HIVE-8858.2-spark.patch

The Spark plan generated by SparkPlanGenerator contains information that isn't available in Hive's explain plan, such as RDD caching. Also, the graph is slightly different from the original SparkWork. Thus, it would be nice to visualize the plan as is done for SparkWork. Preferably, the visualization would happen as part of Hive explain extended. If that is not feasible, we can at least log it at the info level.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
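A common way to visualize a plan graph like the one discussed above is to emit Graphviz DOT, which tools can then render. The sketch below is a minimal illustration under that assumption; the node labels and the "(cached)" annotation are invented, not actual SparkPlanGenerator output:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PlanDot {
    // Renders a parent -> children adjacency map as Graphviz DOT text.
    static String toDot(Map<String, List<String>> edges) {
        StringBuilder sb = new StringBuilder("digraph plan {\n");
        for (Map.Entry<String, List<String>> e : edges.entrySet()) {
            for (String child : e.getValue()) {
                sb.append("  \"").append(e.getKey())
                  .append("\" -> \"").append(child).append("\";\n");
            }
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        // Invented labels; a real plan could annotate nodes with RDD caching
        plan.put("MapTran 1", List.of("ReduceTran 2 (cached)"));
        plan.put("ReduceTran 2 (cached)", List.of("ReduceTran 3"));
        System.out.print(toDot(plan));
    }
}
```

Emitting the DOT string at info level would satisfy the fallback suggested in the issue, since the text form is still diffable and renderable offline.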
[jira] [Updated] (HIVE-3299) Create UDF DAYNAME(date)
[ https://issues.apache.org/jira/browse/HIVE-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-3299: -- Attachment: HIVE-3299.4.patch

patch #4 - fixed typo in q file

Create UDF DAYNAME(date)

Key: HIVE-3299
URL: https://issues.apache.org/jira/browse/HIVE-3299
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.9.0
Reporter: Namitha Babychan
Assignee: Alexander Pivovarov
Labels: patch
Attachments: HIVE-3299.1.patch.txt, HIVE-3299.2.patch, HIVE-3299.3.patch, HIVE-3299.4.patch, HIVE-3299.patch.txt, Hive-3299_Testcase.doc, udf_dayname.q, udf_dayname.q.out

dayname(date/timestamp/string)
Returns the name of the weekday for date. The language used for the name is English.
select dayname('2015-04-08');
Wednesday
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3299) Create UDF DAYNAME(date)
[ https://issues.apache.org/jira/browse/HIVE-3299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487712#comment-14487712 ] Alexander Pivovarov commented on HIVE-3299: ---

The function can be replaced by
{code}
date_format(date, '')
{code}

Create UDF DAYNAME(date)

Key: HIVE-3299
URL: https://issues.apache.org/jira/browse/HIVE-3299
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.9.0
Reporter: Namitha Babychan
Assignee: Alexander Pivovarov
Labels: patch
Attachments: HIVE-3299.1.patch.txt, HIVE-3299.2.patch, HIVE-3299.3.patch, HIVE-3299.4.patch, HIVE-3299.patch.txt, Hive-3299_Testcase.doc, udf_dayname.q, udf_dayname.q.out

dayname(date/timestamp/string)
Returns the name of the weekday for date. The language used for the name is English.
select dayname('2015-04-08');
Wednesday
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
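The pattern argument in the comment above did not survive extraction and is left as-is. Hive's date_format takes a Java SimpleDateFormat-style pattern, and in SimpleDateFormat the pattern 'EEEE' yields the full weekday name, which would make it equivalent to DAYNAME. A plain-Java sketch of that equivalence, assuming 'EEEE' was the intended pattern:

```java
import java.text.SimpleDateFormat;
import java.util.Locale;

public class DateFormatEquiv {
    // 'EEEE' in SimpleDateFormat is the full weekday name; this mirrors
    // what date_format(date, 'EEEE') would return in Hive. The 'EEEE'
    // pattern is an assumption, since the original comment lost it.
    static String weekday(String ymd) throws Exception {
        SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd", Locale.ENGLISH);
        SimpleDateFormat out = new SimpleDateFormat("EEEE", Locale.ENGLISH);
        return out.format(in.parse(ymd));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(weekday("2015-04-08"));  // Wednesday
    }
}
```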