[jira] [Commented] (HIVE-21436) "Malformed ORC file" when only one data-file in external table directory
[ https://issues.apache.org/jira/browse/HIVE-21436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794315#comment-16794315 ] Damien Carol commented on HIVE-21436: - [~archon] how did you generate this ORC file? > "Malformed ORC file" when only one data-file in external table directory > > > Key: HIVE-21436 > URL: https://issues.apache.org/jira/browse/HIVE-21436 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: archon gum >Priority: Blocker > Attachments: 1.jpg, 2.jpg, data1.orc > > > h1. env > * Presto 305 > * Hive 3.1.0 > > h1. step > > {code:java} > -- create external table using hiveserver2 > CREATE EXTERNAL TABLE `dw.dim_date2`( > `d` date > ) > STORED AS ORC > LOCATION > 'hdfs://datacenter1:8020/user/hive/warehouse/dw.db/dim_date2' > ; > -- upload the 'data1.orc' file from attachments > -- OR > -- insert one row using presto > insert into dim_date2 values (current_date); > {code} > > > When querying through `hiveserver2`, it works only for the first query and errors out after that > !1.jpg! > > If I insert another row, it works > {code:java} > -- upload the 'data1.orc' file from attachments > -- OR > -- insert one row using presto > insert into dim_date2 values (current_date); > {code} > !2.jpg! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
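One way to narrow this down (a hypothetical isolation step, not from the report) is to rewrite the Presto-written data with Hive's own ORC writer; if repeated queries against the copy succeed while the original external table still fails on the second query, the problem is specific to reading the externally written file rather than to the data itself:

```sql
-- Hypothetical check table name; copies the data through Hive's ORC writer.
CREATE TABLE dw.dim_date2_check STORED AS ORC
  AS SELECT * FROM dw.dim_date2;

-- Query twice: per the report, the failure appears only after the first query.
SELECT COUNT(*) FROM dw.dim_date2_check;
SELECT COUNT(*) FROM dw.dim_date2_check;
```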
[jira] [Updated] (HIVE-19502) Unable to insert values into table stored by JdbcStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-19502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-19502: Component/s: (was: JDBC) StorageHandler > Unable to insert values into table stored by JdbcStorageHandler > --- > > Key: HIVE-19502 > URL: https://issues.apache.org/jira/browse/HIVE-19502 > Project: Hive > Issue Type: Bug > Components: StorageHandler >Affects Versions: 2.3.3 >Reporter: Alexey Vakulenchuk >Priority: Blocker > > *General Info* > Hive version : 2.3.3 > {code:java} > commit 3f7dde31aed44b5440563d3f9d8a8887beccf0be > Author: Daniel Dai > Date: Wed Mar 28 16:46:29 2018 -0700 > Preparing for 2.3.3 release > {code} > Hadoop version: 2.7.2. > Engine > {code:java} > hive> set hive.execution.engine; > hive.execution.engine=mr{code} > *Step 1. Create table in mysql* > {code:java} > mysql> CREATE TABLE books (book_id INT, book_name VARCHAR(100), author_name > VARCHAR(100), book_isbn VARCHAR(100)); > {code} > *Step 2. Create table in hive* > {code:java} > CREATE EXTERNAL TABLE books ( > book_id INT, > book_name STRING, > author_name STRING, > book_isbn STRING > ) STORED BY "org.apache.hive.storage.jdbc.JdbcStorageHandler" > TBLPROPERTIES ( > "hive.sql.database.type" = "MYSQL", > "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver", > "hive.sql.jdbc.url" = > "jdbc:mysql://node1:3306/mysql?user=root=123456", > "hive.sql.query" = "SELECT book_id, book_name, author_name, book_isbn FROM > books", > "hive.sql.column.mapping" = "book_id=book_id, book_name=book_name, > author_name=author_name, book_isbn=book_isbn", > "hive.sql.jdbc.input.table.name" = "books" > ); > {code} > *Step 3. 
Insert values into hive table* > {code:java} > insert into books values (1,'holybible','Jesus', '01'); > {code} > *Actual result:* > {code:java} > Launching Job 1 out of 1 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_1526038512481_0002, Tracking URL = > http://c74apache.com:8088/proxy/application_1526038512481_0002/ > Kill Command = /opt/hadoop/bin/hadoop job -kill job_1526038512481_0002 > Hadoop job information for Stage-3: number of mappers: 1; number of reducers: > 0 > 2018-05-11 07:40:27,312 Stage-3 map = 0%, reduce = 0% > 2018-05-11 07:40:40,947 Stage-3 map = 100%, reduce = 0% > Ended Job = job_1526038512481_0002 with errors > Error during job, obtaining debugging information... > Examining task ID: task_1526038512481_0002_m_00 (and more) from job > job_1526038512481_0002 > Task with the most failures(4): > - > Task ID: > task_1526038512481_0002_m_00 > URL: > > http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1526038512481_0002=task_1526038512481_0002_m_00 > - > Diagnostic Messages for this Task: > Error: java.lang.RuntimeException: Failed to load plan: > hdfs://localhost:9000/tmp/hive/hadoop/b943d5b2-2de9-424d-b7bb-6d9ccb1e6465/hive_2018-05-11_07-40-19_643_183408830372672971-1/-mr-10002/a1bd8dbb-0970-41bc-9e95-d5fd2aeea47c/map.xml > at > org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:481) > at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:313) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:394) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:665) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:658) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:692) > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.(MapTask.java:169) > at 
org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to > find class: org.apache.hive.storage.jdbc.JdbcInputFormat > Serialization trace: > inputFileFormatClass (org.apache.hadoop.hive.ql.plan.TableDesc) > tableInfo (org.apache.hadoop.hive.ql.plan.FileSinkDesc) > conf (org.apache.hadoop.hive.ql.exec.FileSinkOperator) > childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator) > childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator) > aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork) > at >
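The `KryoException: Unable to find class: org.apache.hive.storage.jdbc.JdbcInputFormat` in the trace above suggests the storage-handler jar is visible to HiveServer2 when the plan is compiled but is not shipped to the MapReduce task JVMs. A common remedy is to register the jars in the session so they are distributed with the job; a sketch, where the jar paths are assumptions to be adjusted for the actual installation:

```sql
-- Hypothetical paths; point these at the real hive-jdbc-handler and
-- MySQL driver jars so they ship with the MR job.
ADD JAR /opt/hive/lib/hive-jdbc-handler-2.3.3.jar;
ADD JAR /opt/hive/lib/mysql-connector-java.jar;

-- Retry the failing statement in the same session.
insert into books values (1, 'holybible', 'Jesus', '01');
```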
[jira] [Updated] (HIVE-19950) Hive ACID NOT LOCK LockComponent Correctly
[ https://issues.apache.org/jira/browse/HIVE-19950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-19950: Description: Hi, when using Streaming Mutation recently, I found that LockComponents were not locked correctly by the current transaction. Below is my test case: Step 1: Begin a transaction with transactionId 126; the transaction locks a table and is then left hanging. The lock information was correctly recorded in MariaDB. {code:sql} MariaDB [hive]> select HL_LOCK_EXT_ID,HL_LOCK_INT_ID,HL_TXNID,HL_DB,HL_TABLE,HL_PARTITION,HL_LOCK_STATE,HL_LOCK_TYPE,HL_ACQUIRED_AT,HL_BLOCKEDBY_EXT_ID,HL_BLOCKEDBY_INT_ID from HIVE_LOCKS; {code} | HL_LOCK_EXT_ID | HL_LOCK_INT_ID | HL_TXNID | HL_DB | HL_TABLE | HL_PARTITION | HL_LOCK_STATE | HL_LOCK_TYPE | HL_ACQUIRED_AT | HL_BLOCKEDBY_EXT_ID | HL_BLOCKEDBY_INT_ID | | 384 | 1 | 126 | test_acid | acid_test | NULL | a | w | 1529512857000 | NULL | NULL | Step 2: Begin another transaction, with transactionId 127, before the previous transaction 126 has finished. Transaction 127 tries to lock the same table too, but fails at the first attempt. The lock information was correctly recorded in MariaDB: Lock 385 was blocked by Lock 384. {code:sql} MariaDB [hive]> select HL_LOCK_EXT_ID,HL_LOCK_INT_ID,HL_TXNID,HL_DB,HL_TABLE,HL_PARTITION,HL_LOCK_STATE,HL_LOCK_TYPE,HL_ACQUIRED_AT,HL_BLOCKEDBY_EXT_ID,HL_BLOCKEDBY_INT_ID from HIVE_LOCKS;{code} | HL_LOCK_EXT_ID | HL_LOCK_INT_ID | HL_TXNID | HL_DB | HL_TABLE | HL_PARTITION | HL_LOCK_STATE | HL_LOCK_TYPE | HL_ACQUIRED_AT | HL_BLOCKEDBY_EXT_ID | HL_BLOCKEDBY_INT_ID | | 384 | 1 | 126 | test_acid | acid_test | NULL | a | w | 1529512857000 | NULL | NULL | | 385 | 1 | 127 | test_acid | acid_test | NULL | w | w | NULL | 384 | 1 | Step 3: Transaction 127 then retries the lock after 30s with another lockId, 386. This time it successfully locks the table, even though transaction 126 is still holding its lock. Lock information in the MetaStore DB: {code:sql} MariaDB [hive]> select HL_LOCK_EXT_ID,HL_LOCK_INT_ID,HL_TXNID,HL_DB,HL_TABLE,HL_PARTITION,HL_LOCK_STATE,HL_LOCK_TYPE,HL_ACQUIRED_AT,HL_BLOCKEDBY_EXT_ID,HL_BLOCKEDBY_INT_ID from HIVE_LOCKS; {code} | HL_LOCK_EXT_ID | HL_LOCK_INT_ID | HL_TXNID | HL_DB | HL_TABLE | HL_PARTITION | HL_LOCK_STATE | HL_LOCK_TYPE | HL_ACQUIRED_AT | HL_BLOCKEDBY_EXT_ID | HL_BLOCKEDBY_INT_ID | | 384 | 1 | 126 | test_acid | acid_test | NULL | a | w | 1529512857000 | NULL | NULL | | 385 | 1 | 127 | test_acid | acid_test | NULL | w | w | NULL | 384 | 1 | | 386 | 1 | 127 | test_acid | acid_test | NULL | a | w | 1529513069000 | NULL | NULL | After going through the code, I found that the second retry does not take other transactions' locks on the LockComponents into account. I wonder if I am using it in a wrong way, or misunderstand something about ACID in Hive. Thanks was: Hi, when using Streaming Mutation recently, I found that LockComponents were not locked correctly by the current transaction. Below is my test case: Step 1: Begin a transaction with transactionId 126; the transaction locks a table and is then left hanging. The lock information was correctly recorded in MariaDB {code:java} MariaDB [hive]> select HL_LOCK_EXT_ID,HL_LOCK_INT_ID,HL_TXNID,HL_DB,HL_TABLE,HL_PARTITION,HL_LOCK_STATE,HL_LOCK_TYPE,HL_ACQUIRED_AT,HL_BLOCKEDBY_EXT_ID,HL_BLOCKEDBY_INT_ID from HIVE_LOCKS; | HL_LOCK_EXT_ID | HL_LOCK_INT_ID | HL_TXNID | HL_DB | HL_TABLE | HL_PARTITION | HL_LOCK_STATE | HL_LOCK_TYPE | HL_ACQUIRED_AT | HL_BLOCKEDBY_EXT_ID | HL_BLOCKEDBY_INT_ID | | 384 | 1 | 126 | test_acid | acid_test | NULL | a | w | 1529512857000 | NULL | NULL | {code} Step 2: Begin another transaction, with transactionId 127, before the previous transaction 126 has finished. Transaction 127 tries to lock the same table too, but fails at the first attempt. The lock information was correctly recorded in MariaDB: Lock 385 was blocked by Lock 384.
{code:java} MariaDB [hive]> select HL_LOCK_EXT_ID,HL_LOCK_INT_ID,HL_TXNID,HL_DB,HL_TABLE,HL_PARTITION,HL_LOCK_STATE,HL_LOCK_TYPE,HL_ACQUIRED_AT,HL_BLOCKEDBY_EXT_ID,HL_BLOCKEDBY_INT_ID from HIVE_LOCKS; | HL_LOCK_EXT_ID | HL_LOCK_INT_ID | HL_TXNID | HL_DB | HL_TABLE | HL_PARTITION | HL_LOCK_STATE | HL_LOCK_TYPE | HL_ACQUIRED_AT | HL_BLOCKEDBY_EXT_ID | HL_BLOCKEDBY_INT_ID |
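The same lock state that the report reads out of the metastore database can also be inspected from Hive itself; a sketch, assuming the database and table names from the test case:

```sql
-- Shows current lock holders and waiters for the table, including
-- which lock each waiter is blocked by (requires DbTxnManager).
USE test_acid;
SHOW LOCKS acid_test EXTENDED;
```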
[jira] [Commented] (HIVE-20232) Basic division operator not working for select statement with join
[ https://issues.apache.org/jira/browse/HIVE-20232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620338#comment-16620338 ] Damien Carol commented on HIVE-20232: - Could you write the output of `DESCRIBE ` ? > Basic division operator not working for select statement with join > -- > > Key: HIVE-20232 > URL: https://issues.apache.org/jira/browse/HIVE-20232 > Project: Hive > Issue Type: Bug > Components: Operators >Reporter: Michael Lee >Priority: Blocker > > Hello, > I am trying to divide the values of two fields that have been joined together > on multiple criteria (offerlevelid, visit_date, days_to_action). For some > rows, the quotient is correct, but for other rows, the result is zero. See > below: > TABLE A: mlee.mit_test1 > select * from mlee.mit_test1 limit 5; > > ||offerlevelid||action_date||visit_date||days_to_action||cluster||cnt|| > |29992|_2018-07-11_|_2018-06-28_|13|11158|1| > |_29991_|_2018-07-12_|_2018-06-18_|24 |11158 |0 | > |_5279_|_2018-07-01_|_2018-05-30_|32|11158 |10 | > |_5279_|_2018-07-01_|_2018-06-02_ |29 |11158 |1 | > |_5279_|_2018-07-02_|_2018-06-29_ |3 |11158 |3 | > > TABLE B: mlee_p2p.num_at_visit_vd > select * from mlee_p2p.num_at_visit_vd limit 5; > ||offerlevelid||action_date||visit_date||days_to_action||cnt|| > |5279|2018-07-06|_2018-06-17_| 19|1696 | > |_5279_|_2018-07-07_|_2018-06-07_| 30|2072 | > |_29991_|_2018-07-11_|_2018-07-09_| 2|361| > |_29991_|_2018-07-10_|_2018-06-10_| 30|116| > |29992 |_2018-07-02_|_2018-06-27_| 5|0 | > > When I attempt to perform division on a.cnt / b.cnt, the results do not make > sense. Specifically, there are results of zero where a.cnt and b.cnt are > integer values. I tried casting both as doubles, but that did not work > either. See below, where I've bolded the "prob" values that do not make > sense. Please advise! 
> > select > a.offerlevelid, > a.days_to_action, > a.visit_date, > a.cluster, > a.cnt at_cluster_vd_dta_cnt, > b.cnt at_vd_dta_cnt, > a.cnt/b.cnt prob > from mlee.mit_test1 a > join mlee_p2p.num_at_visit_vd b on a.offerlevelid=b.offerlevelid > and a.visit_date = b.visit_date > and a.days_to_action = b.days_to_action > order by a.days_to_action,a.visit_date > limit 2000; > ||offerlevelid||days_to_action||visit_date||cluster||at_cluster_vd_dta_cnt||at_vd_dta_cnt||prob|| > |29991|0|2018-07-01 |11158|1|111|.009009009009009009| > |5279|0|2018-07-01|11158|8|3255|_0.002457757296466974_| > |_29992_|0|_2018-07-02_ |11158|0|1|0.0| > |_29991_|0|_2018-07-02_ |11158|2|247|*0.0*| > |_5279_|0|_2018-07-02_ |11158|3|2268|_0.0013227513227513227_| > |_5279_|0|_2018-07-03_|11158|4|3206|_0.0012476606363069245_| > |_29991_|0|_2018-07-03_|11158|1|293|*0.0*| > |_5279_|0|_2018-07-04_|11158|4|3523|_0.0011353959693443088_| > |_29991_|0|_2018-07-04_|11158|2|203|_0.009852216748768473_| > |_29992_|0|_2018-07-05_|11158|0|2|*0.0*| > > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
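One standard way to rule out integer-division semantics (which the reporter says was already tried without success) is an explicit cast of one operand to double, which makes the whole division floating-point; if the zeros persist under this form, the cause lies elsewhere, e.g. in the join. A sketch using the tables from the report:

```sql
-- Casting one operand forces floating-point division for the whole expression.
select
  a.offerlevelid,
  a.days_to_action,
  a.visit_date,
  a.cnt                         as at_cluster_vd_dta_cnt,
  b.cnt                         as at_vd_dta_cnt,
  cast(a.cnt as double) / b.cnt as prob
from mlee.mit_test1 a
join mlee_p2p.num_at_visit_vd b
  on  a.offerlevelid   = b.offerlevelid
  and a.visit_date     = b.visit_date
  and a.days_to_action = b.days_to_action
order by a.days_to_action, a.visit_date;
```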
[jira] [Updated] (HIVE-19747) "GRANT ALL TO USER" failed with NullPointerException
[ https://issues.apache.org/jira/browse/HIVE-19747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-19747: Description: If you issue the command {code:sql} grant all to user abc {code} you will see the following NPE exception. Seems the type in hivePrivObject is not initialized. {noformat} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.isOwner(SQLAuthorizationUtils.java:265) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPrivilegesFromMetaStore(SQLAuthorizationUtils.java:212) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.checkRequiredPrivileges(GrantPrivAuthUtils.java:64) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.authorize(GrantPrivAuthUtils.java:50) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.grantPrivileges(SQLStdHiveAccessController.java:179) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.grantPrivileges(SQLStdHiveAccessControllerWrapper.java:70) at org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.grantPrivileges(HiveAuthorizerImpl.java:48) at org.apache.hadoop.hive.ql.exec.DDLTask.grantOrRevokePrivileges(DDLTask.java:1123 {noformat} was: If you issue the command 'grant all to user abc', you will see the following NPE exception. Seems the type in hivePrivObject is not initialized. {noformat} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
java.lang.NullPointerException at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.isOwner(SQLAuthorizationUtils.java:265) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPrivilegesFromMetaStore(SQLAuthorizationUtils.java:212) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.checkRequiredPrivileges(GrantPrivAuthUtils.java:64) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.authorize(GrantPrivAuthUtils.java:50) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.grantPrivileges(SQLStdHiveAccessController.java:179) at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.grantPrivileges(SQLStdHiveAccessControllerWrapper.java:70) at org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.grantPrivileges(HiveAuthorizerImpl.java:48) at org.apache.hadoop.hive.ql.exec.DDLTask.grantOrRevokePrivileges(DDLTask.java:1123 {noformat} > "GRANT ALL TO USER" failed with NullPointerException > > > Key: HIVE-19747 > URL: https://issues.apache.org/jira/browse/HIVE-19747 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Priority: Minor > > If you issue the command > {code:sql} > grant all to user abc > {code} > you will see the following NPE exception. Seems the type in hivePrivObject is > not initialized. > {noformat} > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. 
java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.isOwner(SQLAuthorizationUtils.java:265) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPrivilegesFromMetaStore(SQLAuthorizationUtils.java:212) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.checkRequiredPrivileges(GrantPrivAuthUtils.java:64) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.authorize(GrantPrivAuthUtils.java:50) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.grantPrivileges(SQLStdHiveAccessController.java:179) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.grantPrivileges(SQLStdHiveAccessControllerWrapper.java:70) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.grantPrivileges(HiveAuthorizerImpl.java:48) > at > org.apache.hadoop.hive.ql.exec.DDLTask.grantOrRevokePrivileges(DDLTask.java:1123 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
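Since the NPE comes from `SQLAuthorizationUtils.isOwner` operating on an object whose type was never initialized, the object-scoped form of GRANT appears to avoid the failing code path; a workaround sketch with a hypothetical table name:

```sql
-- Hypothetical table name; grant on a concrete object instead of the
-- bare "GRANT ALL TO USER" form that triggers the NPE.
GRANT ALL ON TABLE some_table TO USER abc;
```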
[jira] [Updated] (HIVE-18394) Materialized view: "Create Materialized View" should default to rewritable ones
[ https://issues.apache.org/jira/browse/HIVE-18394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-18394: Description: This is a usability ticket, since it is possible to end up creating materialized views and realize that they need an additional flag before the optimizer will pick them up for rewrites. {code:sql} create materialized view ca as select * from customer, customer_address where c_current_addr_sk = ca_address_sk; set hive.materializedview.rewriting=true; select count(1) from customer, customer_address where c_current_addr_sk = ca_address_sk; -- does not use materialized view {code} Needs another step {code:sql} alter materialized view ca enable rewrite; {code} And then, it kicks in {code:sql} select count(1) from customer, customer_address where c_current_addr_sk = ca_address_sk; OK 1200 Time taken: 0.494 seconds, Fetched: 1 row(s) {code} was: This is a usability ticket, since it is possible to end up creating materialized views and realize that they need an additional flag before the optimizer will pick them up for rewrites.
{code} create materialized view ca as select * from customer, customer_address where c_current_addr_sk = ca_address_sk; set hive.materializedview.rewriting=true; select count(1) from customer, customer_address where c_current_addr_sk = ca_address_sk; -- does not use materialized view {code} Needs another step {code} alter materialized view ca enable rewrite; {code} And then, it kicks in {code} select count(1) from customer, customer_address where c_current_addr_sk = ca_address_sk; OK 1200 Time taken: 0.494 seconds, Fetched: 1 row(s) {code} > Materialized view: "Create Materialized View" should default to rewritable > ones > --- > > Key: HIVE-18394 > URL: https://issues.apache.org/jira/browse/HIVE-18394 > Project: Hive > Issue Type: Improvement > Components: Materialized views >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez >Priority: Blocker > > This is a usability ticket, since it is possible to end up creating > materialized views and realize that they need an additional flag before the > optimizer will pick them up for rewrites. > {code:sql} > create materialized view ca as select * from customer, customer_address where > c_current_addr_sk = ca_address_sk; > set hive.materializedview.rewriting=true; > select count(1) from customer, customer_address where c_current_addr_sk = > ca_address_sk; -- does not use materialized view > {code} > Needs another step > {code:sql} > alter materialized view ca enable rewrite; > {code} > And then, it kicks in > {code:sql} > select count(1) from customer, customer_address where c_current_addr_sk = > ca_address_sk; > OK > 1200 > Time taken: 0.494 seconds, Fetched: 1 row(s) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
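With the default flipped as this ticket proposes, opting out rather than opting in becomes the explicit step. A sketch of how that looks, assuming Hive 3's `DISABLE REWRITE` clause on `CREATE MATERIALIZED VIEW`:

```sql
-- With rewriting on by default, a materialized view is usable by the
-- optimizer as soon as it is created; DISABLE REWRITE opts out.
create materialized view ca_noopt disable rewrite as
  select * from customer, customer_address
  where c_current_addr_sk = ca_address_sk;

-- The explicit enable from the description remains available for opted-out views:
alter materialized view ca_noopt enable rewrite;
```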
[jira] [Commented] (HIVE-17591) Getting error while select,insert,update,delete to hive from IBM SPSS Modeler server
[ https://issues.apache.org/jira/browse/HIVE-17591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376060#comment-16376060 ] Damien Carol commented on HIVE-17591: - [~gundaks] I don't think this is the right place for this kind of problem. This looks like a configuration issue with access to this table. Try the Hive user mailing list instead. > Getting error while select,insert,update,delete to hive from IBM SPSS > Modeler server > -- > > Key: HIVE-17591 > URL: https://issues.apache.org/jira/browse/HIVE-17591 > Project: Hive > Issue Type: Bug > Components: Hive, HiveServer2 >Affects Versions: 1.2.2 > Environment: Redhat 7.2 >Reporter: sudarshana >Priority: Blocker > Fix For: 0.13.1 > > > We are getting this error while connecting from IBM SPSS Modeler server, > which in turn uses SPSS analytical server. We have everything (Hive > authorization, privileges, etc.) set up. > A SQL exception occurred. The error is: > org.apache.hive.service.cli.HiveSQLException: Error while compiling > statement: FAILED: HiveAccessControlException Permission denied: > Principal[name=,type] does not have following privileges for operation > QUERY[[SELECT] on Object[type=TABLE_OR_VIEW,name]] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
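If SQL-standard based authorization is in effect, the missing SELECT privilege from the error message would typically be granted as below; the object and user names are hypothetical placeholders:

```sql
-- Hypothetical names; run as a user who holds the privilege WITH GRANT OPTION
-- (or as an admin) on the object the SPSS connection queries.
GRANT SELECT ON TABLE some_db.some_table TO USER spss_user;
```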
[jira] [Updated] (HIVE-16924) Support distinct in presence Gby
[ https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-16924: Description: {code:sql} create table e011_01 (c1 int, c2 smallint); insert into e011_01 values (1, 1), (2, 2); {code} These queries should work: {code:sql} select distinct c1, count(*) from e011_01 group by c1; select distinct c1, avg(c2) from e011_01 group by c1; {code} Currently, you get : FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the same query. Error encountered near token 'c1' was: create table e011_01 (c1 int, c2 smallint); insert into e011_01 values (1, 1), (2, 2); These queries should work: select distinct c1, count(*) from e011_01 group by c1; select distinct c1, avg(c2) from e011_01 group by c1; Currently, you get : FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the same query. Error encountered near token 'c1' > Support distinct in presence Gby > - > > Key: HIVE-16924 > URL: https://issues.apache.org/jira/browse/HIVE-16924 > Project: Hive > Issue Type: New Feature > Components: Query Planning >Reporter: Carter Shanklin >Assignee: Julian Hyde > Attachments: HIVE-16924.01.patch > > > {code:sql} > create table e011_01 (c1 int, c2 smallint); > insert into e011_01 values (1, 1), (2, 2); > {code} > These queries should work: > {code:sql} > select distinct c1, count(*) from e011_01 group by c1; > select distinct c1, avg(c2) from e011_01 group by c1; > {code} > Currently, you get : > FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the > same query. Error encountered near token 'c1' -- This message was sent by Atlassian JIRA (v6.4.14#64029)
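Until the restriction is lifted, the same results can be expressed by pushing the aggregate into a subquery and applying DISTINCT outside it; a sketch against the table from the description:

```sql
-- Equivalent to: select distinct c1, count(*) from e011_01 group by c1;
select distinct c1, cnt
from (select c1, count(*) as cnt from e011_01 group by c1) t;

-- Equivalent to: select distinct c1, avg(c2) from e011_01 group by c1;
select distinct c1, avg_c2
from (select c1, avg(c2) as avg_c2 from e011_01 group by c1) t;
```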
[jira] [Updated] (HIVE-12384) Union Operator may produce incorrect result on TEZ
[ https://issues.apache.org/jira/browse/HIVE-12384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12384: Description: Union queries may produce an incorrect result on TEZ. TEZ removes the union op, and thus might lose the implicit cast in the union op. Reproduction test case: {code:sql} set hive.cbo.enable=false; set hive.execution.engine=tez; select (x/sum(x) over()) as y from(select cast(1 as decimal(10,0)) as x from (select * from src limit 2)s1 union all select cast(1 as decimal(10,0)) x from (select * from src limit 2) s2 union all select '1' x from (select * from src limit 2) s3)u order by y; select (x/sum(x) over()) as y from(select cast(1 as decimal(10,0)) as x from (select * from src limit 2)s1 union all select cast(1 as decimal(10,0)) x from (select * from src limit 2) s2 union all select cast (null as string) x from (select * from src limit 2) s3)u order by y; {code} was: Union queries may produce an incorrect result on TEZ. TEZ removes the union op, and thus might lose the implicit cast in the union op.
Reproduction test case: set hive.cbo.enable=false; set hive.execution.engine=tez; select (x/sum(x) over()) as y from(select cast(1 as decimal(10,0)) as x from (select * from src limit 2)s1 union all select cast(1 as decimal(10,0)) x from (select * from src limit 2) s2 union all select '1' x from (select * from src limit 2) s3)u order by y; select (x/sum(x) over()) as y from(select cast(1 as decimal(10,0)) as x from (select * from src limit 2)s1 union all select cast(1 as decimal(10,0)) x from (select * from src limit 2) s2 union all select cast (null as string) x from (select * from src limit 2) s3)u order by y; > Union Operator may produce incorrect result on TEZ > -- > > Key: HIVE-12384 > URL: https://issues.apache.org/jira/browse/HIVE-12384 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 0.14.0, 1.0.0, 1.1.0, 1.0.1, 1.2.1 >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > Fix For: 2.0.0 > > Attachments: HIVE-12384.1.patch, HIVE-12384.2.patch, > HIVE-12384.3.patch, HIVE-12384.4.patch > > > Union queries may produce an incorrect result on TEZ. > TEZ removes the union op, and thus might lose the implicit cast in the union op. > Reproduction test case: > {code:sql} > set hive.cbo.enable=false; > set hive.execution.engine=tez; > select (x/sum(x) over()) as y from(select cast(1 as decimal(10,0)) as x > from (select * from src limit 2)s1 union all select cast(1 as decimal(10,0)) > x from (select * from src limit 2) s2 union all select '1' x from > (select * from src limit 2) s3)u order by y; > select (x/sum(x) over()) as y from(select cast(1 as decimal(10,0)) as x from > (select * from src limit 2)s1 union all select cast(1 as decimal(10,0)) x > from (select * from src limit 2) s2 union all select cast (null as string) x > from (select * from src limit 2) s3)u order by y; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
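A workaround consistent with the diagnosis (the union operator's implicit cast being dropped by Tez) is to make every branch's type explicit, so no implicit cast is needed in the first place; a sketch of the first reproduction query:

```sql
-- Cast the string branch explicitly so all three branches produce
-- decimal(10,0) and there is no implicit cast for Tez to lose.
select (x / sum(x) over()) as y
from (select cast(1 as decimal(10,0)) as x from (select * from src limit 2) s1
      union all
      select cast(1 as decimal(10,0)) as x from (select * from src limit 2) s2
      union all
      select cast('1' as decimal(10,0)) as x from (select * from src limit 2) s3) u
order by y;
```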
[jira] [Updated] (HIVE-6905) Implement Auto increment, primary-foreign Key, not null constraints and default value in Hive Table columns
[ https://issues.apache.org/jira/browse/HIVE-6905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-6905: --- Description: For Hive to replace a modern datawarehouse based on RDBMS, it must have support for keys, constraints, auto-increment values, surrogate keys and not null features etc. Many customers do not move their EDW to Hive due to these reasons as these have been challenging to maintain in Hive. This must be implemented once https://issues.apache.org/jira/browse/HIVE-5317 for Updates, Deletes and Inserts are done in Hive. This should be next step for Hive enhancement to take it closer to a very wide mainstream adoption.. was: For Hive to replace a modern datawarehouse based on RDBMS, it must have support for keys, constraints, auto-increment values, surrogate keys and not null features etc. Many customers do not move their EDW to Hive due to these reasons as these have been challenging to maintain in Hive. This must be implemented once https://issues.apache.org/jira/browse/HIVE-5317 for Updates, Deletes and Inserts are done in Hive. This should be next stop for Hive enhancement to take it closer to a very wide mainstream adoption.. > Implement Auto increment, primary-foreign Key, not null constraints and > default value in Hive Table columns > > > Key: HIVE-6905 > URL: https://issues.apache.org/jira/browse/HIVE-6905 > Project: Hive > Issue Type: New Feature > Components: Database/Schema >Affects Versions: 0.14.0 >Reporter: Pardeep Kumar > > For Hive to replace a modern datawarehouse based on RDBMS, it must have > support for keys, constraints, auto-increment values, surrogate keys and not > null features etc. Many customers do not move their EDW to Hive due to these > reasons as these have been challenging to maintain in Hive. > This must be implemented once https://issues.apache.org/jira/browse/HIVE-5317 > for Updates, Deletes and Inserts are done in Hive. 
This should be next step > for Hive enhancement to take it closer to a very wide mainstream adoption.. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
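For context (not part of this ticket): Hive 2.1 later gained informational, non-enforced key constraints; a sketch of that syntax, where the table and constraint names are illustrative:

```sql
-- Informational constraints only: metadata hints for the optimizer.
-- DISABLE NOVALIDATE means Hive neither enforces nor validates them.
create table dim_customer (
  customer_id bigint,
  name string,
  primary key (customer_id) disable novalidate
);

create table fact_sales (
  sale_id bigint,
  customer_id bigint,
  constraint fk_cust foreign key (customer_id)
    references dim_customer (customer_id) disable novalidate
);
```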
[jira] [Commented] (HIVE-13554) [Umbrella] SQL:2011 compliance
[ https://issues.apache.org/jira/browse/HIVE-13554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573178#comment-15573178 ] Damien Carol commented on HIVE-13554: - [~ashutoshc] do you have any specification document? What is your reference? > [Umbrella] SQL:2011 compliance > -- > > Key: HIVE-13554 > URL: https://issues.apache.org/jira/browse/HIVE-13554 > Project: Hive > Issue Type: Improvement > Components: SQL >Affects Versions: 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > > There are various gaps in the language which need to be addressed to bring > Hive under SQL:2011 compliance -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15568493#comment-15568493 ] Damien Carol commented on HIVE-13280: - Yes, this fixes the problem. > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table has 45 ORC files. It generates 45 mappers. > Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE on aa_temp and redo the query, everything is > fine because there is only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
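The "Must specify table name" failure comes from HBase's `TableOutputFormat.setConf`, which reads the output table name from job configuration; that property can be set on the Hive table itself. A sketch of the DDL, where the column mapping is illustrative rather than taken from the report:

```sql
-- "hbase.mapred.output.outputtable" is the property TableOutputFormat reads;
-- without it, multi-mapper output jobs can fail with "Must specify table name".
CREATE EXTERNAL TABLE register.register (key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:value")
TBLPROPERTIES (
  "hbase.table.name" = "register",
  "hbase.mapred.output.outputtable" = "register"
);
```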
[jira] [Resolved] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol resolved HIVE-13280. - Resolution: Invalid Assignee: Damien Carol > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol >Assignee: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table have 45 orc files. It generate 45 mappers. > Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE for aa_temp. And redo the query. Everything is > fine because there are only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15568492#comment-15568492 ] Damien Carol commented on HIVE-13280: - Yes, this fixes the problem. > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table have 45 orc files. It generate 45 mappers. > Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE for aa_temp. And redo the query. Everything is > fine because there are only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9853) Bad version tested in org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java
[ https://issues.apache.org/jira/browse/HIVE-9853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280335#comment-15280335 ] Damien Carol commented on HIVE-9853: No answer from the reporter for many months. > Bad version tested in org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java > > > Key: HIVE-9853 > URL: https://issues.apache.org/jira/browse/HIVE-9853 > Project: Hive > Issue Type: Test >Affects Versions: 1.0.0 >Reporter: Laurent GAY >Assignee: Damien Carol > Attachments: correct_version_test.patch > > > The test getHiveVersion in class > org.apache.hive.hcatalog.templeton.TestWebHCatE2e check bad format of version. > It checks "0.[0-9]+.[0-9]+.*" and not "1.[0-9]+.[0-9]+.*" > This test is failed for hive, tag "release-1.0.0" > I propose a patch to correct it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
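The underlying complaint is that the test pins the major version, so it fails on every new release line. A minimal, self-contained sketch of a release-agnostic check (class and method names are made up; the real test lives in org.apache.hive.hcatalog.templeton.TestWebHCatE2e):

```java
import java.util.regex.Pattern;

public class VersionPatternSketch {
    // The original assertion hard-codes the major version ("0.[0-9]+.[0-9]+.*"),
    // which breaks as soon as Hive ships 1.x. Accepting any major avoids that:
    static final Pattern VERSION = Pattern.compile("[0-9]+\\.[0-9]+\\.[0-9]+.*");

    static boolean isValidVersion(String v) {
        return VERSION.matcher(v).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidVersion("0.14.0"));          // true
        System.out.println(isValidVersion("1.0.0"));           // true: passes where the 0.x check failed
        System.out.println(isValidVersion("2.3.3-SNAPSHOT"));  // true: qualifier absorbed by ".*"
        System.out.println(isValidVersion("release-1.0.0"));   // false: a tag name, not a version
    }
}
```

Note the dots in the JIRA-quoted patterns are unescaped; a literal `\.` (as above) is stricter and avoids matching, e.g., "0x14x0".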
[jira] [Commented] (HIVE-13734) How to initialize hive metastore database
[ https://issues.apache.org/jira/browse/HIVE-13734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280321#comment-15280321 ] Damien Carol commented on HIVE-13734: - [~ljy423897608] Your JIRA ticket seems to be a configuration question, no? If that's the case, please use the user mailing list. You will get more feedback for this kind of question, trust me. If that's OK, I will change this JIRA to "not a bug". > How to initialize hive metastore database > - > > Key: HIVE-13734 > URL: https://issues.apache.org/jira/browse/HIVE-13734 > Project: Hive > Issue Type: Test > Components: Configuration, Database/Schema >Affects Versions: 2.0.0 >Reporter: Lijiayong >Assignee: Lijiayong > Labels: mesosphere > Fix For: 2.0.0 > > Original Estimate: 504h > Remaining Estimate: 504h > > When run "hive",there is a mistake:Exception in thread "main" > java.lang.RuntimeException:Hive metastore database is not initializad.Please > use schematool to create the schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
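For reference, the error message's own remedy is Hive's schematool. A sketch of the usual invocation (the "mysql" dbType is only an example; pick derby, postgres, oracle, or mssql to match the metastore database configured in hive-site.xml):

{code}
# Initialize the metastore schema once, before first starting Hive:
$HIVE_HOME/bin/schematool -dbType mysql -initSchema

# To inspect or upgrade an existing (partially created) schema instead:
$HIVE_HOME/bin/schematool -dbType mysql -info
$HIVE_HOME/bin/schematool -dbType mysql -upgradeSchema
{code}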
[jira] [Updated] (HIVE-9853) Bad version tested in org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java
[ https://issues.apache.org/jira/browse/HIVE-9853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9853: --- Resolution: Won't Fix Status: Resolved (was: Patch Available) > Bad version tested in org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java > > > Key: HIVE-9853 > URL: https://issues.apache.org/jira/browse/HIVE-9853 > Project: Hive > Issue Type: Test >Affects Versions: 1.0.0 >Reporter: Laurent GAY >Assignee: Damien Carol > Attachments: correct_version_test.patch > > > The test getHiveVersion in class > org.apache.hive.hcatalog.templeton.TestWebHCatE2e check bad format of version. > It checks "0.[0-9]+.[0-9]+.*" and not "1.[0-9]+.[0-9]+.*" > This test is failed for hive, tag "release-1.0.0" > I propose a patch to correct it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13734) How to initialize hive metastore database
[ https://issues.apache.org/jira/browse/HIVE-13734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15280324#comment-15280324 ] Damien Carol commented on HIVE-13734: - Here => https://hive.apache.org/mailing_lists.html > How to initialize hive metastore database > - > > Key: HIVE-13734 > URL: https://issues.apache.org/jira/browse/HIVE-13734 > Project: Hive > Issue Type: Test > Components: Configuration, Database/Schema >Affects Versions: 2.0.0 >Reporter: Lijiayong >Assignee: Damien Carol > Labels: mesosphere > Fix For: 2.0.0 > > Original Estimate: 504h > Remaining Estimate: 504h > > When run "hive",there is a mistake:Exception in thread "main" > java.lang.RuntimeException:Hive metastore database is not initializad.Please > use schematool to create the schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13734) How to initialize hive metastore database
[ https://issues.apache.org/jira/browse/HIVE-13734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol reassigned HIVE-13734: --- Assignee: Damien Carol (was: Lijiayong) > How to initialize hive metastore database > - > > Key: HIVE-13734 > URL: https://issues.apache.org/jira/browse/HIVE-13734 > Project: Hive > Issue Type: Test > Components: Configuration, Database/Schema >Affects Versions: 2.0.0 >Reporter: Lijiayong >Assignee: Damien Carol > Labels: mesosphere > Fix For: 2.0.0 > > Original Estimate: 504h > Remaining Estimate: 504h > > When run "hive",there is a mistake:Exception in thread "main" > java.lang.RuntimeException:Hive metastore database is not initializad.Please > use schematool to create the schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13264) JDBC driver makes 2 Open Session Calls for every open session
[ https://issues.apache.org/jira/browse/HIVE-13264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201213#comment-15201213 ] Damien Carol commented on HIVE-13264: - [~nithinmahesh] you modified the Thrift protocol in your patch? Adding *TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V9* is a little intrusive, IMHO. > JDBC driver makes 2 Open Session Calls for every open session > - > > Key: HIVE-13264 > URL: https://issues.apache.org/jira/browse/HIVE-13264 > Project: Hive > Issue Type: Bug > Components: JDBC >Reporter: NITHIN MAHESH >Assignee: NITHIN MAHESH > Labels: jdbc > Attachments: HIVE-13264.1.patch, HIVE-13264.2.patch, HIVE-13264.patch > > > When HTTP is used as the transport mode by the Hive JDBC driver, we noticed > that there is an additional open/close session just to validate the > connection. > > TCLIService.Iface client = new TCLIService.Client(new > TBinaryProtocol(transport)); > TOpenSessionResp openResp = client.OpenSession(new TOpenSessionReq()); > if (openResp != null) { > client.CloseSession(new > TCloseSessionReq(openResp.getSessionHandle())); > } > > The open session call is a costly one and should not be used to test > transport. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-13280: Description: With a simple query (select from an ORC table and insert into an HBase external table): {code:sql} insert into table register.register select * from aa_temp {code} The aa_temp table has 45 ORC files, which generates 45 mappers. Some mappers fail with this error: {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) ... 25 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2) {noformat} If I do an ALTER CONCATENATE on aa_temp and redo the query, everything is fine because there is only one mapper. 
was: With a simple query (select from orc table and insert into HBase external table): {code:sql} insert into table register.register select * from aa_temp {code} Some mapper fail with this error: {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) ... 25 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2) {noformat} > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table have 45 orc files. It generate 45 mappers. 
> Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE for aa_temp. And redo the query. Everything is > fine because there are only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-13280: Component/s: HBase Handler > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > Some mapper fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-13280: Description: {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) ... 25 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2) {noformat} > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Damien Carol > > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 
25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority
[ https://issues.apache.org/jira/browse/HIVE-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-13046: Description: When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, Hive will lowercase it to {{1.2.3-snapshot}} due to: {code:title=DependencyResolver.java#84} String[] authorityTokens = authority.toLowerCase().split(":"); {code} We should not {{.lowerCase()}}. RB: https://reviews.apache.org/r/43513 was: When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, Hive will lowercase it to {{1.2.3-snapshot}} due to: {code:title=DependencyResolver.java} String[] authorityTokens = authority.toLowerCase().split(":"); {code} We should not {{.lowerCase()}}. RB: https://reviews.apache.org/r/43513 > DependencyResolver should not lowercase the dependency URI's authority > -- > > Key: HIVE-13046 > URL: https://issues.apache.org/jira/browse/HIVE-13046 > Project: Hive > Issue Type: Bug >Reporter: Anthony Hsu >Assignee: Anthony Hsu > Attachments: HIVE-13046.1.patch > > > When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, > Hive will lowercase it to {{1.2.3-snapshot}} due to: > {code:title=DependencyResolver.java#84} > String[] authorityTokens = authority.toLowerCase().split(":"); > {code} > We should not {{.lowerCase()}}. > RB: https://reviews.apache.org/r/43513 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
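The bug above is easy to reproduce outside Hive: with {{ivy://}} coordinates, the {{org:module:version}} triple lives in the URI authority, so lowercasing the whole authority mangles the version. A self-contained illustration (the coordinate is made up; the real code is in DependencyResolver):

```java
import java.net.URI;

public class IvyUriSketch {
    // Extract the version component of an ivy:// coordinate's authority.
    // When lowercaseAuthority is true, this mimics the pre-HIVE-13046 code path.
    static String version(String ivyUri, boolean lowercaseAuthority) throws Exception {
        String authority = new URI(ivyUri).getAuthority();
        if (lowercaseAuthority) {
            authority = authority.toLowerCase(); // old behaviour: mangles SNAPSHOT
        }
        return authority.split(":")[2];
    }

    public static void main(String[] args) throws Exception {
        String uri = "ivy://org.example:my-lib:1.2.3-SNAPSHOT";
        System.out.println(version(uri, true));   // 1.2.3-snapshot (resolves the wrong artifact)
        System.out.println(version(uri, false));  // 1.2.3-SNAPSHOT (case preserved, as in the fix)
    }
}
```

Note {{java.net.URI}} accepts the multi-colon authority here because it falls back to registry-based parsing when server-based (host:port) parsing fails.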
[jira] [Commented] (HIVE-11647) Bump hbase version to 1.1.1
[ https://issues.apache.org/jira/browse/HIVE-11647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15136896#comment-15136896 ] Damien Carol commented on HIVE-11647: - This is already in master. > Bump hbase version to 1.1.1 > --- > > Key: HIVE-11647 > URL: https://issues.apache.org/jira/browse/HIVE-11647 > Project: Hive > Issue Type: Sub-task > Components: HBase Handler >Reporter: Swarnim Kulkarni >Assignee: Swarnim Kulkarni > Attachments: HIVE-11647.1.patch.txt, HIVE-11647.2.patch.txt > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11349) Update HBase metastore hbase version to 1.1.1
[ https://issues.apache.org/jira/browse/HIVE-11349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11349: Component/s: HBase Metastore > Update HBase metastore hbase version to 1.1.1 > - > > Key: HIVE-11349 > URL: https://issues.apache.org/jira/browse/HIVE-11349 > Project: Hive > Issue Type: Task > Components: HBase Metastore, Metastore >Affects Versions: hbase-metastore-branch >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: hbase-metastore-branch > > Attachments: HIVE-11349.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11379) Bump Tephra version to 0.6.0
[ https://issues.apache.org/jira/browse/HIVE-11379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11379: Component/s: HBase Metastore > Bump Tephra version to 0.6.0 > > > Key: HIVE-11379 > URL: https://issues.apache.org/jira/browse/HIVE-11379 > Project: Hive > Issue Type: Task > Components: HBase Metastore, Metastore >Affects Versions: hbase-metastore-branch >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: hbase-metastore-branch > > Attachments: HIVE-11379.patch > > > HIVE-11349 (which moved the HBase version to 1.1.1) moved Tephra support to > 0.5.1-SNAPSHOT because that was the only thing that supported HBase 1.0. > Since Tephra has now released a 0.6 that supports HBase 1.0 we should move to > it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests
[ https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12167: Component/s: HBase Metastore > HBase metastore causes massive number of ZK exceptions in MiniTez tests > --- > > Key: HIVE-12167 > URL: https://issues.apache.org/jira/browse/HIVE-12167 > Project: Hive > Issue Type: Bug > Components: HBase Metastore >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12167.patch > > > I ran some random test (vectorization_10) with HBase metastore for unrelated > reason, and I see large number of exceptions in hive.log > {noformat} > $ grep -c "ConnectionLoss" hive.log > 52 > $ grep -c "Connection refused" hive.log > 1014 > {noformat} > These log lines' count has increased by ~33% since merging llap branch, but > it is still high before that (39/~700) for the same test). These lines are > not present if I disable HBase metastore. > The exceptions are: > {noformat} > 2015-10-13T17:51:06,232 WARN [Thread-359-SendThread(localhost:2181)]: > zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server > null, unexpected error, closing socket connection and attempting reconnect > java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > ~[?:1.8.0_45] > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) > ~[?:1.8.0_45] > at > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) > ~[zookeeper-3.4.6.jar:3.4.6-1569965] > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) > [zookeeper-3.4.6.jar:3.4.6-1569965] > {noformat} > that is retried for some seconds and then > {noformat} > 2015-10-13T17:51:22,867 WARN [Thread-359]: zookeeper.ZKUtil > (ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, > quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode > (/hbase/hbaseid) > 
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode > = ConnectionLoss for /hbase/hbaseid > at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) > ~[zookeeper-3.4.6.jar:3.4.6-1569965] > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > ~[zookeeper-3.4.6.jar:3.4.6-1569965] > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) > ~[zookeeper-3.4.6.jar:3.4.6-1569965] > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222) > ~[hbase-client-1.1.1.jar:1.1.1] > at > org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) > [hbase-client-1.1.1.jar:1.1.1] > at > org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65) > [hbase-client-1.1.1.jar:1.1.1] > at > org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105) > [hbase-client-1.1.1.jar:1.1.1] > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879) > [hbase-client-1.1.1.jar:1.1.1] > at > org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.(ConnectionManager.java:635) > [hbase-client-1.1.1.jar:1.1.1] > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) ~[?:1.8.0_45] > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > [?:1.8.0_45] > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > [?:1.8.0_45] > at java.lang.reflect.Constructor.newInstance(Constructor.java:422) > [?:1.8.0_45] > at > org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238) > [hbase-client-1.1.1.jar:1.1.1] > at > org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420) > [hbase-client-1.1.1.jar:1.1.1] > at > 
org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329) > [hbase-client-1.1.1.jar:1.1.1] > at > org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144) > [hbase-client-1.1.1.jar:1.1.1] > at > org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56) > [hive-metastore-2.0.0-SNAPSHOT.jar:?] > at > org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.(HBaseReadWrite.java:227) > [hive-metastore-2.0.0-SNAPSHOT.jar:?] > at > org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.(HBaseReadWrite.java:83) > [hive-metastore-2.0.0-SNAPSHOT.jar:?] > at >
[jira] [Commented] (HIVE-11111) Insert on skewed table with STORED AS DIRECTORY is broken
[ https://issues.apache.org/jira/browse/HIVE-11111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123337#comment-15123337 ] Damien Carol commented on HIVE-11111: - [~eyushin] Thanks, this fixed my issue. What are the side effects of _hive.mapred.supports.subdirectories_? Is it safe to set it to _true_ all the time? Why is the default value not _true_? > Insert on skewed table with STORED AS DIRECTORY is broken > - > > Key: HIVE-11111 > URL: https://issues.apache.org/jira/browse/HIVE-11111 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Damien Carol > > Doing these queries fails: > {code:sql} > RESET; > DROP TABLE IF EXISTS testskew; > CREATE TABLE IF NOT EXISTS testskew (key int, value STRING) > SKEWED BY (key) ON (1,5,6) STORED AS DIRECTORIES > STORED AS ORC; > insert into testskew VALUES > (1, 'one'), > (1, 'one'), > (1, 'one'), > (1, 'one'), > (1, 'one'), > (1, 'one'), > (2, 'two'), > (3, 'three'), > (5, 'five'), > (5, 'five'), > (5, 'five'), > (5, 'five'), > (5, 'five'), > (6, 'six'), > (6, 'six'), > (6, 'six'), > (6, 'six'), > (6, 'six'), > (6, 'six'); > {code} > Stacktrace: > {noformat} > INFO : Session is already open > INFO : > INFO : Status: Running (Executing on YARN cluster with App id > application_1434957292922_0059) > INFO : Map 1: 0/1 > INFO : Map 1: 0(+1)/1 > INFO : Map 1: 1/1 > INFO : Loading data to table test.testskew from > hdfs://nc-h07/user/hive/warehouse/test.db/testskew/.hive-staging_hive_2015-06-25_17-29-34_385_4424227988595852796-14/-ext-1 > ERROR : Failed with exception checkPaths: > hdfs://nc-h07/user/hive/warehouse/test.db/testskew/.hive-staging_hive_2015-06-25_17-29-34_385_4424227988595852796-14/-ext-1 > has nested directory > hdfs://nc-h07/user/hive/warehouse/test.db/testskew/.hive-staging_hive_2015-06-25_17-29-34_385_4424227988595852796-14/-ext-1/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > org.apache.hadoop.hive.ql.metadata.HiveException: checkPaths: > 
hdfs://nc-h07/user/hive/warehouse/test.db/testskew/.hive-staging_hive_2015-06-25_17-29-34_385_4424227988595852796-14/-ext-1 > has nested directory > hdfs://nc-h07/user/hive/warehouse/test.db/testskew/.hive-staging_hive_2015-06-25_17-29-34_385_4424227988595852796-14/-ext-1/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME > at org.apache.hadoop.hive.ql.metadata.Hive.checkPaths(Hive.java:2466) > at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2701) > at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1645) > at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:297) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) > at > org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71) > at > org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:744) > 
Error: Error while processing statement: FAILED: Execution Error, return code > 1 from org.apache.hadoop.hive.ql.exec.MoveTask (state=08S01,code=1) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
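The comment thread's workaround is the subdirectory-support setting it asks about. A sketch of the session-level settings (the second property is an assumption, the usual MapReduce companion setting for reading nested directories, not something stated in the thread):

{code:sql}
-- Allow Hive to handle the nested list-bucketing (STORED AS DIRECTORIES) paths:
SET hive.mapred.supports.subdirectories=true;
SET mapreduce.input.fileinputformat.input.dir.recursive=true;
{code}

As the comment itself asks, enabling these globally may have side effects on other queries, which is why a per-session SET is the safer sketch.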
[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Transitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12478: Component/s: CBO > Improve Hive/Calcite Transitive Predicate inference > --- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.0.0, 2.1.0 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, > HIVE-12478.03.patch, HIVE-12478.04.patch, HIVE-12478.05.patch, > HIVE-12478.06.patch, HIVE-12478.07.patch, HIVE-12478.08.patch, > HIVE-12478.09.patch, HIVE-12478.10.patch, HIVE-12478.patch > > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > {code:sql} > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Transitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12478: Description: HiveJoinPushTransitivePredicatesRule does not pull up predicates for transitive inference if they contain more than one column. {code:sql} EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); {code} was: HiveJoinPushTransitivePredicatesRule does not pull up predicates for transitive inference if they contain more than one column. EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); > Improve Hive/Calcite Transitive Predicate inference > --- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.0.0, 2.1.0 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, > HIVE-12478.03.patch, HIVE-12478.04.patch, HIVE-12478.05.patch, > HIVE-12478.06.patch, HIVE-12478.07.patch, HIVE-12478.08.patch, > HIVE-12478.09.patch, HIVE-12478.10.patch, HIVE-12478.patch > > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > {code:sql} > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Transitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12478: Summary: Improve Hive/Calcite Transitive Predicate inference (was: Improve Hive/Calcite Trasitive Predicate inference) > Improve Hive/Calcite Transitive Predicate inference > --- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, > HIVE-12478.03.patch, HIVE-12478.04.patch, HIVE-12478.05.patch, > HIVE-12478.06.patch, HIVE-12478.07.patch, HIVE-12478.08.patch, > HIVE-12478.09.patch, HIVE-12478.10.patch, HIVE-12478.patch > > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
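The missed optimization can be pictured with a toy inference routine: given the equi-join condition `srcpart.ds = s.ds` and the literal predicate `s.ds = '2008-04-08'`, a transitive-inference rule should derive the same predicate on `srcpart.ds`, enabling partition pruning on srcpart. This sketch only illustrates the idea; it is unrelated to Calcite's actual HiveJoinPushTransitivePredicatesRule implementation:

```python
# Toy transitive predicate inference over equi-join conditions:
# from a = b and b = <literal>, derive a = <literal>.
def infer_transitive(join_equalities, predicates):
    """join_equalities: set of (colA, colB) pairs; predicates: dict of
    column -> literal. Returns the predicates plus inferred ones."""
    inferred = dict(predicates)
    changed = True
    while changed:                      # fixpoint over the join graph
        changed = False
        for a, b in join_equalities:
            for src, dst in ((a, b), (b, a)):
                if src in inferred and dst not in inferred:
                    inferred[dst] = inferred[src]
                    changed = True
    return inferred

preds = infer_transitive({("srcpart.ds", "s.ds")}, {"s.ds": "2008-04-08"})
print(preds)  # srcpart.ds is now also constrained to '2008-04-08'
```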
[jira] [Updated] (HIVE-12709) further improve user level explain
[ https://issues.apache.org/jira/browse/HIVE-12709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12709: Description: Need to address more feedbacks from Hive users for the user level explain: (1) Put stats in the same line as operator; (2) Avoid stats on *Sink; (3) Avoid col types; (4) TS should list pruned col names; (5) TS should list fully qualified table name, along with alias; etc was:Need to address more feedbacks from Hive users for the user level explain: (1) Put stats in the same line as operator; (2) Avoid stats on *Sink; (3) Avoid col types; (4) TS should list pruned col names; (5) TS should list fully qualified table name, along with alias; etc > further improve user level explain > -- > > Key: HIVE-12709 > URL: https://issues.apache.org/jira/browse/HIVE-12709 > Project: Hive > Issue Type: Sub-task > Components: Diagnosability >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 1.2.0 > > Attachments: HIVE-12709.01.patch > > > Need to address more feedbacks from Hive users for the user level explain: > (1) Put stats in the same line as operator; > (2) Avoid stats on *Sink; > (3) Avoid col types; > (4) TS should list pruned col names; > (5) TS should list fully qualified table name, along with alias; etc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12724) ACID: Major compaction fails to include the original bucket files into MR job
[ https://issues.apache.org/jira/browse/HIVE-12724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069368#comment-15069368 ] Damien Carol commented on HIVE-12724: - [~wzheng] I don't see *transactional* TBLPROPERTIES > ACID: Major compaction fails to include the original bucket files into MR job > - > > Key: HIVE-12724 > URL: https://issues.apache.org/jira/browse/HIVE-12724 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.0.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12724.1.patch, HIVE-12724.2.patch > > > How the problem happens: > * Create a non-ACID table > * Before non-ACID to ACID table conversion, we inserted row one > * After non-ACID to ACID table conversion, we inserted row two > * Both rows can be retrieved before MAJOR compaction > * After MAJOR compaction, row one is lost > {code} > hive> USE acidtest; > OK > Time taken: 0.77 seconds > hive> CREATE TABLE t1 (nationkey INT, name STRING, regionkey INT, comment > STRING) > > CLUSTERED BY (regionkey) INTO 2 BUCKETS > > STORED AS ORC; > OK > Time taken: 0.179 seconds > hive> DESC FORMATTED t1; > OK > # col_namedata_type comment > nationkey int > name string > regionkey int > comment string > # Detailed Table Information > Database: acidtest > Owner:wzheng > CreateTime: Mon Dec 14 15:50:40 PST 2015 > LastAccessTime: UNKNOWN > Retention:0 > Location: file:/Users/wzheng/hivetmp/warehouse/acidtest.db/t1 > Table Type: MANAGED_TABLE > Table Parameters: > transient_lastDdlTime 1450137040 > # Storage Information > SerDe Library:org.apache.hadoop.hive.ql.io.orc.OrcSerde > InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat > Compressed: No > Num Buckets: 2 > Bucket Columns: [regionkey] > Sort Columns: [] > Storage Desc Params: > serialization.format1 > Time taken: 0.198 seconds, Fetched: 28 row(s) > hive> dfs -ls /Users/wzheng/hivetmp/warehouse/acidtest.db; > Found 1 
items > drwxr-xr-x - wzheng staff 68 2015-12-14 15:50 > /Users/wzheng/hivetmp/warehouse/acidtest.db/t1 > hive> dfs -ls /Users/wzheng/hivetmp/warehouse/acidtest.db/t1; > hive> INSERT INTO TABLE t1 VALUES (1, 'USA', 1, 'united states'); > WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the > future versions. Consider using a different execution engine (i.e. tez, > spark) or using Hive 1.X releases. > Query ID = wzheng_20151214155028_630098c6-605f-4e7e-a797-6b49fb48360d > Total jobs = 1 > Launching Job 1 out of 1 > Number of reduce tasks determined at compile time: 2 > In order to change the average load for a reducer (in bytes): > set hive.exec.reducers.bytes.per.reducer= > In order to limit the maximum number of reducers: > set hive.exec.reducers.max= > In order to set a constant number of reducers: > set mapreduce.job.reduces= > Job running in-process (local Hadoop) > 2015-12-14 15:51:58,070 Stage-1 map = 100%, reduce = 100% > Ended Job = job_local73977356_0001 > Loading data to table acidtest.t1 > MapReduce Jobs Launched: > Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS > Total MapReduce CPU Time Spent: 0 msec > OK > Time taken: 2.825 seconds > hive> dfs -ls /Users/wzheng/hivetmp/warehouse/acidtest.db/t1; > Found 2 items > -rwxr-xr-x 1 wzheng staff112 2015-12-14 15:51 > /Users/wzheng/hivetmp/warehouse/acidtest.db/t1/00_0 > -rwxr-xr-x 1 wzheng staff472 2015-12-14 15:51 > /Users/wzheng/hivetmp/warehouse/acidtest.db/t1/01_0 > hive> SELECT * FROM t1; > OK > 1 USA 1 united states > Time taken: 0.434 seconds, Fetched: 1 row(s) > hive> ALTER TABLE t1 SET TBLPROPERTIES ('transactional' = 'true'); > OK > Time taken: 0.071 seconds > hive> DESC FORMATTED t1; > OK > # col_namedata_type comment > nationkey int > name string > regionkey int > comment string > # Detailed Table Information > Database: acidtest > Owner:wzheng > CreateTime: Mon Dec 14 15:50:40 PST 2015 > LastAccessTime: UNKNOWN > Retention:0 > Location: 
file:/Users/wzheng/hivetmp/warehouse/acidtest.db/t1 > Table Type: MANAGED_TABLE > Table Parameters: > COLUMN_STATS_ACCURATE false > last_modified_bywzheng > last_modified_time 1450137141 > numFiles2 > numRows -1 > rawDataSize -1 >
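The repro above boils down to which files a major compaction reads. Before the ACID conversion the table holds plain bucket files (e.g. 00000_0); after conversion, inserts land in delta directories. A compactor that only gathers base/delta directories silently drops the pre-conversion rows. A toy sketch of that file selection — not Hive's compactor code, and the file names are illustrative:

```python
# Toy input selection for a major compaction of a converted table.
# "Original" bucket files predate the ACID conversion; post-conversion
# writes live under base_*/delta_* directories.
def compaction_inputs(files, include_originals):
    selected = [f for f in files if f.startswith(("base_", "delta_"))]
    if include_originals:
        selected += [f for f in files
                     if not f.startswith(("base_", "delta_", "."))]
    return sorted(selected)

layout = ["000000_0", "000001_0", "delta_0000001_0000001/bucket_00000"]

# Buggy behaviour: row one's file is never read, so it vanishes
# after MAJOR compaction.
print(compaction_inputs(layout, include_originals=False))
# Expected behaviour: originals are compacted alongside the deltas.
print(compaction_inputs(layout, include_originals=True))
```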
[jira] [Updated] (HIVE-10498) LLAP: Resolve everything in llap-daemon-site.xml
[ https://issues.apache.org/jira/browse/HIVE-10498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10498: Component/s: llap > LLAP: Resolve everything in llap-daemon-site.xml > > > Key: HIVE-10498 > URL: https://issues.apache.org/jira/browse/HIVE-10498 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Gopal V >Assignee: Gopal V > Fix For: llap > > Attachments: HIVE-10498.patch > > > Configuring a sequence of hadoop execution parameters via llap-daemon-site.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10981) LLAP: Accept --hiveconf parameters for the LlapServiceDriver
[ https://issues.apache.org/jira/browse/HIVE-10981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10981: Component/s: llap > LLAP: Accept --hiveconf parameters for the LlapServiceDriver > > > Key: HIVE-10981 > URL: https://issues.apache.org/jira/browse/HIVE-10981 > Project: Hive > Issue Type: Sub-task > Components: CLI, llap >Affects Versions: llap >Reporter: Gopal V >Assignee: Gopal V > Fix For: llap > > Attachments: HIVE-10981.1.patch > > > {code} > Exception in thread "main" org.apache.commons.cli.UnrecognizedOptionException: > Unrecognized option: --hiveconf > at > org.apache.commons.cli.Parser.processOption(Parser.java:363) > at > org.apache.commons.cli.Parser.parse(Parser.java:199) > at > org.apache.commons.cli.Parser.parse(Parser.java:85) > at > > org.apache.hadoop.hive.llap.cli.LlapOptionsProcessor.processOptions(LlapOptionsProcessor.java:137) > at > > org.apache.hadoop.hive.llap.cli.LlapServiceDriver.run(LlapServiceDriver.java:92) > at > > org.apache.hadoop.hive.llap.cli.LlapServiceDriver.main(LlapServiceDriver.java:58) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10218) LLAP: Loglevel for daemons as a startup option
[ https://issues.apache.org/jira/browse/HIVE-10218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10218: Component/s: llap > LLAP: Loglevel for daemons as a startup option > -- > > Key: HIVE-10218 > URL: https://issues.apache.org/jira/browse/HIVE-10218 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Gopal V >Assignee: Gopal V >Priority: Trivial > Fix For: llap > > Attachments: HIVE-10218.patch > > > Accept {{hive --service llap --loglevel WARN}} as a startup option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow
[ https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10474: Component/s: llap > LLAP: investigate why TPCH Q1 1k is slow > > > Key: HIVE-10474 > URL: https://issues.apache.org/jira/browse/HIVE-10474 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Sergey Shelukhin > Attachments: llap-gc-pauses.png > > > While most queries run faster in LLAP than just Tez with container reuse, > TPCH Q1 is much slower. > On my run, on tez with container reuse (current default LLAP configuration > but mode == container and no daemons running) runs 2-6 (out of 6 consecutive > runs in the same session) finished in 25.5sec average; with 16 LLAP daemons > in default config the average was 35.5sec; same w/o IO elevator (to rule out > its impact) it took 59.7sec w/strange distribution (later runs were slower > than earlier runs, still, fastest run was 49.5sec). > So excluding IO elevator it's more than 2x degradation. > We need to figure out why this is happening. Is it just slot discrepancy? > Regardless, this needs to be addressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12623) Add an option to force allocation of fragments on requested nodes
[ https://issues.apache.org/jira/browse/HIVE-12623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12623: Component/s: llap > Add an option to force allocation of fragments on requested nodes > - > > Key: HIVE-12623 > URL: https://issues.apache.org/jira/browse/HIVE-12623 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-12623.1.wip.txt > > > Currently, fragments are sent to random nodes if the requested node does not > have capacity. In certain situations there's more to be gained by sending the > fragments to the requested node only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11524) LLAP: tez.runtime.compress doesn't appear to be honored for LLAP
[ https://issues.apache.org/jira/browse/HIVE-11524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11524: Component/s: llap > LLAP: tez.runtime.compress doesn't appear to be honored for LLAP > > > Key: HIVE-11524 > URL: https://issues.apache.org/jira/browse/HIVE-11524 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Sergey Shelukhin >Assignee: Siddharth Seth > Fix For: llap > > > When running llap on an openstack cluster without snappy installed, with > tez.runtime.compress set to false and codec set to snappy, one still gets the > exceptions due to snappy codec being absent: > {noformat} > 2015-08-10 11:14:30,440 > [TezTaskRunner_attempt_1438943112941_0015_2_00_00_0(attempt_1438943112941_0015_2_00_00_0)] > ERROR org.apache.hadoop.io.compress.snappy.SnappyCompressor: failed to load > SnappyCompressor > java.lang.NoSuchFieldError: clazz > at org.apache.hadoop.io.compress.snappy.SnappyCompressor.initIDs(Native > Method) > at > org.apache.hadoop.io.compress.snappy.SnappyCompressor.(SnappyCompressor.java:57) > at > org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:69) > at > org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:134) > at > org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:150) > at > org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:165) > at > org.apache.tez.runtime.library.common.sort.impl.IFile$Writer.(IFile.java:153) > at > org.apache.tez.runtime.library.common.sort.impl.IFile$Writer.(IFile.java:138) > at > org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter$SpillCallable.callInternal(UnorderedPartitionedKVWriter.java:406) > at > org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter$SpillCallable.callInternal(UnorderedPartitionedKVWriter.java:367) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > 
org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.finalSpill(UnorderedPartitionedKVWriter.java:612) > at > org.apache.tez.runtime.library.common.writers.UnorderedPartitionedKVWriter.close(UnorderedPartitionedKVWriter.java:521) > at > org.apache.tez.runtime.library.output.UnorderedKVOutput.close(UnorderedKVOutput.java:128) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:376) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:79) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1655) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > When it's set to true, the client complains about snappy. When it's set to > fails, the client doesn't complain but it still tries to use it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
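Whatever the root cause inside Tez, the expected contract is simple: when tez.runtime.compress is false, the configured codec should never be instantiated, so a missing native library (here Snappy) cannot break uncompressed runs. A toy guard illustrating that contract — this is not the actual Tez code path:

```python
# Illustrative codec selection that honors the enable flag before ever
# touching the codec class. With compress disabled, the (possibly
# broken) Snappy codec is never loaded.
def pick_codec(conf):
    if conf.get("tez.runtime.compress", "false").lower() != "true":
        return None
    return conf.get("tez.runtime.compress.codec")

print(pick_codec({"tez.runtime.compress": "false",
                  "tez.runtime.compress.codec": "SnappyCodec"}))  # None
print(pick_codec({"tez.runtime.compress": "true",
                  "tez.runtime.compress.codec": "SnappyCodec"}))
```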
[jira] [Updated] (HIVE-12696) LlapServiceDriver can fail if only the packaged logger config is present
[ https://issues.apache.org/jira/browse/HIVE-12696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12696: Component/s: llap > LlapServiceDriver can fail if only the packaged logger config is present > > > Key: HIVE-12696 > URL: https://issues.apache.org/jira/browse/HIVE-12696 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Sergey Shelukhin >Priority: Minor > > I was incrementally updating my setup on some VM and didn't have the logger > config file, so the packaged one was picked up apparently, which caused this: > {noformat} > java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative > path in absolute URI: > jar:file:/home/vagrant/llap/apache-hive-2.0.0-SNAPSHOT-bin/lib/hive-llap-server-2.0.0-SNAPSHOT.jar!/llap-daemon-log4j2.properties > at org.apache.hadoop.fs.Path.initialize(Path.java:205) > at org.apache.hadoop.fs.Path.(Path.java:171) > at > org.apache.hadoop.hive.llap.cli.LlapServiceDriver.run(LlapServiceDriver.java:234) > at > org.apache.hadoop.hive.llap.cli.LlapServiceDriver.main(LlapServiceDriver.java:58) > Caused by: java.net.URISyntaxException: Relative path in absolute URI: > jar:file:/home/vagrant/llap/apache-hive-2.0.0-SNAPSHOT-bin/lib/hive-llap-server-2.0.0-SNAPSHOT.jar!/llap-daemon-log4j2.properties > at java.net.URI.checkPath(URI.java:1823) > at java.net.URI.(URI.java:745) > at org.apache.hadoop.fs.Path.initialize(Path.java:202) > ... 3 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
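The failure mode above is worth spelling out: `jar:file:/...!/llap-daemon-log4j2.properties` is an opaque URI — everything after `jar:` is one scheme-specific blob that does not begin with `/` — while org.apache.hadoop.fs.Path requires a hierarchical path, hence "Relative path in absolute URI". A quick demonstration of the URI's shape, with Python's parser standing in for java.net.URI:

```python
from urllib.parse import urlparse

uri = ("jar:file:/home/vagrant/llap/lib/hive-llap-server.jar"
       "!/llap-daemon-log4j2.properties")
parsed = urlparse(uri)

# The scheme is 'jar', and the remainder is a single opaque part that
# does not start with '/'; a hierarchical-path API sees a "relative
# path" inside an absolute URI and rejects it.
print(parsed.scheme)
print(parsed.path)
print(parsed.path.startswith("/"))
```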
[jira] [Updated] (HIVE-12699) LLAP: hive.llap.daemon.work.dirs setting backward compat name doesn't work
[ https://issues.apache.org/jira/browse/HIVE-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12699: Component/s: llap > LLAP: hive.llap.daemon.work.dirs setting backward compat name doesn't work > --- > > Key: HIVE-12699 > URL: https://issues.apache.org/jira/browse/HIVE-12699 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Trivial > Attachments: HIVE-12699.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12685) Remove invalid property in common/src/test/resources/hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12685: Component/s: Configuration > Remove invalid property in common/src/test/resources/hive-site.xml > -- > > Key: HIVE-12685 > URL: https://issues.apache.org/jira/browse/HIVE-12685 > Project: Hive > Issue Type: Bug > Components: Configuration >Affects Versions: 2.0.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12685.1.patch, HIVE-12685.2.patch, > HIVE-12685.3.patch > > > Currently there's such a property as below, which is obviously wrong > {code} > > javax.jdo.option.ConnectionDriverName > hive-site.xml > Override ConfVar defined in HiveConf > > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
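The {code} block quoted above lost its XML markup in transit; in standard Hadoop configuration syntax the invalid property in common/src/test/resources/hive-site.xml presumably read (reconstructed from the visible name/value/description text, not copied from the patch):

```xml
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>hive-site.xml</value>
  <description>Override ConfVar defined in HiveConf</description>
</property>
```

The value "hive-site.xml" is plainly not a JDBC driver class name, which is why the property is being removed.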
[jira] [Updated] (HIVE-12684) NPE in stats annotation when all values in decimal column are NULLs
[ https://issues.apache.org/jira/browse/HIVE-12684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12684: Component/s: Statistics > NPE in stats annotation when all values in decimal column are NULLs > --- > > Key: HIVE-12684 > URL: https://issues.apache.org/jira/browse/HIVE-12684 > Project: Hive > Issue Type: Bug > Components: Statistics >Affects Versions: 1.3.0, 2.0.0, 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12684.1.patch, HIVE-12684.2.patch, > HIVE-12684.3.patch, HIVE-12684.3.patch > > > When all column values are null for a decimal column and when column stats > exists. AnnotateWithStatistics optimization can throw NPE. Following is the > exception trace > {code} > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:712) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:764) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:750) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:197) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:143) > at > org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:131) > at > org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:114) > at > org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) > at > org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143) > at > org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122) > at > 
org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78) > at > org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:228) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10156) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:225) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
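The NPE arises when column statistics exist but the decimal min/max are absent because every value is NULL. A defensive sketch of that stats-conversion step — toy code, not the actual StatsUtils.getColStatistics logic:

```python
# Toy conversion of raw column stats into a range estimate. When a
# decimal column is entirely NULL, min/max come back absent; code that
# dereferences them unconditionally crashes, so guard and fall back.
def col_range(stats):
    lo, hi = stats.get("min"), stats.get("max")
    if lo is None or hi is None:        # all-NULL column: no range
        return None
    return float(hi) - float(lo)

all_null = {"numNulls": 500, "min": None, "max": None}
normal = {"numNulls": 0, "min": "1.5", "max": "7.5"}
print(col_range(all_null))  # None instead of a NullPointerException
print(col_range(normal))
```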
[jira] [Updated] (HIVE-12701) select on table with boolean as partition column shows wrong result
[ https://issues.apache.org/jira/browse/HIVE-12701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12701: Description: {code:sql} create table hive_aprm02ht7(a int, b int, c int) partitioned by (p boolean) row format delimited fields terminated by ',' stored as textfile; load data local inpath 'hive_data8.txt' into table hive_aprm02ht7 partition (p=true); load data local inpath 'hive_data8.txt' into table hive_aprm02ht7 partition (p=false); describe hive_aprm02ht7; {code} {noformat} col_namedata_type comment a int b int c int p boolean # Partition Information # col_name data_type comment p boolean {noformat} {code:sql} show partitions hive_aprm02ht7; {code} {noformat} OK p=false p=true Time taken: 0.057 seconds, Fetched: 2 row(s) {noformat} -- everything is shown as true. But first three should be true and the last three rows should be false {noformat} hive> select * from hive_aprm02ht7 where p in (true,false); OK 1 2 3 true 4 5 6 true 7 8 9 true 1 2 3 true 4 5 6 true 7 8 9 true Time taken: 0.068 seconds, Fetched: 6 row(s) {noformat} was: create table hive_aprm02ht7(a int, b int, c int) partitioned by (p boolean) row format delimited fields terminated by ',' stored as textfile; load data local inpath 'hive_data8.txt' into table hive_aprm02ht7 partition (p=true); load data local inpath 'hive_data8.txt' into table hive_aprm02ht7 partition (p=false); describe hive_aprm02ht7; col_namedata_type comment a int b int c int p boolean # Partition Information # col_name data_type comment p boolean show partitions hive_aprm02ht7; OK p=false p=true Time taken: 0.057 seconds, Fetched: 2 row(s) -- everything is shown as true. 
But first three should be true and the last three rows should be false hive> select * from hive_aprm02ht7 where p in (true,false); OK 1 2 3 true 4 5 6 true 7 8 9 true 1 2 3 true 4 5 6 true 7 8 9 true Time taken: 0.068 seconds, Fetched: 6 row(s) > select on table with boolean as partition column shows wrong result > --- > > Key: HIVE-12701 > URL: https://issues.apache.org/jira/browse/HIVE-12701 > Project: Hive > Issue Type: Bug > Components: Database/Schema, SQL >Affects Versions: 1.1.0 >Reporter: Sudipto Nandan >Assignee: Chinna Rao Lalam > > {code:sql} > create table hive_aprm02ht7(a int, b int, c int) partitioned by (p boolean) > row format delimited fields terminated by ',' stored as textfile; > load data local inpath 'hive_data8.txt' into table hive_aprm02ht7 partition > (p=true); > load data local inpath 'hive_data8.txt' into table hive_aprm02ht7 partition > (p=false); > describe hive_aprm02ht7; > {code} > {noformat} > col_namedata_type comment > a int > b int > c int > p boolean > # Partition Information > # col_name data_type comment > p boolean > {noformat} > {code:sql} > show partitions hive_aprm02ht7; > {code} > {noformat} > OK > p=false > p=true > Time taken: 0.057 seconds, Fetched: 2 row(s) > {noformat} > -- everything is shown as true. But first three should be true and the last > three rows should be false > {noformat} > hive> select * from hive_aprm02ht7 where p in (true,false); > OK > 1 2 3 true > 4 5 6 true > 7 8 9 true > 1 2 3 true > 4 5 6 true > 7 8 9 true > Time taken: 0.068 seconds, Fetched: 6 row(s) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
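The thread does not pin down the root cause, but the symptom — every row reporting p=true — is consistent with a lax string-to-boolean conversion of the partition directory names. A toy illustration (not Hive's code): in many languages any non-empty string such as "false" is truthy, so a naive cast marks every partition true:

```python
def naive_parse(part_dir):
    # Buggy: bool("false") is True because the string is non-empty,
    # so both p=true and p=false partitions read back as true.
    key, value = part_dir.split("=", 1)
    return key, bool(value)

def correct_parse(part_dir):
    # Compare the text itself, case-insensitively.
    key, value = part_dir.split("=", 1)
    return key, value.lower() == "true"

print(naive_parse("p=false"))    # ('p', True)  -- the wrong result
print(correct_parse("p=false"))  # ('p', False)
print(correct_parse("p=true"))   # ('p', True)
```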
[jira] [Updated] (HIVE-12220) LLAP: Usability issues with hive.llap.io.cache.orc.size
[ https://issues.apache.org/jira/browse/HIVE-12220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12220: Component/s: (was: Hive) llap > LLAP: Usability issues with hive.llap.io.cache.orc.size > --- > > Key: HIVE-12220 > URL: https://issues.apache.org/jira/browse/HIVE-12220 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Carter Shanklin >Assignee: Sergey Shelukhin > Attachments: HIVE-12220.01.patch, HIVE-12220.patch > > > In the llap-daemon site you need to set, among other things, > llap.daemon.memory.per.instance.mb > and > hive.llap.io.cache.orc.size > The use of hive.llap.io.cache.orc.size caused me some unnecessary problems, > initially I entered the value in MB rather than in bytes. Operator error you > could say but I look at this as a fraction of the other value which is in mb. > Second, is this really tied to ORC? E.g. when we have the vectorized text > reader will this data be cached as well? Or might it be in the future? > I would like to propose instead using hive.llap.io.cache.size.mb for this > setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
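The usability problem is a units mismatch: llap.daemon.memory.per.instance.mb is denominated in megabytes while hive.llap.io.cache.orc.size is in bytes, so entering 4096 in the latter yields a ~4 KiB cache instead of 4 GiB. A small sketch of the proposed mb-denominated key falling back to the old one — the lookup helper is illustrative, and hive.llap.io.cache.size.mb is the ticket's suggestion, not a shipped setting:

```python
MB = 1024 * 1024

def cache_bytes(conf):
    # Prefer the proposed MB-denominated key; fall back to the old
    # bytes-denominated one for compatibility.
    if "hive.llap.io.cache.size.mb" in conf:
        return int(conf["hive.llap.io.cache.size.mb"]) * MB
    return int(conf["hive.llap.io.cache.orc.size"])

# The operator error from the ticket: 4096 intended as MB, read as bytes.
print(cache_bytes({"hive.llap.io.cache.orc.size": "4096"}))  # 4096 bytes
print(cache_bytes({"hive.llap.io.cache.size.mb": "4096"}))   # 4 GiB
```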
[jira] [Updated] (HIVE-11245) LLAP: Fix the LLAP to ORC APIs
[ https://issues.apache.org/jira/browse/HIVE-11245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11245: Component/s: llap > LLAP: Fix the LLAP to ORC APIs > -- > > Key: HIVE-11245 > URL: https://issues.apache.org/jira/browse/HIVE-11245 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Owen O'Malley >Assignee: Sergey Shelukhin >Priority: Blocker > Fix For: llap > > > Currently the LLAP branch has refactored the ORC code to have different code > paths depending on whether the data is coming from the cache or a FileSystem. > We need to introduce a concept of a DataSource that is responsible for > getting the necessary bytes regardless of whether they are coming from a > FileSystem, in memory cache, or both. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9986) LLAP: EOFException in reader
[ https://issues.apache.org/jira/browse/HIVE-9986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9986: --- Component/s: llap > LLAP: EOFException in reader > > > Key: HIVE-9986 > URL: https://issues.apache.org/jira/browse/HIVE-9986 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Gopal V >Assignee: Sergey Shelukhin > Fix For: llap > > > From HIVE-9979 > {noformat} > 2015-03-16 10:20:51,439 > [pool-2-thread-3(container_1_1141_01_000192_gopal_20150316102020_c8c92488-6a61-401e-8298-401dace286dc:1_Map > 1_191_0)] INFO org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl: Getting > data for column 9 RG 112 stream DATA at 62278935, 1057137 index position 0: > compressed [62614934, 63139228) > 2015-03-16 10:20:51,439 > [pool-2-thread-6(container_1_1141_01_000211_gopal_20150316102020_c8c92488-6a61-401e-8298-401dace286dc:1_Map > 1_210_0)] INFO org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl: Getting > stripe-level stream [LENGTH, kind: DICTIONARY_V2 > dictionarySize: 3 > ] for column 9 RG 91 at 64139927, 5 > ... > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderUtils.readDirect(RecordReaderUtils.java:286) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderUtils.readDiskRanges(RecordReaderUtils.java:266) > at > org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:234) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:280) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:44) > at > org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37) > ... 4 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10113) LLAP: reducers running in LLAP starve out map retries
[ https://issues.apache.org/jira/browse/HIVE-10113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10113: Component/s: llap > LLAP: reducers running in LLAP starve out map retries > - > > Key: HIVE-10113 > URL: https://issues.apache.org/jira/browse/HIVE-10113 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Sergey Shelukhin >Assignee: Siddharth Seth > Fix For: llap > > > When query 17 is run, some mappers from Map 1 currently fail (due to unwrap > issue, and also due to HIVE-10112). > This query has 1000+ reducers; if they are ran in llap, they all queue up, > and the query locks up. > If only mappers run in LLAP, query completes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12695) LLAP: use somebody else's cluster
[ https://issues.apache.org/jira/browse/HIVE-12695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12695: Component/s: llap > LLAP: use somebody else's cluster > - > > Key: HIVE-12695 > URL: https://issues.apache.org/jira/browse/HIVE-12695 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12695.patch > > > For non-HS2 case cluster sharing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12625) Backport to branch-1 HIVE-11981 ORC Schema Evolution Issues (Vectorized, ACID, and Non-Vectorized)
[ https://issues.apache.org/jira/browse/HIVE-12625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12625: Component/s: (was: Hive) ORC > Backport to branch-1 HIVE-11981 ORC Schema Evolution Issues (Vectorized, > ACID, and Non-Vectorized) > -- > > Key: HIVE-12625 > URL: https://issues.apache.org/jira/browse/HIVE-12625 > Project: Hive > Issue Type: Bug > Components: ORC >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-12625.1-branch1.patch, HIVE-12625.2-branch1.patch, > HIVE-12625.3-branch1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12632) LLAP: don't use IO elevator for ACID tables
[ https://issues.apache.org/jira/browse/HIVE-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12632: Component/s: llap > LLAP: don't use IO elevator for ACID tables > > > Key: HIVE-12632 > URL: https://issues.apache.org/jira/browse/HIVE-12632 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Takahiko Saito >Assignee: Sergey Shelukhin > Attachments: HIVE-12632.patch > > > Until HIVE-12631 is fixed, we need to avoid ACID tables in IO elevator. Right > now, a FileNotFound error is thrown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12598) LLAP: disable fileId when not supported
[ https://issues.apache.org/jira/browse/HIVE-12598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12598: Component/s: llap > LLAP: disable fileId when not supported > --- > > Key: HIVE-12598 > URL: https://issues.apache.org/jira/browse/HIVE-12598 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.0.0, 2.1.0 > > Attachments: HIVE-12598.01.patch, HIVE-12598.02.patch, > HIVE-12598.patch > > > There is a TODO somewhere in code. We might get a synthetic fileId in absence > of the real one in some cases when another FS masquerades as HDFS, we should > be able to turn off fileID support explicitly for such cases as they are not > bulletproof. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12422) LLAP: add security to Web UI endpoint
[ https://issues.apache.org/jira/browse/HIVE-12422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12422: Component/s: llap > LLAP: add security to Web UI endpoint > - > > Key: HIVE-12422 > URL: https://issues.apache.org/jira/browse/HIVE-12422 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12422.01.patch, HIVE-12422.02.patch, > HIVE-12422.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12596) Delete timestamp row throws java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
[ https://issues.apache.org/jira/browse/HIVE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15044870#comment-15044870 ] Damien Carol commented on HIVE-12596: - There is NO PROBLEM with version 1.2.1. Also your date is wrong => '2014-*{color:red}15{color}*-16 17:18:19.20' > Delete timestamp row throws java.lang.IllegalArgumentException: Timestamp > format must be yyyy-mm-dd hh:mm:ss[.fffffffff] > > > Key: HIVE-12596 > URL: https://issues.apache.org/jira/browse/HIVE-12596 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Takahiko Saito > > Run the below: > {noformat} > create table test_acid( i int, ts timestamp) > clustered by (i) into 2 buckets > stored as orc > tblproperties ('transactional'='true'); > insert into table test_acid values (1, '2014-09-14 12:34:30'); > delete from test_acid where ts = '2014-15-16 17:18:19.20'; > {noformat} > The below error is thrown: > {noformat} > 15/12/04 19:55:49 INFO SessionState: Map 1: -/- Reducer 2: 0/2 > Status: Failed > 15/12/04 19:55:49 ERROR SessionState: Status: Failed > Vertex failed, vertexName=Map 1, vertexId=vertex_1447960616881_0022_2_00, > diagnostics=[Vertex vertex_1447960616881_0022_2_00 [Map 1] killed/failed due > to:ROOT_INPUT_INIT_FAILURE, Vertex Input: test_acid initializer failed, > vertex=vertex_1447960616881_0022_2_00 [Map 1], > java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd > hh:mm:ss[.fffffffff] > at java.sql.Timestamp.valueOf(Timestamp.java:237) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.boxLiteral(ConvertAstToSearchArg.java:160) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.findLiteral(ConvertAstToSearchArg.java:191) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.createLeaf(ConvertAstToSearchArg.java:268) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.createLeaf(ConvertAstToSearchArg.java:326) > at > 
org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.parse(ConvertAstToSearchArg.java:377) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.(ConvertAstToSearchArg.java:68) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.create(ConvertAstToSearchArg.java:417) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.createFromConf(ConvertAstToSearchArg.java:436) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$Context.(OrcInputFormat.java:484) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1121) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1207) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:369) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:481) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:160) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > Not 
sure if this change is intended as the issue is not seen with ver. 1.2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
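The stack trace above bottoms out in `java.sql.Timestamp.valueOf`, which `ConvertAstToSearchArg.boxLiteral` uses to box the string literal from the predicate. `valueOf` only accepts the `yyyy-mm-dd hh:mm:ss[.fffffffff]` shape, and a literal that does not fit it is rejected with exactly the message seen in the trace. A minimal sketch reproducing that behavior outside Hive (this is plain JDK code, not Hive's; how Hive forms the literal is not shown here):

```java
import java.sql.Timestamp;

public class TimestampValueOfDemo {
    public static void main(String[] args) {
        // A well-formed literal parses fine.
        Timestamp ok = Timestamp.valueOf("2014-09-14 12:34:30");
        System.out.println("parsed: " + ok);

        // A literal missing the time part does not match
        // yyyy-mm-dd hh:mm:ss[.fffffffff] and is rejected.
        try {
            Timestamp.valueOf("2014-09-14");
            System.out.println("unexpectedly parsed");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the conversion happens during split generation (`OrcInputFormat.generateSplitsInfo`), the exception surfaces as a vertex initializer failure rather than a parse-time error.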
[jira] [Assigned] (HIVE-12596) Delete timestamp row throws java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
[ https://issues.apache.org/jira/browse/HIVE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol reassigned HIVE-12596: --- Assignee: Damien Carol > Delete timestamp row throws java.lang.IllegalArgumentException: Timestamp > format must be yyyy-mm-dd hh:mm:ss[.fffffffff] > > > Key: HIVE-12596 > URL: https://issues.apache.org/jira/browse/HIVE-12596 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Takahiko Saito >Assignee: Damien Carol > > Run the below: > {noformat} > create table test_acid( i int, ts timestamp) > clustered by (i) into 2 buckets > stored as orc > tblproperties ('transactional'='true'); > insert into table test_acid values (1, '2014-09-14 12:34:30'); > delete from test_acid where ts = '2014-15-16 17:18:19.20'; > {noformat} > The below error is thrown: > {noformat} > 15/12/04 19:55:49 INFO SessionState: Map 1: -/- Reducer 2: 0/2 > Status: Failed > 15/12/04 19:55:49 ERROR SessionState: Status: Failed > Vertex failed, vertexName=Map 1, vertexId=vertex_1447960616881_0022_2_00, > diagnostics=[Vertex vertex_1447960616881_0022_2_00 [Map 1] killed/failed due > to:ROOT_INPUT_INIT_FAILURE, Vertex Input: test_acid initializer failed, > vertex=vertex_1447960616881_0022_2_00 [Map 1], > java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd > hh:mm:ss[.fffffffff] > at java.sql.Timestamp.valueOf(Timestamp.java:237) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.boxLiteral(ConvertAstToSearchArg.java:160) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.findLiteral(ConvertAstToSearchArg.java:191) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.createLeaf(ConvertAstToSearchArg.java:268) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.createLeaf(ConvertAstToSearchArg.java:326) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.parse(ConvertAstToSearchArg.java:377) > at > 
org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.(ConvertAstToSearchArg.java:68) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.create(ConvertAstToSearchArg.java:417) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.createFromConf(ConvertAstToSearchArg.java:436) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$Context.(OrcInputFormat.java:484) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1121) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1207) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:369) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:481) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:160) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > Not sure if this change is intended as the issue is not seen with ver. 
1.2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12596) Delete timestamp row throws java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff]
[ https://issues.apache.org/jira/browse/HIVE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15044873#comment-15044873 ] Damien Carol commented on HIVE-12596: - I don't see any problem here. Version 1.2.1 doesn't throw an error BUT the date is wrong. Master throws an error saying that the date is not OK, and the date is indeed not OK :D > Delete timestamp row throws java.lang.IllegalArgumentException: Timestamp > format must be yyyy-mm-dd hh:mm:ss[.fffffffff] > > > Key: HIVE-12596 > URL: https://issues.apache.org/jira/browse/HIVE-12596 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Takahiko Saito > > Run the below: > {noformat} > create table test_acid( i int, ts timestamp) > clustered by (i) into 2 buckets > stored as orc > tblproperties ('transactional'='true'); > insert into table test_acid values (1, '2014-09-14 12:34:30'); > delete from test_acid where ts = '2014-15-16 17:18:19.20'; > {noformat} > The below error is thrown: > {noformat} > 15/12/04 19:55:49 INFO SessionState: Map 1: -/- Reducer 2: 0/2 > Status: Failed > 15/12/04 19:55:49 ERROR SessionState: Status: Failed > Vertex failed, vertexName=Map 1, vertexId=vertex_1447960616881_0022_2_00, > diagnostics=[Vertex vertex_1447960616881_0022_2_00 [Map 1] killed/failed due > to:ROOT_INPUT_INIT_FAILURE, Vertex Input: test_acid initializer failed, > vertex=vertex_1447960616881_0022_2_00 [Map 1], > java.lang.IllegalArgumentException: Timestamp format must be yyyy-mm-dd > hh:mm:ss[.fffffffff] > at java.sql.Timestamp.valueOf(Timestamp.java:237) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.boxLiteral(ConvertAstToSearchArg.java:160) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.findLiteral(ConvertAstToSearchArg.java:191) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.createLeaf(ConvertAstToSearchArg.java:268) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.createLeaf(ConvertAstToSearchArg.java:326) > at > 
org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.parse(ConvertAstToSearchArg.java:377) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.(ConvertAstToSearchArg.java:68) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.create(ConvertAstToSearchArg.java:417) > at > org.apache.hadoop.hive.ql.io.sarg.ConvertAstToSearchArg.createFromConf(ConvertAstToSearchArg.java:436) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$Context.(OrcInputFormat.java:484) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1121) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1207) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:369) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:481) > at > org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:160) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > Not 
sure if this change is intended as the issue is not seen with ver. 1.2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0
[ https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12429: Description: Hive's default authorization is not real security, as it does not secure a number of features and anyone can grant access to any object to any user. We should switch the default to SQLStandardAuth, which provides real authorization. As this is a backwards incompatible change this was hard to do previously, but 2.0 gives us a place to do this type of change. By default authorization will still be off, as there are a few other things to set when turning on authorization (such as the list of admin users). was: Hive's default authorization is not real security, as it does not secure a number of features and anyone can grant access to any object to any user. We should switch the default o SQLStandardAuth, which provides real authorization. As this is a backwards incompatible change this was hard to do previously, but 2.0 gives us a place to do this type of change. By default authorization will still be off, as there are a few other things to set when turning on authorization (such as the list of admin users). > Switch default Hive authorization to SQLStandardAuth in 2.0 > --- > > Key: HIVE-12429 > URL: https://issues.apache.org/jira/browse/HIVE-12429 > Project: Hive > Issue Type: Task > Components: Authorization, Security >Affects Versions: 2.0.0 >Reporter: Alan Gates >Assignee: Daniel Dai > Attachments: HIVE-12429.1.patch, HIVE-12429.2.patch > > > Hive's default authorization is not real security, as it does not secure a > number of features and anyone can grant access to any object to any user. We > should switch the default to SQLStandardAuth, which provides real > authorization. > As this is a backwards incompatible change this was hard to do previously, > but 2.0 gives us a place to do this type of change. 
> By default authorization will still be off, as there are a few other things > to set when turning on authorization (such as the list of admin users). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
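For readers wondering what "turning on authorization" involves: enabling SQL standard authorization is a hive-site.xml change along these lines. The property names follow Hive's documented SQL Standard Based Authorization setup; the admin user name below is a placeholder.

```xml
<!-- Hypothetical hive-site.xml fragment: enable SQL standard authorization.
     The "hive" admin user is a placeholder for your actual admin account. -->
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>
<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>
<property>
  <name>hive.users.in.admin.role</name>
  <value>hive</value>
</property>
```

This is exactly why the default stays off even after the switch: without a sensible `hive.users.in.admin.role`, enabling authorization would lock administrators out of grant management.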
[jira] [Updated] (HIVE-12567) Enhance TxnHandler retry logic to handle ORA-08176
[ https://issues.apache.org/jira/browse/HIVE-12567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12567: Component/s: Metastore > Enhance TxnHandler retry logic to handle ORA-08176 > -- > > Key: HIVE-12567 > URL: https://issues.apache.org/jira/browse/HIVE-12567 > Project: Hive > Issue Type: Bug > Components: Metastore, Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-12567.1.patch, HIVE-12567.patch > > > {noformat} > FAILED: Error in acquiring locks: Error communicating with the metastore > 2015-12-01 09:19:32,459 ERROR [HiveServer2-Background-Pool: Thread-55]: > ql.Driver (SessionState.java:printError(932)) - FAILED: Error in acquiring > locks: Error communicating with the metastore > org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the > metastore > at > org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:132) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:227) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:92) > at > org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:1029) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1226) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1100) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) > at > org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71) > at > org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > 
org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: MetaException(message:Unable to update transaction database > java.sql.SQLException: ORA-08176: consistent read failure; rollback data not > available > at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:450) > at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:399) > at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:1059) > at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:522) > at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:257) > at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:587) > at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:210) > at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:30) > at > oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:762) > at > oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:925) > at > oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:) > at > oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1309) > at > oracle.jdbc.driver.OracleStatementWrapper.executeQuery(OracleStatementWrapper.java:422) > at > com.jolbox.bonecp.StatementHandle.executeQuery(StatementHandle.java:464) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getLockInfoFromLockId(TxnHandler.java:1951) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:1600) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1576) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:480) > at > 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5586) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) >
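To illustrate what "enhance retry logic to handle ORA-08176" means in practice: the metastore can treat certain Oracle vendor error codes as transient and re-run the statement. The sketch below is hypothetical (it is not Hive's actual `TxnHandler` code); `RETRYABLE_ORACLE_CODES`, `isRetryable`, and `withRetries` are illustrative names, and the error-code set is an assumption.

```java
import java.sql.SQLException;
import java.util.Set;
import java.util.concurrent.Callable;

public class RetryDemo {
    // ORA-08176 ("consistent read failure") surfaces through JDBC as vendor
    // error code 8176; ORA-00060 (deadlock) as 60. Illustrative set only.
    private static final Set<Integer> RETRYABLE_ORACLE_CODES = Set.of(8176, 60);

    static boolean isRetryable(SQLException e) {
        return RETRYABLE_ORACLE_CODES.contains(e.getErrorCode());
    }

    // Re-run the operation on retryable failures, with a short linear backoff.
    static <T> T withRetries(Callable<T> op, int maxAttempts) throws Exception {
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (SQLException e) {
                if (!isRetryable(e)) throw e;  // permanent failure: give up now
                last = e;
                Thread.sleep(100L * attempt);
            }
        }
        throw last;  // all attempts failed with retryable errors
    }

    public static void main(String[] args) throws Exception {
        // Simulate an operation that fails once with ORA-08176, then succeeds.
        int[] calls = {0};
        String result = withRetries(() -> {
            if (calls[0]++ == 0) {
                throw new SQLException("ORA-08176: consistent read failure", "72000", 8176);
            }
            return "ok";
        }, 3);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

Classifying by vendor error code (rather than string-matching the message) is the usual design choice here, since `SQLException.getErrorCode()` carries the ORA number directly.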
[jira] [Updated] (HIVE-7214) Support predicate pushdown for complex data types in ORCFile
[ https://issues.apache.org/jira/browse/HIVE-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-7214: --- Component/s: ORC > Support predicate pushdown for complex data types in ORCFile > > > Key: HIVE-7214 > URL: https://issues.apache.org/jira/browse/HIVE-7214 > Project: Hive > Issue Type: Improvement > Components: File Formats, ORC >Reporter: Rohini Palaniswamy > Labels: ORC > > Currently ORCFile does not support predicate pushdown for complex datatypes > like map, array and struct while Parquet does. Came across this during > discussion of PIG-3760. Our users have a lot of map and struct (tuple in pig) > columns and most of the filter conditions are on them. Would be great to have > support added for them in ORC -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12533) Unexpected NULL in map join small table
[ https://issues.apache.org/jira/browse/HIVE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030507#comment-15030507 ] Damien Carol commented on HIVE-12533: - What version ? [~rajesh.balamohan] > Unexpected NULL in map join small table > --- > > Key: HIVE-12533 > URL: https://issues.apache.org/jira/browse/HIVE-12533 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Rajesh Balamohan > > {noformat} > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected NULL in map join > small table > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:110) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:293) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:174) > at > org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:170) > at > org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104) > ... 5 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected NULL > in map join small table > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:88) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:182) > at > org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:97) > ... 9 more > {noformat} > \cc [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030449#comment-15030449 ] Damien Carol commented on HIVE-11107: - Profile _hadoop-2_ will be removed from master branch. I think you should change description on this JIRA. > Support for Performance regression test suite with TPCDS > > > Key: HIVE-11107 > URL: https://issues.apache.org/jira/browse/HIVE-11107 > Project: Hive > Issue Type: Task >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch, > HIVE-11107.3.patch, HIVE-11107.4.patch > > > Support to add TPCDS queries to the performance regression test suite with > Hive CBO turned on. > This benchmark is intended to make sure that subsequent changes to the > optimizer or any hive code do not yield any unexpected plan changes. i.e. > the intention is to not run the entire TPCDS query set, but just "explain > plan" for the TPCDS queries. > As part of this jira, we will manually verify that expected hive > optimizations kick in for the queries (for given stats/dataset). If there is > a difference in plan within this test suite due to a future commit, it needs > to be analyzed and we need to make sure that it is not a regression. > The test suite can be run in master branch from itests by > {code} > mvn test -Dtest=TestPerfCliDriver -Phadoop-2 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12537) RLEv2 doesn't seem to work
[ https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12537: Description: Perhaps I'm doing something wrong or is actually working as expected. Putting 1 million constant int32 values produces an ORC file of 1MB. Surprisingly, 1 million consecutive ints produces a much smaller file. Code and FileDump attached. {code} ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector( Integer.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA); Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"), OrcFile.writerOptions(new Configuration()) .compress(CompressionKind.NONE) .inspector(inspector) .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION) .version(OrcFile.Version.V_0_12) ); for (int i = 0; i < 100; ++i) { w.addRow(123); } w.close(); {code} was: Perhaps I'm doing something wrong or is actually working as expected. Putting 1 million constant int32 values produces an ORC file of 1MB. Surprisingly, 1 million consecutive ints produces a much smaller file. Code and FileDump attached. > RLEv2 doesn't seem to work > -- > > Key: HIVE-12537 > URL: https://issues.apache.org/jira/browse/HIVE-12537 > Project: Hive > Issue Type: Bug > Components: File Formats, ORC >Affects Versions: 1.2.1 >Reporter: Bogdan Raducanu > Labels: orc, orcfile > Attachments: Main.java, orcdump.txt > > > Perhaps I'm doing something wrong or is actually working as expected. > Putting 1 million constant int32 values produces an ORC file of 1MB. > Surprisingly, 1 million consecutive ints produces a much smaller file. > Code and FileDump attached. 
> {code} > ObjectInspector inspector = > ObjectInspectorFactory.getReflectionObjectInspector( > Integer.class, > ObjectInspectorFactory.ObjectInspectorOptions.JAVA); > Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"), > OrcFile.writerOptions(new Configuration()) > .compress(CompressionKind.NONE) > .inspector(inspector) > > .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION) > .version(OrcFile.Version.V_0_12) > ); > > for (int i = 0; i < 100; ++i) { > w.addRow(123); > } > w.close(); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12537) RLEv2 doesn't seem to work
[ https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12537: Description: Perhaps I'm doing something wrong or is actually working as expected. Putting 1 million constant int32 values produces an ORC file of 1MB. Surprisingly, 1 million consecutive ints produces a much smaller file. Code and FileDump attached. {code} ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector( Integer.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA); Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"), OrcFile.writerOptions(new Configuration()) .compress(CompressionKind.NONE) .inspector(inspector) .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION) .version(OrcFile.Version.V_0_12) ); for (int i = 0; i < 100; ++i) { w.addRow(123); } w.close(); {code} was: Perhaps I'm doing something wrong or is actually working as expected. Putting 1 million constant int32 values produces an ORC file of 1MB. Surprisingly, 1 million consecutive ints produces a much smaller file. Code and FileDump attached. {code} ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector( Integer.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA); Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"), OrcFile.writerOptions(new Configuration()) .compress(CompressionKind.NONE) .inspector(inspector) .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION) .version(OrcFile.Version.V_0_12) ); for (int i = 0; i < 100; ++i) { w.addRow(123); } w.close(); {code} > RLEv2 doesn't seem to work > -- > > Key: HIVE-12537 > URL: https://issues.apache.org/jira/browse/HIVE-12537 > Project: Hive > Issue Type: Bug > Components: File Formats, ORC >Affects Versions: 1.2.1 >Reporter: Bogdan Raducanu > Labels: orc, orcfile > Attachments: Main.java, orcdump.txt > > > Perhaps I'm doing something wrong or is actually working as expected. 
> Putting 1 million constant int32 values produces an ORC file of 1MB. > Surprisingly, 1 million consecutive ints produces a much smaller file. > Code and FileDump attached. > {code} > ObjectInspector inspector = > ObjectInspectorFactory.getReflectionObjectInspector( > Integer.class, > ObjectInspectorFactory.ObjectInspectorOptions.JAVA); > Writer w = OrcFile.createWriter(new Path("/tmp/my.orc"), > OrcFile.writerOptions(new Configuration()) > .compress(CompressionKind.NONE) > .inspector(inspector) > > .encodingStrategy(OrcFile.EncodingStrategy.COMPRESSION) > .version(OrcFile.Version.V_0_12) > ); > for (int i = 0; i < 100; ++i) { > w.addRow(123); > } > w.close(); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
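The reporter's expectation comes from how run-length encoding works: a long run of one repeated value should collapse to almost nothing. A toy sketch of plain RLE makes the point (ORC's RLEv2 is far richer, layering short-repeat, direct, patched-base, and delta sub-encodings on this idea; `Run`, `encode`, and `RleSketch` are illustrative names, not ORC APIs):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RleSketch {
    // A (run length, value) pair: the core unit of run-length encoding.
    record Run(int length, int value) {}

    // Collapse consecutive equal values into runs.
    static List<Run> encode(int[] values) {
        List<Run> runs = new ArrayList<>();
        int i = 0;
        while (i < values.length) {
            int j = i;
            while (j < values.length && values[j] == values[i]) j++;
            runs.add(new Run(j - i, values[i]));
            i = j;
        }
        return runs;
    }

    public static void main(String[] args) {
        // One million identical values collapse to a single run...
        int[] constant = new int[1_000_000];
        Arrays.fill(constant, 123);
        System.out.println("constant runs: " + encode(constant).size());

        // ...while consecutive values defeat plain RLE entirely; RLEv2's
        // delta sub-encoding is what makes those small in a real ORC file.
        int[] consecutive = new int[1_000_000];
        for (int k = 0; k < consecutive.length; k++) consecutive[k] = k;
        System.out.println("consecutive runs: " + encode(consecutive).size());
    }
}
```

So a 1 MB file for a constant column, while a monotonically increasing column compresses well, is the opposite of what either encoding predicts, which is what makes the report look like RLEv2 not kicking in.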
[jira] [Updated] (HIVE-12536) Cannot handle dash (-) on the metastore database name
[ https://issues.apache.org/jira/browse/HIVE-12536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12536: Description: If you setup a database for metastore with a dash in its name (eg, apache-hive) hive client fails. Here is the db connection string. The database is apache-hive {code:xml} javax.jdo.option.ConnectionURL jdbc:mysql://10.0.3.166/apache-hive JDBC connect string for a JDBC metastore {code} Here is the exception you get when staring hive: {noformat} root@jackal-local-machine-4:/home/ubuntu/resources/hive-x86_64# su hive -c hive {noformat} {noformat} Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-1.2.1.jar!/hive-log4j.properties Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:86) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104) at 
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503) ... 7 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521) ... 13 more Caused by: javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-hive.`SEQUENCE_TABLE` WHERE `SEQUENCE_NAME`='org.apache.hadoop.hive.metastore.m' at line 1 NestedThrowables: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-hive.`SEQUENCE_TABLE` WHERE `SEQUENCE_NAME`='org.apache.hadoop.hive.metastore.m' at line 1 at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596) at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732) at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752) at org.apache.hadoop.hive.metastore.ObjectStore.createDatabase(ObjectStore.java:521) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) at com.sun.proxy.$Proxy5.createDatabase(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:604) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:624) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
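The root cause visible in the trace above is that the metastore schema name is interpolated into the DataNucleus-generated SQL unquoted, so MySQL parses the dash as a minus operator. A minimal illustration (the generated statement is abbreviated; the '...' placeholders stand for the truncated sequence name in the trace):

{code:sql}
-- Roughly what gets generated today (illustrative, not the exact statement):
SELECT NEXT_VAL FROM apache-hive.SEQUENCE_TABLE WHERE SEQUENCE_NAME = '...';
-- MySQL reads "apache-hive" as the expression "apache minus hive" and fails.

-- Backtick-quoting the schema identifier makes the same statement valid:
SELECT NEXT_VAL FROM `apache-hive`.`SEQUENCE_TABLE` WHERE SEQUENCE_NAME = '...';
{code}

Until the identifier is quoted everywhere it is emitted, the practical workaround is a metastore database name without dashes (e.g. apache_hive).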
[jira] [Updated] (HIVE-12537) RLEv2 doesn't seem to work
[ https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12537: Component/s: ORC > RLEv2 doesn't seem to work > -- > > Key: HIVE-12537 > URL: https://issues.apache.org/jira/browse/HIVE-12537 > Project: Hive > Issue Type: Bug > Components: File Formats, ORC >Affects Versions: 1.2.1 >Reporter: Bogdan Raducanu > Labels: orc, orcfile > Attachments: Main.java, orcdump.txt > > > Perhaps I'm doing something wrong, or it is actually working as expected. > Putting 1 million constant int32 values produces an ORC file of 1MB. > Surprisingly, 1 million consecutive ints produces a much smaller file. > Code and FileDump attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12534) Date functions with vectorization is returning wrong results
[ https://issues.apache.org/jira/browse/HIVE-12534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12534: Component/s: Vectorization > Date functions with vectorization is returning wrong results > > > Key: HIVE-12534 > URL: https://issues.apache.org/jira/browse/HIVE-12534 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan > > {noformat} > select c.effective_date, year(c.effective_date), month(c.effective_date) from > customers c where c.customer_id = 146028; > hive> set hive.vectorized.execution.enabled=true; > hive> select c.effective_date, year(c.effective_date), > month(c.effective_date) from customers c where c.customer_id = 146028; > 2015-11-19 0 0 > hive> set hive.vectorized.execution.enabled=false; > hive> select c.effective_date, year(c.effective_date), > month(c.effective_date) from customers c where c.customer_id = 146028; > 2015-11-19 2015 11 > {noformat} > \cc [~gopalv], [~sseth], [~sershe] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12534) Date functions with vectorization is returning wrong results
[ https://issues.apache.org/jira/browse/HIVE-12534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12534: Component/s: (was: Hive) > Date functions with vectorization is returning wrong results > > > Key: HIVE-12534 > URL: https://issues.apache.org/jira/browse/HIVE-12534 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan > > {noformat} > select c.effective_date, year(c.effective_date), month(c.effective_date) from > customers c where c.customer_id = 146028; > hive> set hive.vectorized.execution.enabled=true; > hive> select c.effective_date, year(c.effective_date), > month(c.effective_date) from customers c where c.customer_id = 146028; > 2015-11-19 0 0 > hive> set hive.vectorized.execution.enabled=false; > hive> select c.effective_date, year(c.effective_date), > month(c.effective_date) from customers c where c.customer_id = 146028; > 2015-11-19 2015 11 > {noformat} > \cc [~gopalv], [~sseth], [~sershe] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11890) Create ORC module
[ https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11890: Component/s: ORC > Create ORC module > - > > Key: HIVE-11890 > URL: https://issues.apache.org/jira/browse/HIVE-11890 > Project: Hive > Issue Type: Sub-task > Components: ORC >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, > HIVE-11890.patch > > > Start moving classes over to the ORC module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
[ https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11531: Component/s: CBO > Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise > - > > Key: HIVE-11531 > URL: https://issues.apache.org/jira/browse/HIVE-11531 > Project: Hive > Issue Type: Improvement > Components: CBO >Reporter: Sergey Shelukhin >Assignee: Hui Zheng > Attachments: HIVE-11531.02.patch, HIVE-11531.WIP.1.patch, > HIVE-11531.WIP.2.patch, HIVE-11531.patch > > > For any UIs that involve pagination, it is useful to issue queries in the > form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be > paginated (which can be extremely large by itself). At present, ROW_NUMBER > can be used to achieve this effect, but optimizations for LIMIT such as TopN > in ReduceSink do not apply to ROW_NUMBER. We can add first class support for > "skip" to existing limit, or improve ROW_NUMBER for better performance -- This message was sent by Atlassian JIRA (v6.3.4#6332)
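For concreteness, the two pagination styles discussed above can be sketched as follows (table `t` and column `id` are hypothetical):

{code:sql}
-- MySQL-style offset LIMIT, the proposed first-class syntax
-- (skip 100 rows, return the next 10):
SELECT * FROM t ORDER BY id LIMIT 100, 10;

-- The ROW_NUMBER workaround available today; the TopN optimization in
-- ReduceSink does not apply, so the whole window is numbered before filtering:
SELECT * FROM (
  SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn FROM t
) ranked
WHERE rn > 100 AND rn <= 110;
{code}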
[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path
[ https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12055: Component/s: Shims ORC > Create row-by-row shims for the write path > --- > > Key: HIVE-12055 > URL: https://issues.apache.org/jira/browse/HIVE-12055 > Project: Hive > Issue Type: Sub-task > Components: ORC, Shims >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch > > > As part of removing the row-by-row writer, we'll need to shim out the higher > level API (OrcSerde and OrcOutputFormat) so that we maintain backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12498) ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect
[ https://issues.apache.org/jira/browse/HIVE-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-12498: Description: OrcRecordUpdater does not honor the OrcRecordUpdater.OrcOptions.tableProperties() setting. It would need to translate the specified tableProperties (as listed in OrcTableProperties enum) to the properties that OrcWriter internally understands (listed in HiveConf.ConfVars). This is needed for multiple clients.. like Streaming API and Compactor. {code:java} Properties orcTblProps = .. // get Orc Table Properties from MetaStore; AcidOutputFormat.Options updaterOptions = new OrcRecordUpdater.OrcOptions(conf) .inspector(..) .bucket(..) .minimumTransactionId(..) .maximumTransactionId(..) .tableProperties(orcTblProps); // <<== OrcOutputFormat orcOutput = new ... orcOutput.getRecordUpdater(partitionPath, updaterOptions ); {code} was: OrcRecordUpdater does not honor the OrcRecordUpdater.OrcOptions.tableProperties() setting. It would need to translate the specified tableProperties (as listed in OrcTableProperties enum) to the properties that OrcWriter internally understands (listed in HiveConf.ConfVars). This is needed for multiple clients.. like Streaming API and Compactor. {noformat} Properties orcTblProps = .. // get Orc Table Properties from MetaStore; AcidOutputFormat.Options updaterOptions = new OrcRecordUpdater.OrcOptions(conf) .inspector(..) .bucket(..) .minimumTransactionId(..) .maximumTransactionId(..) .tableProperties(orcTblProps); // <<== OrcOutputFormat orcOutput = new ... 
orcOutput.getRecordUpdater(partitionPath, updaterOptions ); {noformat} > ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect > - > > Key: HIVE-12498 > URL: https://issues.apache.org/jira/browse/HIVE-12498 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: ACID, ORC > Attachments: HIVE-12498.1.patch > > > OrcRecordUpdater does not honor the > OrcRecordUpdater.OrcOptions.tableProperties() setting. > It would need to translate the specified tableProperties (as listed in > OrcTableProperties enum) to the properties that OrcWriter internally > understands (listed in HiveConf.ConfVars). > This is needed for multiple clients.. like Streaming API and Compactor. > {code:java} > Properties orcTblProps = .. // get Orc Table Properties from MetaStore; > AcidOutputFormat.Options updaterOptions = new > OrcRecordUpdater.OrcOptions(conf) > .inspector(..) > .bucket(..) > .minimumTransactionId(..) > .maximumTransactionId(..) > > .tableProperties(orcTblProps); // <<== > OrcOutputFormat orcOutput = new ... > orcOutput.getRecordUpdater(partitionPath, updaterOptions ); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11774) Show macro definition for desc function
[ https://issues.apache.org/jira/browse/HIVE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738812#comment-14738812 ] Damien Carol commented on HIVE-11774: - [~navis] Should we add a {{DESC MACRO foo}} statement? > Show macro definition for desc function > > > Key: HIVE-11774 > URL: https://issues.apache.org/jira/browse/HIVE-11774 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-11774.1.patch.txt > > > Currently, desc function shows nothing for macros. It would be helpful if it > showed their definition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
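For reference, a macro is defined with CREATE TEMPORARY MACRO; the request is for DESC FUNCTION (or a dedicated statement) to echo the body back. The macro name and body here are illustrative:

{code:sql}
CREATE TEMPORARY MACRO sigmoid (x DOUBLE) 1.0 / (1.0 + EXP(-x));

-- Today this prints no definition for a macro; with the proposed change
-- it would show the expression above:
DESC FUNCTION sigmoid;
{code}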
[jira] [Commented] (HIVE-11707) Implement "dump metastore"
[ https://issues.apache.org/jira/browse/HIVE-11707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730873#comment-14730873 ] Damien Carol commented on HIVE-11707: - This feature is a must-have. We're currently using MySQL dumps stored in HDFS. This feature could make things a lot better. > Implement "dump metastore" > -- > > Key: HIVE-11707 > URL: https://issues.apache.org/jira/browse/HIVE-11707 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Navis >Assignee: Navis >Priority: Minor > > In projects, we've frequently needed to copy an existing metastore to > another database (for other versions of hive or other engines like impala, tajo, > spark, etc.). RDBs support dumping metastore data into a series of SQL statements, but > these need to be translated before applying them if we use a different RDB, which is > time-consuming, error-prone work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11175) create function using jar does not work with sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-11175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11175: Description: {code:sql}create function xxx as 'xxx' using jar 'file://foo.jar' {code} fails with an error that ADMIN privileges are needed to access the local foo.jar resource. Same for HDFS (DFS_URI). The problem is that semantic analysis enforces the ADMIN privilege for a write, but the jar is clearly an input, not an output. Patch and test case appended. was: {{create function xxx as 'xxx' using jar 'file://foo.jar' }} gives error code for need of accessing a local foo.jar resource with ADMIN privileges. Same for HDFS (DFS_URI) . problem is that the semantic analysis enforces the ADMIN privilege for write but the jar is clearly input not output. Patch und Testcase appendend. create function using jar does not work with sql std authorization -- Key: HIVE-11175 URL: https://issues.apache.org/jira/browse/HIVE-11175 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 1.2.0 Reporter: Olaf Flebbe Fix For: 2.0.0 Attachments: HIVE-11175.1.patch {code:sql}create function xxx as 'xxx' using jar 'file://foo.jar' {code} fails with an error that ADMIN privileges are needed to access the local foo.jar resource. Same for HDFS (DFS_URI). The problem is that semantic analysis enforces the ADMIN privilege for a write, but the jar is clearly an input, not an output. Patch and test case appended. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10924) add support for MERGE statement
[ https://issues.apache.org/jira/browse/HIVE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712986#comment-14712986 ] Damien Carol commented on HIVE-10924: - [~ekoifman] Any progress on this one? add support for MERGE statement --- Key: HIVE-10924 URL: https://issues.apache.org/jira/browse/HIVE-10924 Project: Hive Issue Type: New Feature Components: Query Planning, Query Processor Affects Versions: 1.2.0 Reporter: Eugene Koifman Assignee: Eugene Koifman add support for MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
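A sketch of the SQL-standard MERGE shape being requested, expanding the elided branches in the issue summary (target/source tables and columns are hypothetical):

{code:sql}
MERGE INTO tbl AS t
USING src AS s
  ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET value = s.value
WHEN NOT MATCHED THEN
  INSERT VALUES (s.id, s.value);
{code}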
[jira] [Updated] (HIVE-11387) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix reduce_deduplicate optimization
[ https://issues.apache.org/jira/browse/HIVE-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11387: Description: The main problem is that, due to return path, now we may have {{(RS1-GBY2)\-(RS3-GBY4)}} when map.aggr=false, i.e., no map aggr. However, in the non-return path, it will be treated as {{(RS1)-(GBY2-RS3-GBY4)}}. The main problem is that it does not take into account of the setting. (was: {noformat} The main problem is that, due to return path, now we may have (RS1-GBY2)-(RS3-GBY4) when map.aggr=false, i.e., no map aggr. However, in the non-return path, it will be treated as (RS1)-(GBY2-RS3-GBY4). The main problem is that it does not take into account of the setting. {noformat}) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix reduce_deduplicate optimization -- Key: HIVE-11387 URL: https://issues.apache.org/jira/browse/HIVE-11387 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11387.01.patch, HIVE-11387.02.patch, HIVE-11387.03.patch, HIVE-11387.04.patch, HIVE-11387.05.patch, HIVE-11387.06.patch The main problem is that, due to return path, now we may have {{(RS1-GBY2)\-(RS3-GBY4)}} when map.aggr=false, i.e., no map aggr. However, in the non-return path, it will be treated as {{(RS1)-(GBY2-RS3-GBY4)}}. The main problem is that it does not take into account of the setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10171) Create a storage-api module
[ https://issues.apache.org/jira/browse/HIVE-10171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10171: Description: To support high performance file formats, I'd like to propose that we move the minimal set of classes that are required to integrate with Hive into a new module named storage-api. This module will include VectorizedRowBatch, the various ColumnVector classes, and the SARG classes. It will form the start of an API that high performance storage formats can use to integrate with Hive. Both ORC and Parquet can use the new API to support vectorization and SARGs without performance destroying shims. (was: To support high performance file formats, I'd like to propose that we move the minimal set of classes that are required to integrate with Hive in to a new module named storage-api. This module will include VectorizedRowBatch, the various ColumnVector classes, and the SARG classes. It will form the start of an API that high performance storage formats can use to integrate with Hive. Both ORC and Parquet can use the new API to support vectorization and SARGs without performance destroying shims.) Create a storage-api module --- Key: HIVE-10171 URL: https://issues.apache.org/jira/browse/HIVE-10171 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Fix For: 2.0.0 To support high performance file formats, I'd like to propose that we move the minimal set of classes that are required to integrate with Hive into a new module named storage-api. This module will include VectorizedRowBatch, the various ColumnVector classes, and the SARG classes. It will form the start of an API that high performance storage formats can use to integrate with Hive. Both ORC and Parquet can use the new API to support vectorization and SARGs without performance destroying shims. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11285) ObjectInspector for partition columns in FetchOperator in SMBJoin causes exception
[ https://issues.apache.org/jira/browse/HIVE-11285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11285: Description: STEPS TO REPRODUCE: {noformat} *$ cat data.out 1|One 2|Two {noformat} {code:sql} hql CREATE TABLE data_table (key INT, value STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'; LOAD DATA LOCAL INPATH '${system:user.dir}/data.out' INTO TABLE data_table; CREATE TABLE smb_table (key INT, value STRING) CLUSTERED BY (key) SORTED BY (key) INTO 1 BUCKETS STORED AS ORC; CREATE TABLE smb_table_part (key INT, value STRING) PARTITIONED BY (p1 DECIMAL) CLUSTERED BY (key) SORTED BY (key) INTO 1 BUCKETS STORED AS ORC; INSERT OVERWRITE TABLE smb_table SELECT * FROM data_table; INSERT OVERWRITE TABLE smb_table_part PARTITION (p1) SELECT key, value, 100 as p1 FROM data_table; SET hive.execution.engine=mr; SET hive.enforce.sortmergebucketmapjoin=false; SET hive.auto.convert.sortmerge.join=true; SET hive.optimize.bucketmapjoin = true; SET hive.optimize.bucketmapjoin.sortedmerge = true; SET hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat; SELECT s1.key, s2.p1 FROM smb_table s1 INNER JOIN smb_table_part s2 ON s1.key = s2.key ORDER BY s1.key; {code} ERROR: {noformat} 2015-07-15 13:39:04,333 WARN main org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:1,value:One} at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:185) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {key:1,value:One} at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:503) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:176) ... 8 more Caused by: java.lang.RuntimeException: Map local work failed at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.fetchOneRow(SMBMapJoinOperator.java:569) at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.fetchNextGroup(SMBMapJoinOperator.java:429) at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.processOp(SMBMapJoinOperator.java:260) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:120) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:493) ... 9 more Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to java.lang.Integer at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaIntObjectInspector.getPrimitiveWritableObject(JavaIntObjectInspector.java:35) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:305) at org.apache.hadoop.hive.ql.exec.JoinUtil.computeValues(JoinUtil.java:193) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getFilteredValue(CommonJoinOperator.java:408) at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.processOp(SMBMapJoinOperator.java:270) at org.apache.hadoop.hive.ql.exec.SMBMapJoinOperator.fetchOneRow(SMBMapJoinOperator.java:558) ... 
17 more {noformat} was: {code} STEPS TO REPRODUCE: *$ cat data.out 1|One 2|Two hql CREATE TABLE data_table (key INT, value STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'; LOAD DATA LOCAL INPATH '$ {system:user.dir} /data.out' INTO TABLE data_table; CREATE TABLE smb_table (key INT, value STRING) CLUSTERED BY (key) SORTED BY (key) INTO 1 BUCKETS STORED AS ORC; CREATE TABLE smb_table_part (key INT, value STRING) PARTITIONED BY (p1 DECIMAL) CLUSTERED BY (key) SORTED BY (key) INTO 1 BUCKETS STORED AS ORC; INSERT OVERWRITE TABLE smb_table SELECT * FROM data_table; INSERT OVERWRITE TABLE smb_table_part PARTITION (p1) SELECT key, value, 100 as p1 FROM data_table; SET hive.execution.engine=mr; SET hive.enforce.sortmergebucketmapjoin=false; SET hive.auto.convert.sortmerge.join=true; SET hive.optimize.bucketmapjoin = true; SET hive.optimize.bucketmapjoin.sortedmerge = true; SET hive.input.format =
[jira] [Commented] (HIVE-11247) Problems after upgrading Hive
[ https://issues.apache.org/jira/browse/HIVE-11247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626284#comment-14626284 ] Damien Carol commented on HIVE-11247: - [~41358796] could you translate it into English? Problems after upgrading Hive -- Key: HIVE-11247 URL: https://issues.apache.org/jira/browse/HIVE-11247 Project: Hive Issue Type: Bug Reporter: hongyan Assignee: hongyan Priority: Critical We are currently using Hive 0.12.0 with Hadoop 1.1.2. After upgrading Hive to 1.2.1, SELECT queries fail. The official site says Hadoop 1.x.y is supported, but according to what I found online this is a Hive/Hadoop version incompatibility. Please advise. Exception in thread main java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.unset(Ljava/lang/String;)V at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:77) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11247) Problems after upgrading Hive
[ https://issues.apache.org/jira/browse/HIVE-11247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11247: Description: We are currently using Hive 0.12.0 with Hadoop 1.1.2. After upgrading Hive to 1.2.1, SELECT queries fail. The official site says Hadoop 1.x.y is supported, but according to what I found online this is a Hive/Hadoop version incompatibility. Please advise. {noformat} Exception in thread main java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.unset(Ljava/lang/String;)V at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:77) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) {noformat} was: We are currently using Hive 0.12.0 with Hadoop 1.1.2. After upgrading Hive to 1.2.1, SELECT queries fail. The official site says Hadoop 1.x.y is supported, but according to what I found online this is a Hive/Hadoop version incompatibility. Please advise. Exception in thread main java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.unset(Ljava/lang/String;)V at
org.apache.hadoop.hive.ql.io.HiveInputFormat.pushFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:77) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Problems after upgrading Hive -- Key: HIVE-11247 URL: https://issues.apache.org/jira/browse/HIVE-11247 Project: Hive Issue Type: Bug Reporter: hongyan Assignee: hongyan Priority: Critical We are currently using Hive 0.12.0 with Hadoop 1.1.2. After upgrading Hive to 1.2.1, SELECT queries fail. The official site says Hadoop 1.x.y is supported, but according to what I found online this is a Hive/Hadoop version incompatibility. Please advise. {noformat} Exception in thread main java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.unset(Ljava/lang/String;)V at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:77) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at
org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at
[jira] [Commented] (HIVE-11145) Remove OFFLINE and NO_DROP from tables and partitions
[ https://issues.apache.org/jira/browse/HIVE-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14608209#comment-14608209 ] Damien Carol commented on HIVE-11145: - [~alangates] You forgot {{AlterTableDesc.ALTERPARTITIONPROTECTMODE}} Remove OFFLINE and NO_DROP from tables and partitions - Key: HIVE-11145 URL: https://issues.apache.org/jira/browse/HIVE-11145 Project: Hive Issue Type: Improvement Components: Metastore, SQL Affects Versions: 2.0.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-11145.patch Currently a table or partition can be marked no_drop or offline. This prevents users from dropping or reading (and dropping) the table or partition. This was built in 0.7 before SQL standard authorization was an option. This is an expensive feature as when a table is dropped every partition must be fetched and checked to make sure it can be dropped. This feature is also redundant now that real authorization is available in Hive. This feature should be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
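For context, these are the protect-mode statements whose support this issue removes (the table name is illustrative):

{code:sql}
-- Pre-removal syntax: mark a table so it cannot be dropped / read:
ALTER TABLE sales ENABLE NO_DROP;
ALTER TABLE sales ENABLE OFFLINE;

-- And the corresponding re-enable:
ALTER TABLE sales DISABLE NO_DROP;
ALTER TABLE sales DISABLE OFFLINE;
{code}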
[jira] [Commented] (HIVE-11145) Remove OFFLINE and NO_DROP from tables and partitions
[ https://issues.apache.org/jira/browse/HIVE-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14608213#comment-14608213 ] Damien Carol commented on HIVE-11145: - In v1, in file ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java (line 55) Remove OFFLINE and NO_DROP from tables and partitions - Key: HIVE-11145 URL: https://issues.apache.org/jira/browse/HIVE-11145 Project: Hive Issue Type: Improvement Components: Metastore, SQL Affects Versions: 2.0.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-11145.patch Currently a table or partition can be marked no_drop or offline. This prevents users from dropping or reading (and dropping) the table or partition. This was built in 0.7 before SQL standard authorization was an option. This is an expensive feature as when a table is dropped every partition must be fetched and checked to make sure it can be dropped. This feature is also redundant now that real authorization is available in Hive. This feature should be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11149) Fix issue with Thread unsafe Class HashMap in PerfLogger.java hangs in Multi-thread environment
[ https://issues.apache.org/jira/browse/HIVE-11149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11149: Fix Version/s: (was: 1.2.0) Fix issue with Thread unsafe Class HashMap in PerfLogger.java hangs in Multi-thread environment --- Key: HIVE-11149 URL: https://issues.apache.org/jira/browse/HIVE-11149 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 1.2.0 Reporter: WangMeng Assignee: WangMeng Attachments: HIVE-11149.01.patch In a multi-threaded environment, the thread-unsafe HashMap in PerfLogger.java can cause large numbers of Java processes to hang and waste large amounts of CPU and memory.
[jira] [Commented] (HIVE-11149) Fix issue with Thread unsafe Class HashMap in PerfLogger.java hangs in Multi-thread environment
[ https://issues.apache.org/jira/browse/HIVE-11149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14608345#comment-14608345 ] Damien Carol commented on HIVE-11149: - Removed fix version 1.2.0 as this version is already released. Fix issue with Thread unsafe Class HashMap in PerfLogger.java hangs in Multi-thread environment --- Key: HIVE-11149 URL: https://issues.apache.org/jira/browse/HIVE-11149 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 1.2.0 Reporter: WangMeng Assignee: WangMeng Attachments: HIVE-11149.01.patch In a multi-threaded environment, the thread-unsafe HashMap in PerfLogger.java can cause large numbers of Java processes to hang and waste large amounts of CPU and memory.
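For context on the bug class above: unsynchronized HashMap access from multiple threads can corrupt the map's internal bucket table and leave threads spinning forever. A minimal sketch of the standard remedy, swapping in ConcurrentHashMap, follows; the class and field names here are hypothetical and this is not the actual HIVE-11149 patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class PerfLogSketch {
    // Hypothetical stand-in for a PerfLogger-style start-time map.
    // ConcurrentHashMap tolerates concurrent put/remove without corruption.
    private final Map<String, Long> startTimes = new ConcurrentHashMap<>();

    public void begin(String method) {
        startTimes.put(method, System.currentTimeMillis());
    }

    public long end(String method) {
        Long start = startTimes.remove(method);
        return start == null ? -1L : System.currentTimeMillis() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        PerfLogSketch log = new PerfLogSketch();
        int threads = 8;
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            final int id = i;
            new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    log.begin("m" + id + "-" + j);
                    log.end("m" + id + "-" + j);
                }
                done.countDown();
            }).start();
        }
        done.await();
        // With a plain HashMap this workload can corrupt the bucket table
        // and hang; with ConcurrentHashMap it finishes and the map is empty.
        System.out.println(log.startTimes.isEmpty());
    }
}
```

Each begin/end pair adds and then removes a unique key, so after all eight threads finish the map must be empty, which the final line confirms.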
[jira] [Commented] (HIVE-10673) Dynamically partitioned hash join for Tez
[ https://issues.apache.org/jira/browse/HIVE-10673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606278#comment-14606278 ] Damien Carol commented on HIVE-10673: - Dynamically partitioned hash join for Tez - Key: HIVE-10673 URL: https://issues.apache.org/jira/browse/HIVE-10673 Project: Hive Issue Type: New Feature Components: Query Planning, Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-10673.1.patch, HIVE-10673.2.patch, HIVE-10673.3.patch, HIVE-10673.4.patch, HIVE-10673.5.patch Reduce-side hash join (using MapJoinOperator), where the Tez inputs to the reducer are unsorted.
[jira] [Updated] (HIVE-11112) ISO-8859-1 text output has fragments of previous longer rows appended
[ https://issues.apache.org/jira/browse/HIVE-11112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11112: Description: If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query results for a string column are incorrect for any row that was preceded by a row containing a longer string. Example steps to reproduce: 1. Create a table using ISO 8859-1 encoding: {code:sql} CREATE TABLE person_lat1 (name STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ('serialization.encoding'='ISO8859_1'); {code} 2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder in HDFS. I'll attach an example file containing the following text: {noformat} Müller,Thomas Jørgensen,Jørgen Peña,Andrés Nåm,Fæk {noformat} 3. Execute {{SELECT * FROM person_lat1}} Result - The following output appears:
{noformat}
+-------------------+
| person_lat1.name  |
+-------------------+
| Müller,Thomas     |
| Jørgensen,Jørgen  |
| Peña,Andrésørgen  |
| Nåm,Fækdrésørgen  |
+-------------------+
{noformat}
was: If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query results for a string column are incorrect for any row that was preceded by a row containing a longer string. Example steps to reproduce: 1. Create a table using ISO 8859-1 encoding: CREATE TABLE person_lat1 (name STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ('serialization.encoding'='ISO8859_1'); 2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder in HDFS. I'll attach an example file containing the following text: Müller,Thomas Jørgensen,Jørgen Peña,Andrés Nåm,Fæk 3. Execute SELECT * FROM person_lat1 Result - The following output appears:
+-------------------+
| person_lat1.name  |
+-------------------+
| Müller,Thomas     |
| Jørgensen,Jørgen  |
| Peña,Andrésørgen  |
| Nåm,Fækdrésørgen  |
+-------------------+
ISO-8859-1 text output has fragments of previous longer rows appended - Key: HIVE-11112 URL: https://issues.apache.org/jira/browse/HIVE-11112 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11112.1.patch If a LazySimpleSerDe table is created using ISO 8859-1 encoding, query results for a string column are incorrect for any row that was preceded by a row containing a longer string. Example steps to reproduce: 1. Create a table using ISO 8859-1 encoding: {code:sql} CREATE TABLE person_lat1 (name STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ('serialization.encoding'='ISO8859_1'); {code} 2. Copy an ISO-8859-1 encoded text file into the appropriate warehouse folder in HDFS. I'll attach an example file containing the following text: {noformat} Müller,Thomas Jørgensen,Jørgen Peña,Andrés Nåm,Fæk {noformat} 3. Execute {{SELECT * FROM person_lat1}} Result - The following output appears:
{noformat}
+-------------------+
| person_lat1.name  |
+-------------------+
| Müller,Thomas     |
| Jørgensen,Jørgen  |
| Peña,Andrésørgen  |
| Nåm,Fækdrésørgen  |
+-------------------+
{noformat}
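The symptom above (the decoded string keeps the tail of the previous, longer row) is the classic reused-buffer pattern: a reader recycles one byte array across rows but decodes with a stale length. A hypothetical Java reproduction follows; this is not Hive's actual LazySimpleSerDe code, and the high-water-mark length bookkeeping is invented purely to show the mechanism.

```java
import java.nio.charset.StandardCharsets;

public class StaleBufferSketch {
    public static void main(String[] args) {
        // One byte buffer reused across rows, as a serde reader might do.
        byte[] buf = new byte[64];
        int usedLen = 0;  // hypothetical stale "high-water mark" length
        String[] rows = {"Jørgensen,Jørgen", "Peña,Andrés"};
        String buggy = null, fixed = null;
        for (String row : rows) {
            byte[] b = row.getBytes(StandardCharsets.ISO_8859_1);
            System.arraycopy(b, 0, buf, 0, b.length);
            usedLen = Math.max(usedLen, b.length);
            // BUG pattern: decode with the longest length seen so far,
            // so the previous row's trailing bytes leak into the result.
            buggy = new String(buf, 0, usedLen, StandardCharsets.ISO_8859_1);
            // FIX pattern: decode exactly the current row's byte count.
            fixed = new String(buf, 0, b.length, StandardCharsets.ISO_8859_1);
        }
        System.out.println(buggy);  // "Peña,Andrésørgen" — previous row's tail
        System.out.println(fixed);  // "Peña,Andrés"
    }
}
```

The buggy decode reproduces exactly the "Peña,Andrésørgen" output shown in the issue: 11 fresh bytes followed by bytes 11-15 left over from "Jørgensen,Jørgen".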
[jira] [Updated] (HIVE-10880) The bucket number is not respected in insert overwrite.
[ https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10880: Description: When hive.enforce.bucketing is true, the bucket number defined in the table is no longer respected in current master and 1.2. Reproduce: {code:sql} CREATE TABLE IF NOT EXISTS buckettestinput( data string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput1( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput2( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; {code} Then I inserted the following data into the buckettestinput table: {noformat} firstinsert1 firstinsert2 firstinsert3 firstinsert4 firstinsert5 firstinsert6 firstinsert7 firstinsert8 secondinsert1 secondinsert2 secondinsert3 secondinsert4 secondinsert5 secondinsert6 secondinsert7 secondinsert8 {noformat} {code:sql} set hive.enforce.bucketing = true; set hive.enforce.sorting=true; insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'; set hive.auto.convert.sortmerge.join=true; set hive.optimize.bucketmapjoin = true; set hive.optimize.bucketmapjoin.sortedmerge = true; select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data); {code} {noformat} Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. 
The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 (state=42000,code=10141) {noformat} The related debug information related to insert overwrite: {noformat} 0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'insert overwrite table buckettestoutput1 0: jdbc:hive2://localhost:1 ; select * from buckettestinput where data like ' first%'; INFO : Number of reduce tasks determined at compile time: 2 INFO : In order to change the average load for a reducer (in bytes): INFO : set hive.exec.reducers.bytes.per.reducer=number INFO : In order to limit the maximum number of reducers: INFO : set hive.exec.reducers.max=number INFO : In order to set a constant number of reducers: INFO : set mapred.reduce.tasks=number INFO : Job running in-process (local Hadoop) INFO : 2015-06-01 11:09:29,650 Stage-1 map = 86%, reduce = 100% INFO : Ended Job = job_local107155352_0001 INFO : Loading data to table default.buckettestoutput1 from file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1 INFO : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, totalSize=52, rawDataSize=48] No rows affected (1.692 seconds) {noformat} Insert use dynamic partition does not have the issue. was: When hive.enforce.bucketing is true, the bucket number defined in the table is no longer respected in current master and 1.2. 
Reproduce: {noformat} CREATE TABLE IF NOT EXISTS buckettestinput( data string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput1( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput2( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; Then I inserted the following data into the buckettestinput table firstinsert1 firstinsert2 firstinsert3 firstinsert4 firstinsert5 firstinsert6 firstinsert7 firstinsert8 secondinsert1 secondinsert2 secondinsert3 secondinsert4 secondinsert5 secondinsert6 secondinsert7 secondinsert8 set hive.enforce.bucketing = true; set hive.enforce.sorting=true; insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'; set hive.auto.convert.sortmerge.join=true; set hive.optimize.bucketmapjoin = true; set hive.optimize.bucketmapjoin.sortedmerge = true; select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data); Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 (state=42000,code=10141) {noformat} The related debug information related to insert overwrite: {noformat} 0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'insert overwrite table buckettestoutput1 0: jdbc:hive2://localhost:1 ; select * from buckettestinput where
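For context on the failing check above: Hive routes each row of a bucketed table to bucket {{(hash(key) & Integer.MAX_VALUE) % numBuckets}} and the bucket-map-join validation expects one file per declared bucket, so error 10141 fires when the insert leaves fewer files than buckets. A rough Java sketch of that assignment follows; it is a simplification, since Hive's real hashing goes through ObjectInspector-based hash codes rather than raw String.hashCode.

```java
public class BucketSketch {
    // Simplified bucket assignment for a STRING clustering column.
    // Assumption: Hive's actual ObjectInspector-based hashing may differ.
    static int bucketFor(String key, int numBuckets) {
        return (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int numBuckets = 2;
        String[] rows = {"firstinsert1", "firstinsert2", "firstinsert3",
                         "firstinsert4", "firstinsert5", "firstinsert6",
                         "firstinsert7", "firstinsert8"};
        int[] rowsPerBucket = new int[numBuckets];
        for (String r : rows) {
            rowsPerBucket[bucketFor(r, numBuckets)]++;
        }
        // A correct enforced-bucketing insert writes one file per bucket;
        // the bug report shows a single output file for a 2-bucket table,
        // which is what trips SemanticException [Error 10141].
        System.out.println(rowsPerBucket[0] + rowsPerBucket[1]);
    }
}
```

All eight input rows are distributed across the two buckets, so the printed total row count is 8 regardless of how the hash splits them.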
[jira] [Commented] (HIVE-10958) Centos: TestMiniTezCliDriver.testCliDriver_mergejoin fails
[ https://issues.apache.org/jira/browse/HIVE-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14600966#comment-14600966 ] Damien Carol commented on HIVE-10958: - [~pxiong] why Centos in the JIRA name and description? Centos: TestMiniTezCliDriver.testCliDriver_mergejoin fails -- Key: HIVE-10958 URL: https://issues.apache.org/jira/browse/HIVE-10958 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.1, 2.0.0 Attachments: HIVE-10958.01.patch Centos: TestMiniTezCliDriver.testCliDriver_mergejoin fails due to the statement set mapred.reduce.tasks = 18;
[jira] [Commented] (HIVE-10795) Remove use of PerfLogger from Orc
[ https://issues.apache.org/jira/browse/HIVE-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14597871#comment-14597871 ] Damien Carol commented on HIVE-10795: - [~owen.omalley] you should remove _CLASS_NAME_ (line:41) Remove use of PerfLogger from Orc - Key: HIVE-10795 URL: https://issues.apache.org/jira/browse/HIVE-10795 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-10795.patch, HIVE-10795.patch PerfLogger is yet another class with a huge dependency set that Orc doesn't need.
[jira] [Updated] (HIVE-11040) Change Derby dependency version to 10.10.2.0
[ https://issues.apache.org/jira/browse/HIVE-11040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11040: Component/s: Metastore Change Derby dependency version to 10.10.2.0 Key: HIVE-11040 URL: https://issues.apache.org/jira/browse/HIVE-11040 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jason Dere Assignee: Jason Dere Fix For: 1.2.1, 2.0.0 Attachments: HIVE-11040.1.patch We don't see this on the Apache pre-commit tests because it uses PTest, but running the entire TestCliDriver suite results in failures in some of the partition-related qtests (partition_coltype_literals, partition_date, partition_date2). I've only really seen this on Linux (I was using CentOS). HIVE-8879 changed the Derby dependency version from 10.10.1.1 to 10.11.1.1. Testing with 10.10.1.1 or 10.10.2.0 seems to allow the partition-related tests to pass. I'd like to change the dependency version to 10.10.2.0, since that version should also contain the fix for HIVE-8879.
[jira] [Updated] (HIVE-11040) Change Derby dependency version to 10.10.2.0
[ https://issues.apache.org/jira/browse/HIVE-11040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11040: Fix Version/s: 2.0.0 Change Derby dependency version to 10.10.2.0 Key: HIVE-11040 URL: https://issues.apache.org/jira/browse/HIVE-11040 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Fix For: 1.2.1, 2.0.0 Attachments: HIVE-11040.1.patch We don't see this on the Apache pre-commit tests because it uses PTest, but running the entire TestCliDriver suite results in failures in some of the partition-related qtests (partition_coltype_literals, partition_date, partition_date2). I've only really seen this on Linux (I was using CentOS). HIVE-8879 changed the Derby dependency version from 10.10.1.1 to 10.11.1.1. Testing with 10.10.1.1 or 10.10.2.0 seems to allow the partition-related tests to pass. I'd like to change the dependency version to 10.10.2.0, since that version should also contain the fix for HIVE-8879.
[jira] [Updated] (HIVE-10746) Hive 1.2.0+Tez produces 1-byte FileSplits from mapred.TextInputFormat
[ https://issues.apache.org/jira/browse/HIVE-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-10746: Description: The following query: {code:sql} SELECT appl_user_id, arsn_cd, COUNT(*) as RecordCount FROM adw.crc_arsn GROUP BY appl_user_id,arsn_cd ORDER BY appl_user_id; {code} runs consistently fast in Spark and MapReduce on Hive 1.2.0. When attempting to run this same query with Tez as the execution engine, it consistently runs for 300-500 seconds, which seems extremely long. This is a basic external table delimited by tabs and is a single file in a folder. In Hive 0.13 this query with Tez runs fast. I tested with Hive 0.14, 0.14.1/1.0.0 and now Hive 1.2.0, and there clearly is something going awry with Hive w/Tez as an execution engine on single-file or small-file tables. I can attach further logs if someone needs them for deeper analysis. HDFS Output: {noformat} hadoop fs -ls /example_dw/crc/arsn Found 2 items -rwxr-x--- 6 loaduser hadoopusers 0 2015-05-17 20:03 /example_dw/crc/arsn/_SUCCESS -rwxr-x--- 6 loaduser hadoopusers 3883880 2015-05-17 20:03 /example_dw/crc/arsn/part-m-0 {noformat} Hive Table Describe: {noformat} hive describe formatted crc_arsn; OK # col_name data_type comment arsn_cd string clmlvl_cd string arclss_cd string arclssg_cd string arsn_prcsr_rmk_ind string arsn_mbr_rspns_ind string savtyp_cd string arsn_eff_dt string arsn_exp_dt string arsn_pstd_dts string arsn_lstupd_dts string arsn_updrsn_txt string appl_user_id string arsntyp_cd string pre_d_indicator string arsn_display_txt string arstat_cd string arsn_tracking_no string arsn_cstspcfc_ind string arsn_mstr_rcrd_ind string state_specific_ind string region_specific_in string arsn_dpndnt_cd string unit_adjustment_in string arsn_mbr_only_ind string arsn_qrmb_ind string # Detailed Table Information Database: adw Owner: loadu...@exa.example.com CreateTime: Mon Apr 28 13:28:05 EDT 2014 LastAccessTime: UNKNOWN Protect Mode: None Retention: 0 Location: hdfs://xhadnnm1p.example.com:8020/example_dw/crc/arsn Table Type: EXTERNAL_TABLE Table Parameters: EXTERNAL TRUE transient_lastDdlTime 1398706085 # Storage Information SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe InputFormat: org.apache.hadoop.mapred.TextInputFormat OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Compressed: No Num Buckets: -1 Bucket Columns: [] Sort Columns: [] Storage Desc Params: field.delim \t line.delim \n serialization.format \t Time taken: 1.245 seconds, Fetched: 54 row(s) {noformat} Explain Hive 1.2.0 w/Tez: {noformat} STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Reducer 2 - Map 1 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) Explain Hive 0.13 w/Tez: STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Tez
[jira] [Updated] (HIVE-9511) Switch Tez to 0.6.1
[ https://issues.apache.org/jira/browse/HIVE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9511: --- Attachment: HIVE-9511.4.patch Switch Tez to 0.6.1 --- Key: HIVE-9511 URL: https://issues.apache.org/jira/browse/HIVE-9511 Project: Hive Issue Type: Improvement Components: Tez Reporter: Damien Carol Assignee: Damien Carol Attachments: HIVE-9511.2.patch, HIVE-9511.3.patch.txt, HIVE-9511.4.patch, HIVE-9511.patch.txt Tez 0.6.1 has been released. Research to switch to version 0.6.1
[jira] [Updated] (HIVE-9511) Switch Tez to 0.6.1
[ https://issues.apache.org/jira/browse/HIVE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9511: --- Attachment: (was: HIVE-9511.4.patch.txt) Switch Tez to 0.6.1 --- Key: HIVE-9511 URL: https://issues.apache.org/jira/browse/HIVE-9511 Project: Hive Issue Type: Improvement Components: Tez Reporter: Damien Carol Assignee: Damien Carol Attachments: HIVE-9511.2.patch, HIVE-9511.3.patch.txt, HIVE-9511.4.patch, HIVE-9511.patch.txt Tez 0.6.1 has been released. Research to switch to version 0.6.1
[jira] [Updated] (HIVE-11006) improve logging wrt ACID module
[ https://issues.apache.org/jira/browse/HIVE-11006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-11006: Fix Version/s: 2.0.0 improve logging wrt ACID module --- Key: HIVE-11006 URL: https://issues.apache.org/jira/browse/HIVE-11006 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 1.2.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 1.2.1, 2.0.0 Attachments: HIVE-11006.2.patch, HIVE-11006.patch especially around metastore DB operations (TxnHandler) which are retried or fail for some reason.
[jira] [Updated] (HIVE-9511) Switch Tez to 0.6.1
[ https://issues.apache.org/jira/browse/HIVE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9511: --- Summary: Switch Tez to 0.6.1 (was: Switch Tez to 0.6.0) Switch Tez to 0.6.1 --- Key: HIVE-9511 URL: https://issues.apache.org/jira/browse/HIVE-9511 Project: Hive Issue Type: Improvement Components: Tez Reporter: Damien Carol Assignee: Damien Carol Attachments: HIVE-9511.2.patch, HIVE-9511.3.patch.txt, HIVE-9511.patch.txt Tez 0.6.0 has been released. Research to switch to version 0.6.0
[jira] [Updated] (HIVE-9511) Switch Tez to 0.6.1
[ https://issues.apache.org/jira/browse/HIVE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9511: --- Component/s: Tez Switch Tez to 0.6.1 --- Key: HIVE-9511 URL: https://issues.apache.org/jira/browse/HIVE-9511 Project: Hive Issue Type: Improvement Components: Tez Reporter: Damien Carol Assignee: Damien Carol Attachments: HIVE-9511.2.patch, HIVE-9511.3.patch.txt, HIVE-9511.patch.txt Tez 0.6.1 has been released. Research to switch to version 0.6.1
[jira] [Updated] (HIVE-9511) Switch Tez to 0.6.1
[ https://issues.apache.org/jira/browse/HIVE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9511: --- Description: Tez 0.6.1 has been released. Research to switch to version 0.6.1 was: Tez 0.6.0 has been released. Research to switch to version 0.6.0 Switch Tez to 0.6.1 --- Key: HIVE-9511 URL: https://issues.apache.org/jira/browse/HIVE-9511 Project: Hive Issue Type: Improvement Components: Tez Reporter: Damien Carol Assignee: Damien Carol Attachments: HIVE-9511.2.patch, HIVE-9511.3.patch.txt, HIVE-9511.patch.txt Tez 0.6.1 has been released. Research to switch to version 0.6.1
[jira] [Updated] (HIVE-9511) Switch Tez to 0.6.1
[ https://issues.apache.org/jira/browse/HIVE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9511: --- Attachment: HIVE-9511.4.patch.txt Updated to TEZ 0.6.1. Switch Tez to 0.6.1 --- Key: HIVE-9511 URL: https://issues.apache.org/jira/browse/HIVE-9511 Project: Hive Issue Type: Improvement Components: Tez Reporter: Damien Carol Assignee: Damien Carol Attachments: HIVE-9511.2.patch, HIVE-9511.3.patch.txt, HIVE-9511.4.patch.txt, HIVE-9511.patch.txt Tez 0.6.1 has been released. Research to switch to version 0.6.1