[jira] [Commented] (HIVE-6586) Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos)
[ https://issues.apache.org/jira/browse/HIVE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051129#comment-14051129 ] Lefty Leverenz commented on HIVE-6586: -- HIVE-6697 added hive.server2.authentication.spnego.keytab and hive.server2.authentication.spnego.principal in 0.13.0. They aren't in patch HIVE-6037-0.13.0. Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos) --- Key: HIVE-6586 URL: https://issues.apache.org/jira/browse/HIVE-6586 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Lefty Leverenz Labels: TODOC14 HIVE-6037 puts the definitions of configuration parameters into the HiveConf.java file, but several recent jiras for release 0.13.0 introduce new parameters that aren't in HiveConf.java yet and some parameter definitions need to be altered for 0.13.0. This jira will patch HiveConf.java after HIVE-6037 gets committed. Also, four typos patched in HIVE-6582 need to be fixed in the new HiveConf.java. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6697) HiveServer2 secure thrift/http authentication needs to support SPNego
[ https://issues.apache.org/jira/browse/HIVE-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051130#comment-14051130 ] Lefty Leverenz commented on HIVE-6697: -- *hive.server2.authentication.spnego.keytab* and *hive.server2.authentication.spnego.principal* are documented in the wiki here: * [Configuration Properties -- hive.server2.authentication.spnego.keytab | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.authentication.spnego.keytab] * [Configuration Properties -- hive.server2.authentication.spnego.principal | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.authentication.spnego.principal] I added a comment to HIVE-6586 so they won't get lost in the shuffle when HIVE-6037 changes HiveConf.java. HiveServer2 secure thrift/http authentication needs to support SPNego -- Key: HIVE-6697 URL: https://issues.apache.org/jira/browse/HIVE-6697 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Dilli Arumugam Assignee: Dilli Arumugam Fix For: 0.13.0 Attachments: HIVE-6697.1.patch, HIVE-6697.2.patch, HIVE-6697.3.patch, HIVE-6697.4.patch, hive-6697-req-impl-verify.md Looking to integrate Apache Knox with HiveServer2 secure thrift/http, we found that thrift/http uses a form of Kerberos authentication that is not SPNego. Considering it goes over the HTTP protocol, we expected it to use the SPNego protocol. Apache Knox is already integrated with WebHDFS, WebHCat, Oozie and HBase Stargate using SPNego for authentication. Requesting that HiveServer2 secure thrift/http authentication support SPNego. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7289) revert HIVE-6469
[ https://issues.apache.org/jira/browse/HIVE-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7289: - Tags: (was: TODOC14) revert HIVE-6469 Key: HIVE-7289 URL: https://issues.apache.org/jira/browse/HIVE-7289 Project: Hive Issue Type: Task Components: CLI Affects Versions: 0.14.0 Reporter: Jayesh Assignee: Jayesh Fix For: 0.14.0 Attachments: HIVE-7289.patch this task is to revert HIVE-6469 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7289) revert HIVE-6469
[ https://issues.apache.org/jira/browse/HIVE-7289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051142#comment-14051142 ] Lefty Leverenz commented on HIVE-7289: -- No need to revert documentation for HIVE-6469, because it wasn't done yet. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6469) skipTrash option in hive command line
[ https://issues.apache.org/jira/browse/HIVE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-6469: - Labels: (was: TODOC14) skipTrash option in hive command line - Key: HIVE-6469 URL: https://issues.apache.org/jira/browse/HIVE-6469 Project: Hive Issue Type: New Feature Components: CLI Affects Versions: 0.12.0 Reporter: Jayesh Assignee: Jayesh Fix For: 0.14.0 Attachments: HIVE-6469.1.patch, HIVE-6469.2.patch, HIVE-6469.3.patch, HIVE-6469.patch The current behavior of the Hive metastore during a drop table table_name command is to delete the table data from the HDFS warehouse and put it into Trash. Currently there is no way to provide a flag that tells the warehouse to skip the trash while deleting table data. This ticket adds a skipTrash configuration, hive.warehouse.data.skipTrash, which when set to true skips the trash while dropping table data from the HDFS warehouse. It defaults to false to keep the current behavior. This would be a good feature to add so that a cluster admin can specify when not to put data into the trash directory (e.g. in a dev environment) and thus avoid filling HDFS space, instead of relying on trash interval and policy configuration to handle disk-filling issues. -- This message was sent by Atlassian JIRA (v6.2#6252)
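The proposed toggle amounts to a single branch at drop time. A minimal Python sketch of the intended semantics (hypothetical helper name; the .Trash location is an assumption, and the real change would live in the metastore's Java code, not Python — note the feature was later reverted in HIVE-7289):

```python
def drop_table_target(conf, path):
    """Decide what happens to warehouse data when a table is dropped."""
    if conf.get("hive.warehouse.data.skipTrash", "false") == "true":
        return ("delete", path)  # bypass trash: data is gone immediately
    # default: move to the HDFS trash directory (assumed location)
    return ("move", path, "/user/hive/.Trash")

print(drop_table_target({}, "/warehouse/t"))
# ('move', '/warehouse/t', '/user/hive/.Trash')
print(drop_table_target({"hive.warehouse.data.skipTrash": "true"}, "/warehouse/t"))
# ('delete', '/warehouse/t')
```

The default of false preserves today's recoverable behavior; only an explicit opt-in bypasses the trash.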
[jira] [Commented] (HIVE-6469) skipTrash option in hive command line
[ https://issues.apache.org/jira/browse/HIVE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051144#comment-14051144 ] Lefty Leverenz commented on HIVE-6469: -- This was reverted by HIVE-7289 so no documentation is needed after all. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7346) Wrong results caused by hive ppd under specific join condition
dima machlin created HIVE-7346: -- Summary: Wrong results caused by hive ppd under specific join condition Key: HIVE-7346 URL: https://issues.apache.org/jira/browse/HIVE-7346 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: dima machlin Assuming two tables : {code:sql} t1(id1 string,id2 string) , t2 (id string,d int) {code} t1 contains 1 row : 'a','a' t2 contains 1 row : 'a',2 The following query : {code:sql} select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 {code} returns 0 rows as expected, because t2.d = 2. Wrapping this query, like so : {code:sql} select * from ( select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 ) z where d1 <> 1 or d2 <> 1 {code} where another filter is added on these columns, causes the plan to drop the = 1 filters and return a single row - *Wrong Results*. The plan is : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF (TOK_TABNAME t1) a) (TOK_TABREF (TOK_TABNAME t2) b) (= (. (TOK_TABLE_OR_COL a) id1) (. (TOK_TABLE_OR_COL b) id))) (TOK_TABREF (TOK_TABNAME t2) c) (= (. (TOK_TABLE_OR_COL a) id2) (. (TOK_TABLE_OR_COL b) id (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) d) d1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL c) d) d2)) (TOK_WHERE (and (= (. (TOK_TABLE_OR_COL b) d) 1) (= (. 
(TOK_TABLE_OR_COL c) d) 1) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or (<> (TOK_TABLE_OR_COL d1) 1) (<> (TOK_TABLE_OR_COL d2) 1) STAGE DEPENDENCIES: Stage-7 is a root stage Stage-5 depends on stages: Stage-7 Stage-0 is a root stage STAGE PLANS: Stage: Stage-7 Map Reduce Local Work Alias - Map Local Tables: z:b Fetch Operator limit: -1 z:c Fetch Operator limit: -1 Alias - Map Local Operator Tree: z:b TableScan alias: b HashTable Sink Operator condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] Position of Big Table: 0 z:c TableScan alias: c HashTable Sink Operator condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] Position of Big Table: 0 Stage: Stage-5 Map Reduce Alias - Map Operator Tree: z:a TableScan alias: a Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] outputColumnNames: _col0, _col1, _col4, _col5 Position of Big Table: 0 Filter Operator predicate: expr: (_col1 = _col4) type: boolean Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] outputColumnNames: _col1, _col4, _col5, _col9 Position of Big Table: 0 Filter Operator predicate: expr: ((_col1 <> 1) or (_col9 <> 1)) type: boolean Select Operator expressions: expr: _col4 type: string expr: _col5 type: string expr: _col1 type: int expr: _col9 type: int outputColumnNames: _col0, _col1, _col2, _col3 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Local Work: Map Reduce Local
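The expected semantics of the repro above can be checked in plain Python (an in-memory re-evaluation of the query, not Hive itself): the inner filters b.d = 1 and c.d = 1 already eliminate the only row, so the wrapped query must also be empty, and the single row Hive returns is the bug.

```python
# In-memory model of the HIVE-7346 repro: t1 has ('a','a'), t2 has ('a', 2).
t1 = [("a", "a")]
t2 = [("a", 2)]

def inner_query():
    # select a.*, b.d d1, c.d d2 from t1 a join t2 b on (a.id1 = b.id)
    #   join t2 c on (a.id2 = b.id) where b.d = 1 and c.d = 1
    return [(id1, id2, bd, cd)
            for (id1, id2) in t1
            for (bid, bd) in t2 if id1 == bid
            for (_cid, cd) in t2 if id2 == bid
            if bd == 1 and cd == 1]

def wrapped_query():
    # select * from (inner_query) z where d1 <> 1 or d2 <> 1
    return [r for r in inner_query() if r[2] != 1 or r[3] != 1]

print(inner_query())   # [] -- t2.d is 2, so the = 1 filters drop the row
print(wrapped_query()) # [] -- the outer filter cannot resurrect rows
```

Any non-empty result from the wrapped query therefore indicates the pushed-down predicates were lost.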
[jira] [Commented] (HIVE-6468) HS2 out of memory error when curl sends a get request
[ https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051172#comment-14051172 ] Lefty Leverenz commented on HIVE-6468: -- Thanks for the description of *hive.server2.sasl.message.limit* in hive-default.xml.template. (If you're doing another patch, it would be good to capitalize SASL in the description.) HS2 out of memory error when curl sends a get request - Key: HIVE-6468 URL: https://issues.apache.org/jira/browse/HIVE-6468 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Environment: Centos 6.3, hive 12, hadoop-2.2 Reporter: Abin Shahab Assignee: Navis Attachments: HIVE-6468.1.patch.txt, HIVE-6468.2.patch.txt We see an out of memory error when we run simple beeline calls. (The hive.server2.transport.mode is binary) curl localhost:1 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap space at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.2#6252)
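The failure mode is easy to demonstrate: the binary SASL transport reads the first four bytes of whatever arrives as a big-endian message length, so the ASCII bytes of an HTTP request decode to a gigantic allocation. A short Python sketch (the limit value below is illustrative, not the actual Hive default):

```python
import struct

# curl's "GET / HTTP/1.1 ..." hits the binary Thrift port; the SASL transport
# treats the first 4 bytes as a big-endian frame length.
(bogus_length,) = struct.unpack(">I", b"GET ")
print(bogus_length)  # 1195725856 -- ~1.1 GB, so a naive allocation OOMs

# A cap like hive.server2.sasl.message.limit lets the server reject the frame
# instead of allocating (illustrative limit value):
SASL_MESSAGE_LIMIT = 100 * 1024 * 1024
print(bogus_length > SASL_MESSAGE_LIMIT)  # True -> refuse the connection
```

The value 1195725856 is simply the bytes "GET " read as an unsigned 32-bit integer, which is why stray HTTP traffic on a binary port reliably triggers this class of OOM.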
[jira] [Commented] (HIVE-6586) Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos)
[ https://issues.apache.org/jira/browse/HIVE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051193#comment-14051193 ] Lefty Leverenz commented on HIVE-6586: -- HIVE-5351 added three HiveServer2 configuration parameters in 0.13.0. Patch HIVE-6037-0.13.0 includes them without their descriptions, which are: * hive.server2.use.SSL: Set this to true for using SSL encryption in HiveServer2. * hive.server2.keystore.path: SSL certificate keystore location. * hive.server2.keystore.password: SSL certificate keystore password. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6586) Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos)
[ https://issues.apache.org/jira/browse/HIVE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051212#comment-14051212 ] Lefty Leverenz commented on HIVE-6586: -- HIVE-6643 added hive.exec.check.crossproducts in 0.13.0. It isn't in patch HIVE-6037-0.13.0. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6643) Add a check for cross products in plans and output a warning
[ https://issues.apache.org/jira/browse/HIVE-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051220#comment-14051220 ] Lefty Leverenz commented on HIVE-6643: -- *hive.exec.check.crossproducts* is documented in the wiki here: * [Configuration Properties -- hive.exec.check.crossproducts | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.check.crossproducts] I added a comment to HIVE-6586 so it won't get lost in the shuffle when HIVE-6037 changes HiveConf.java. Add a check for cross products in plans and output a warning Key: HIVE-6643 URL: https://issues.apache.org/jira/browse/HIVE-6643 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Fix For: 0.13.0 Attachments: HIVE-6643.1.patch, HIVE-6643.2.patch, HIVE-6643.3.patch, HIVE-6643.4.patch, HIVE-6643.5.patch, HIVE-6643.6.patch, HIVE-6643.7.patch Now that we support old style join syntax, it is easy to write queries that generate a plan with a cross product. For example, say you have A join B join C join D on A.x = B.x and A.y = D.y and C.z = D.z. So the JoinTree is: A — B |__ D — C Since we don't reorder join graphs, we will end up with a cross product between (A join B) and C. -- This message was sent by Atlassian JIRA (v6.2#6252)
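The check described above amounts to walking the join order and flagging any table that shares no join predicate with what has already been joined. A minimal Python sketch of that idea (hypothetical function, not the Hive implementation, which operates on the operator plan):

```python
def cross_product_steps(join_order, predicates):
    """predicates: iterable of (t1, t2) pairs from equality join conditions."""
    links = {frozenset(p) for p in predicates}
    joined = {join_order[0]}
    warnings = []
    for table in join_order[1:]:
        # no equality predicate links `table` to anything joined so far
        if not any(frozenset((t, table)) in links for t in joined):
            warnings.append(table)
        joined.add(table)
    return warnings

# A join B join C join D on A.x = B.x and A.y = D.y and C.z = D.z:
print(cross_product_steps(["A", "B", "C", "D"],
                          [("A", "B"), ("A", "D"), ("C", "D")]))  # ['C']
```

With the example's join order, C is flagged: nothing links it to (A join B), which is exactly the cross product the warning is meant to surface.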
[jira] [Updated] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization
[ https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7205: Attachment: HIVE-7205.3.patch.txt Wrong results when union all of grouping followed by group by with correlation optimization --- Key: HIVE-7205 URL: https://issues.apache.org/jira/browse/HIVE-7205 Project: Hive Issue Type: Bug Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: dima machlin Assignee: Navis Priority: Critical Attachments: HIVE-7205.1.patch.txt, HIVE-7205.2.patch.txt, HIVE-7205.3.patch.txt use case : table TBL (a string,b string) contains a single row : 'a','a' The following query : {code:sql} select b, sum(cc) from ( select b,count(1) as cc from TBL group by b union all select a as b,count(1) as cc from TBL group by a ) z group by b {code} returns two rows, a 1 and a 1, with set hive.optimize.correlation=true; if we change to set hive.optimize.correlation=false; it returns the correct result : a 2 The plan with correlation optimization : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias - Map Operator Tree: null-subquery1:z-subquery1:TBL TableScan alias: TBL Select Operator expressions: expr: b type: string outputColumnNames: b Group By Operator 
aggregations: expr: count(1) bucketGroup: false keys: expr: b type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col1 type: bigint null-subquery2:z-subquery2:TBL TableScan alias: TBL Select Operator expressions: expr: a type: string outputColumnNames: a Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: a type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Demux Operator Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Union Select Operator expressions: expr: _col0 type: string expr: _col1
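For reference, the correct answer can be computed by evaluating the query's semantics directly in Python on the one-row table (a model of the SQL, not Hive code):

```python
from collections import Counter

TBL = [("a", "a")]  # the single row 'a','a'

by_b = Counter(b for (_a, b) in TBL)  # select b, count(1) ... group by b
by_a = Counter(a for (a, _b) in TBL)  # select a as b, count(1) ... group by a

# union all the two groupings, then: select b, sum(cc) from z group by b
outer = Counter()
for grouping in (by_b, by_a):
    for key, cc in grouping.items():
        outer[key] += cc

print(dict(outer))  # {'a': 2} -- the result Hive gives with correlation off
```

The outer group-by must merge the two branches' partial counts into a single row a 2; emitting two rows means the Demux-based merge skipped that final aggregation.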
[jira] [Updated] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-1662: Attachment: HIVE-1662.18.patch.txt Add file pruning into Hive. --- Key: HIVE-1662 URL: https://issues.apache.org/jira/browse/HIVE-1662 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Navis Attachments: HIVE-1662.10.patch.txt, HIVE-1662.11.patch.txt, HIVE-1662.12.patch.txt, HIVE-1662.13.patch.txt, HIVE-1662.14.patch.txt, HIVE-1662.15.patch.txt, HIVE-1662.16.patch.txt, HIVE-1662.17.patch.txt, HIVE-1662.18.patch.txt, HIVE-1662.8.patch.txt, HIVE-1662.9.patch.txt, HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch, HIVE-1662.D8391.4.patch, HIVE-1662.D8391.5.patch, HIVE-1662.D8391.6.patch, HIVE-1662.D8391.7.patch Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should be able to add only the files that pass the filter to the input paths. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6584) Add HiveHBaseTableSnapshotInputFormat
[ https://issues.apache.org/jira/browse/HIVE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teng Yutong updated HIVE-6584: -- Attachment: HIVE-6584.7.patch Fixed some bugs, but it still needs changes on the HBase side. Add HiveHBaseTableSnapshotInputFormat - Key: HIVE-6584 URL: https://issues.apache.org/jira/browse/HIVE-6584 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: 0.14.0 Attachments: HIVE-6584.0.patch, HIVE-6584.1.patch, HIVE-6584.2.patch, HIVE-6584.3.patch, HIVE-6584.4.patch, HIVE-6584.5.patch, HIVE-6584.6.patch, HIVE-6584.7.patch HBASE-8369 provided mapreduce support for reading from HBase table snapshots. This allows an MR job to consume a stable, read-only view of an HBase table directly off of HDFS. Bypassing the online region server API provides a nice performance boost for the full scan. HBASE-10642 is backporting that feature to 0.94/0.96 and also adding a {{mapred}} implementation. Once that's available, we should add an input format. A follow-on patch could work out how to integrate this functionality into the StorageHandler, similar to how HIVE-6473 integrates the HFileOutputFormat into existing table definitions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051350#comment-14051350 ] Hive QA commented on HIVE-1662: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653796/HIVE-1662.18.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5673 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/670/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/670/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-670/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653796 Add file pruning into Hive. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
OutofMemory Error after starting Hive metastore
Hi All, when I start the Hive metastore by running hive --service metastore, we see this error. Any pointers on how to solve this? Exception in thread pool-6-thread-1 java.lang.OutOfMemoryError: Java heap space at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:353) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:215) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:76) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) la...@ucweb.com
[jira] [Commented] (HIVE-7344) Add streaming support in Windowing mode for FirstVal, LastVal
[ https://issues.apache.org/jira/browse/HIVE-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051376#comment-14051376 ] Ashutosh Chauhan commented on HIVE-7344: +1 Add streaming support in Windowing mode for FirstVal, LastVal - Key: HIVE-7344 URL: https://issues.apache.org/jira/browse/HIVE-7344 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7344.1.patch Continuation of HIVE-7062, HIVE-7143 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051444#comment-14051444 ] Mark Grey commented on HIVE-6050: - Is there a workaround for overcoming the Thrift problem when connecting newer clients to an older hiveserver2? I'd like to hook up Hue to an existing hiveserver in production. JDBC backward compatibility is broken - Key: HIVE-6050 URL: https://issues.apache.org/jira/browse/HIVE-6050 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.13.0 Reporter: Szehon Ho Assignee: Carl Steinbach Priority: Blocker Connecting from the JDBC driver of Hive 0.13 (TProtocolVersion=v4) to a HiveServer2 from Hive 0.10 (TProtocolVersion=v1) returns the following exception: {noformat} java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null) at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336) at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158) at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:187) at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73) at org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187) at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236) at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914) Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null) at org.apache.thrift.TApplicationException.read(TApplicationException.java:108) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71) at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160) at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147) at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327) ... 37 more {noformat} On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, which doesn't seem to be backward-compatible. Look at the code path in the generated file 'TOpenSessionReq.java', method TOpenSessionReqStandardScheme.read(): 1. 
The method will call 'TProtocolVersion.findValue()' on the thrift protocol's byte stream, which returns null if the client is sending an enum value unknown to the server (v4 is unknown to the server). 2. The method will then call struct.validate(), which will throw the above exception because of the null version. So it doesn't look like the current backward-compatibility scheme will work. -- This message was sent by Atlassian JIRA (v6.2#6252)
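The incompatibility can be modeled in a few lines of Python (a toy model of the enum lookup, not Thrift's generated code): mapping the wire value through the server's enum table yields null for any version the server predates, whereas negotiating on plain integers would degrade gracefully.

```python
# Toy model: an old server's TProtocolVersion table knows only V1.
SERVER_VERSIONS = {0: "HIVE_CLI_SERVICE_PROTOCOL_V1"}
CLIENT_PROTOCOL_V4 = 3  # wire value a newer client sends (illustrative)

def find_value(v):
    # mirrors the shape of TProtocolVersion.findValue(): None when unknown
    return SERVER_VERSIONS.get(v)

received = find_value(CLIENT_PROTOCOL_V4)
print(received)  # None -> struct.validate() fails: "client_protocol is unset"

# Negotiating on bare integers instead falls back to a common version:
negotiated = min(CLIENT_PROTOCOL_V4, max(SERVER_VERSIONS))
print(negotiated)  # 0 -- both sides agree on the oldest mutual protocol
```

This is why the enum-based handshake cannot be backward compatible as generated: the unknown value is erased to null before any version comparison can happen.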
[jira] [Updated] (HIVE-2843) UDAF to convert an aggregation to a map
[ https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay Ratnagiri updated HIVE-2843: -- Labels: UDAF features udf (was: features udf) UDAF to convert an aggregation to a map --- Key: HIVE-2843 URL: https://issues.apache.org/jira/browse/HIVE-2843 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.9.0, 0.10.0 Reporter: David Worms Priority: Minor Labels: UDAF, features, udf Attachments: HIVE-2843.1.patch.txt, HIVE-2843.D8745.1.patch, hive-2843-dev.git.patch I propose the addition of two new Hive UDAFs to help with maps in Apache Hive. The source code is available on GitHub at https://github.com/wdavidw/hive-udf in two Java classes: UDAFToMap and UDAFToOrderedMap. The first function converts an aggregation into a map and internally uses a Java `HashMap`. The second function extends the first one: it converts an aggregation into an ordered map and internally uses a Java `TreeMap`. They both extend the `AbstractGenericUDAFResolver` class. Also, I have covered the motivations and usages of these UDAFs in a blog post at http://adaltas.com/blog/2012/03/06/hive-udaf-map-conversion/ The full patch is available with tests as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
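The semantics of the two proposed UDAFs are simple to state in Python (an analogue of the aggregation behavior, not the Java implementation): collect a group's (key, value) pairs into a map, with the ordered variant keeping keys sorted the way a Java TreeMap would.

```python
def to_map(pairs):
    # UDAFToMap analogue: plain map of the group's pairs (HashMap-style)
    return {k: v for k, v in pairs}

def to_ordered_map(pairs):
    # UDAFToOrderedMap analogue: keys kept in sorted order (TreeMap-style)
    return dict(sorted(pairs))

rows = [("b", 2), ("a", 1), ("c", 3)]  # one group's (key, value) pairs
print(to_map(rows))
print(to_ordered_map(rows))  # {'a': 1, 'b': 2, 'c': 3}
```

In a query this would look like select gid, to_map(k, v) from t group by gid, with each group's pairs folded into one map value.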
[jira] [Commented] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization
[ https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051489#comment-14051489 ] Hive QA commented on HIVE-7205: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653795/HIVE-7205.3.patch.txt {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5658 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/671/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/671/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-671/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12653795 Wrong results when union all of grouping followed by group by with correlation optimization --- Key: HIVE-7205 URL: https://issues.apache.org/jira/browse/HIVE-7205 Project: Hive Issue Type: Bug Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: dima machlin Assignee: Navis Priority: Critical Attachments: HIVE-7205.1.patch.txt, HIVE-7205.2.patch.txt, HIVE-7205.3.patch.txt use case : table TBL (a string,b string) contains single row : 'a','a' the following query : {code:sql} select b, sum(cc) from ( select b,count(1) as cc from TBL group by b union all select a as b,count(1) as cc from TBL group by a ) z group by b {code} returns a 1 a 1 while set hive.optimize.correlation=true; if we change set hive.optimize.correlation=false; it returns correct results : a 2 The plan with correlation optimization : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias - Map Operator Tree: null-subquery1:z-subquery1:TBL TableScan alias: TBL Select Operator expressions: expr: b type: string outputColumnNames: b Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: b type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: 
expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col1 type: bigint null-subquery2:z-subquery2:TBL TableScan alias: TBL Select Operator expressions: expr: a type: string outputColumnNames: a Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: a type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0
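For reference, the semantics the query should have can be modeled in a few lines of Python (a hand simulation of the two-level group-by, not the Hive plan above):

```python
from collections import Counter

# Single-row table TBL(a, b) containing ('a', 'a'), as in the report.
tbl = [("a", "a")]

# Inner subqueries: count(1) grouped by b, unioned with count(1) grouped by a.
counts_by_b = Counter(b for _, b in tbl)
counts_by_a = Counter(a for a, _ in tbl)
union_all = list(counts_by_b.items()) + list(counts_by_a.items())

# Outer query: sum(cc) grouped by b over the union.
result = Counter()
for key, cc in union_all:
    result[key] += cc

print(dict(result))  # {'a': 2}, the expected answer; the bug yields two 'a 1' rows
```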
[jira] [Commented] (HIVE-6584) Add HiveHBaseTableSnapshotInputFormat
[ https://issues.apache.org/jira/browse/HIVE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051499#comment-14051499 ] Hive QA commented on HIVE-6584: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653797/HIVE-6584.7.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/672/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/672/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-672/ Messages: {noformat} This message was trimmed, see log for full details [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-contrib --- [INFO] Compiling 39 source files to /data/hive-ptest/working/apache-svn-trunk-source/contrib/target/classes [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/contrib/src/java/org/apache/hadoop/hive/contrib/udaf/example/UDAFExampleMax.java: Some input files use or override a deprecated API. [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/contrib/src/java/org/apache/hadoop/hive/contrib/udaf/example/UDAFExampleMax.java: Recompile with -Xlint:deprecation for details. [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/contrib/src/java/org/apache/hadoop/hive/contrib/udf/example/UDFExampleStructPrint.java: /data/hive-ptest/working/apache-svn-trunk-source/contrib/src/java/org/apache/hadoop/hive/contrib/udf/example/UDFExampleStructPrint.java uses unchecked or unsafe operations. [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/contrib/src/java/org/apache/hadoop/hive/contrib/udf/example/UDFExampleStructPrint.java: Recompile with -Xlint:unchecked for details. 
[INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hive-contrib --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/contrib/src/test/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-contrib --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/contrib/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/contrib/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/contrib/target/tmp/conf [copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/contrib/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-contrib --- [INFO] Compiling 2 source files to /data/hive-ptest/working/apache-svn-trunk-source/contrib/target/test-classes [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestRegexSerDe.java: /data/hive-ptest/working/apache-svn-trunk-source/contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestRegexSerDe.java uses or overrides a deprecated API. [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestRegexSerDe.java: Recompile with -Xlint:deprecation for details. [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-contrib --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-contrib --- [INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/contrib/target/hive-contrib-0.14.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-contrib --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-contrib --- [INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/contrib/target/hive-contrib-0.14.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-contrib/0.14.0-SNAPSHOT/hive-contrib-0.14.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/contrib/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-contrib/0.14.0-SNAPSHOT/hive-contrib-0.14.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive HBase Handler 0.14.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-hbase-handler --- [INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-hbase-handler --- [INFO]
[jira] [Commented] (HIVE-6694) Beeline should provide a way to execute shell command as Hive CLI does
[ https://issues.apache.org/jira/browse/HIVE-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051621#comment-14051621 ] Brock Noland commented on HIVE-6694: +1 Beeline should provide a way to execute shell command as Hive CLI does -- Key: HIVE-6694 URL: https://issues.apache.org/jira/browse/HIVE-6694 Project: Hive Issue Type: Improvement Components: CLI, Clients Affects Versions: 0.11.0, 0.12.0, 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.14.0 Attachments: HIVE-6694.1.patch, HIVE-6694.1.patch, HIVE-6694.2.patch, HIVE-6694.3.patch, HIVE-6694.4.patch, HIVE-6694.5.patch, HIVE-6694.patch Hive CLI allows a user to execute a shell command using ! notation. For instance, !cat myfile.txt. Being able to execute shell commands may be important for some users. As a replacement, however, Beeline provides no such capability, possibly because the ! notation is reserved for SQLLine commands. It's possible to provide this using a slight syntactic variation such as !sh cat myfile.txt. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7347) Pig Query with defined schema fails when submitted via WebHcat -Query parameter
Azim Uddin created HIVE-7347: Summary: Pig Query with defined schema fails when submitted via WebHcat -Query parameter Key: HIVE-7347 URL: https://issues.apache.org/jira/browse/HIVE-7347 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0, 0.12.0 Environment: HDP 2.1 on Windows; HDInsight deploying HDP 2.1 Reporter: Azim Uddin 1. Consider you are using HDP 2.1 on Windows, and you have a tsv file (named rawInput.tsv) like this (just an example, you can use any) - http://a.com http://b.com 1 http://b.com http://c.com 2 http://d.com http://e.com 3 2. With the tsv file uploaded to HDFS, run the following Pig job via WebHcat using the 'execute' parameter, something like this- curl.exe -d execute=rawInput = load '/test/data' using PigStorage as (SourceUrl:chararray, DestinationUrl:chararray, InstanceCount:int); readyInput = limit rawInput 10; store readyInput into '/test/output' using PigStorage; -d statusdir=/test/status http://localhost:50111/templeton/v1/pig?user.name=hadoop; --user hadoop:any The job fails with exit code 255 - [main] org.apache.hive.hcatalog.templeton.tool.LaunchMapper: templeton: job failed with exit code 255 From stderr, we see the following: readyInput was unexpected at this time. 3. The same job works via the Pig Grunt shell, and also if we use the WebHcat 'file' parameter instead of the 'execute' parameter - a. Create a pig script called pig-script.txt with the query below and put it in HDFS at /test/script rawInput = load '/test/data' using PigStorage as (SourceUrl:chararray, DestinationUrl:chararray, InstanceCount:int); readyInput = limit rawInput 10; store readyInput into '/test/Output' using PigStorage; b. Run the job via WebHcat: curl.exe -d file=/test/script/pig_script.txt -d statusdir=/test/status http://localhost:50111/templeton/v1/pig?user.name=hadoop; --user hadoop:any 4. 
Also, WebHcat 'execute' option works if we don't define the schema in the Pig query, something like this- curl.exe -d execute=rawInput = load '/test/data' using PigStorage; readyInput = limit rawInput 10; store readyInput into '/test/output' using PigStorage; -d statusdir=/test/status http://localhost:50111/templeton/v1/pig?user.name=hadoop; --user hadoop:any Ask is- WebHcat 'execute' option should work for Pig query with schema defined - it appears to be a parsing issue with WebHcat. -- This message was sent by Atlassian JIRA (v6.2#6252)
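The stderr message "readyInput was unexpected at this time" is a Windows cmd.exe error, which suggests the unquoted script text is being split at its semicolons before it ever reaches Templeton; that diagnosis is an assumption. A minimal Python sketch of what a correctly form-encoded 'execute' field looks like (field names taken from the report above):

```python
from urllib.parse import urlencode

# The Pig script from the report: parentheses, colons, and semicolons must
# all reach the 'execute' form field intact.
script = ("rawInput = load '/test/data' using PigStorage as "
          "(SourceUrl:chararray, DestinationUrl:chararray, InstanceCount:int); "
          "readyInput = limit rawInput 10; "
          "store readyInput into '/test/output' using PigStorage;")

body = urlencode({"execute": script,
                  "statusdir": "/test/status",
                  "user.name": "hadoop"})

# After encoding, no raw ';' remains for a shell or parser to mis-split on.
print(";" in body, "%3B" in body)  # prints "False True"
```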
[jira] [Commented] (HIVE-7257) UDF format_number() does not work on FLOAT types
[ https://issues.apache.org/jira/browse/HIVE-7257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051774#comment-14051774 ] Szehon Ho commented on HIVE-7257: - Committed to trunk. Thanks Wilbur for the contribution! UDF format_number() does not work on FLOAT types Key: HIVE-7257 URL: https://issues.apache.org/jira/browse/HIVE-7257 Project: Hive Issue Type: Bug Reporter: Wilbur Yang Assignee: Wilbur Yang Fix For: 0.14.0 Attachments: HIVE-7257.1.patch #1 Show the table: hive> describe ssga3; OK source string test float dt timestamp Time taken: 0.243 seconds #2 Run format_number on double and it works: hive> select format_number(cast(test as double),2) from ssga3; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201403131616_0009, Tracking URL = http://cdh5-1:50030/jobdetails.jsp?jobid=job_201403131616_0009 Kill Command = /opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop/bin/hadoop job -kill job_201403131616_0009 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0 2014-03-13 17:14:53,992 Stage-1 map = 0%, reduce = 0% 2014-03-13 17:14:59,032 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 sec 2014-03-13 17:15:00,046 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 sec 2014-03-13 17:15:01,056 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 sec 2014-03-13 17:15:02,067 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.47 sec MapReduce Total cumulative CPU time: 1 seconds 470 msec Ended Job = job_201403131616_0009 MapReduce Jobs Launched: Job 0: Map: 1 Cumulative CPU: 1.47 sec HDFS Read: 299 HDFS Write: 10 SUCCESS Total MapReduce CPU Time Spent: 1 seconds 470 msec OK 1.00 2.00 Time taken: 16.563 seconds #3 Run format_number on float and it does not work: hive> select format_number(test,2) from ssga3; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no 
reduce operator Starting Job = job_201403131616_0010, Tracking URL = http://cdh5-1:50030/jobdetails.jsp?jobid=job_201403131616_0010 Kill Command = /opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop/bin/hadoop job -kill job_201403131616_0010 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0 2014-03-13 17:20:21,158 Stage-1 map = 0%, reduce = 0% 2014-03-13 17:21:00,453 Stage-1 map = 100%, reduce = 100% Ended Job = job_201403131616_0010 with errors Error during job, obtaining debugging information... Job Tracking URL: http://cdh5-1:50030/jobdetails.jsp?jobid=job_201403131616_0010 Examining task ID: task_201403131616_0010_m_02 (and more) from job job_201403131616_0010 Unable to retrieve URL for Hadoop Task logs. Does not contain a valid host:port authority: logicaljt Task with the most failures(4): Task ID: task_201403131616_0010_m_00 Diagnostic Messages for this Task: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {source:null,test:1.0,dt:null} at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) at org.apache.hadoop.mapred.Child$4.run(Child.java:268) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438) at org.apache.hadoop.mapred.Child.main(Child.java:262) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {source:null,test:1.0,dt:null} at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:675) at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:141) .. 
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask MapReduce Jobs Launched: Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL Total MapReduce CPU Time Spent: 0 msec -- This message was sent by Atlassian JIRA (v6.2#6252)
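The cast-to-DOUBLE workaround in step #2 suggests that only the UDF's input type handling rejects FLOAT, while the formatting logic itself is sound (that reading is an assumption). A rough Python sketch of the output format_number is expected to produce, not the Hive implementation:

```python
def format_number(value, d):
    # Approximation of Hive's format_number() output: thousands
    # separators plus a fixed number of decimal places.
    # Illustrative sketch only, not the actual UDF code.
    return f"{float(value):,.{d}f}"

print(format_number(1.0, 2))          # 1.00, matching the DOUBLE run above
print(format_number(1234567.891, 2))  # 1,234,567.89
```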
[jira] [Updated] (HIVE-7257) UDF format_number() does not work on FLOAT types
[ https://issues.apache.org/jira/browse/HIVE-7257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-7257: Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) UDF format_number() does not work on FLOAT types Key: HIVE-7257 URL: https://issues.apache.org/jira/browse/HIVE-7257 Project: Hive Issue Type: Bug Reporter: Wilbur Yang Assignee: Wilbur Yang Fix For: 0.14.0 Attachments: HIVE-7257.1.patch #1 Show the table: hive> describe ssga3; OK source string test float dt timestamp Time taken: 0.243 seconds #2 Run format_number on double and it works: hive> select format_number(cast(test as double),2) from ssga3; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201403131616_0009, Tracking URL = http://cdh5-1:50030/jobdetails.jsp?jobid=job_201403131616_0009 Kill Command = /opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop/bin/hadoop job -kill job_201403131616_0009 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0 2014-03-13 17:14:53,992 Stage-1 map = 0%, reduce = 0% 2014-03-13 17:14:59,032 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 sec 2014-03-13 17:15:00,046 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 sec 2014-03-13 17:15:01,056 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 sec 2014-03-13 17:15:02,067 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.47 sec MapReduce Total cumulative CPU time: 1 seconds 470 msec Ended Job = job_201403131616_0009 MapReduce Jobs Launched: Job 0: Map: 1 Cumulative CPU: 1.47 sec HDFS Read: 299 HDFS Write: 10 SUCCESS Total MapReduce CPU Time Spent: 1 seconds 470 msec OK 1.00 2.00 Time taken: 16.563 seconds #3 Run format_number on float and it does not work: hive> select format_number(test,2) from ssga3; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job 
= job_201403131616_0010, Tracking URL = http://cdh5-1:50030/jobdetails.jsp?jobid=job_201403131616_0010 Kill Command = /opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop/bin/hadoop job -kill job_201403131616_0010 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0 2014-03-13 17:20:21,158 Stage-1 map = 0%, reduce = 0% 2014-03-13 17:21:00,453 Stage-1 map = 100%, reduce = 100% Ended Job = job_201403131616_0010 with errors Error during job, obtaining debugging information... Job Tracking URL: http://cdh5-1:50030/jobdetails.jsp?jobid=job_201403131616_0010 Examining task ID: task_201403131616_0010_m_02 (and more) from job job_201403131616_0010 Unable to retrieve URL for Hadoop Task logs. Does not contain a valid host:port authority: logicaljt Task with the most failures(4): Task ID: task_201403131616_0010_m_00 Diagnostic Messages for this Task: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {source:null,test:1.0,dt:null} at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) at org.apache.hadoop.mapred.Child$4.run(Child.java:268) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438) at org.apache.hadoop.mapred.Child.main(Child.java:262) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {source:null,test:1.0,dt:null} at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:675) at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:141) .. 
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask MapReduce Jobs Launched: Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL Total MapReduce CPU Time Spent: 0 msec -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6981) Remove old website from SVN
[ https://issues.apache.org/jira/browse/HIVE-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051794#comment-14051794 ] Szehon Ho commented on HIVE-6981: - Hi [~leftylev] and [~brocknoland], should we update the wiki at [https://cwiki.apache.org/confluence/display/Hive/HowToCommit#HowToCommit-CommittingDocumentation|https://cwiki.apache.org/confluence/display/Hive/HowToCommit#HowToCommit-CommittingDocumentation]? It refers to the removed svn site, and caused some confusion for me until I found this JIRA. I guess it can link to Brock's page about the new site: [https://cwiki.apache.org/confluence/display/Hive/How+to+edit+the+website|https://cwiki.apache.org/confluence/display/Hive/How+to+edit+the+website]. Remove old website from SVN --- Key: HIVE-6981 URL: https://issues.apache.org/jira/browse/HIVE-6981 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland Command to do removal: {noformat} svn delete https://svn.apache.org/repos/asf/hive/site/ --message HIVE-6981 - Remove old website from SVN {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4605) Hive job fails while closing reducer output - Unable to rename
[ https://issues.apache.org/jira/browse/HIVE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051872#comment-14051872 ] Arif Iqbal commented on HIVE-4605: -- I'm also seeing the exact same issue with hive-0.11; the exception message is attached below. It looks like the bug started happening when we started using 'create table as' in Hive. org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://ares-nn.vip.ebay.com:8020/tmp/abc/hive_2014-07-02_22-28-32_991_2129131884187129074/_task_tmp.-ext-10001/_tmp.00_0 to: hdfs://ares-nn.vip.ebay.com:8020/tmp/abc/hive_2014-07-02_22-28-32_991_2129131884187129074/_tmp.-ext-10001/00_0 at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:197) at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$300(FileSinkOperator.java:108) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:867) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:194) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at org.apache.hadoop.mapred.Child.main(Child.java:249) org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://ares-nn.vip.ebay.com:8020/tmp/abc/hive_2014-07-02_22-28-32_991_2129131884187129074/_task_tmp.-ext-10001/_tmp.00_0 to: 
hdfs://ares-nn.vip.ebay.com:8020/tmp/abc/hive_2014-07-02_22-28-32_991_2129131884187129074/_tmp.-ext-10001/00_0 at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:197) at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$300(FileSinkOperator.java:108) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:867) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597) at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:194) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at org.apache.hadoop.mapred.Child.main(Child.java:249) Hive job fails while closing reducer output - Unable to rename -- Key: HIVE-4605 URL: https://issues.apache.org/jira/browse/HIVE-4605 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Environment: OS: 2.6.18-194.el5xen #1 SMP Fri Apr 2 15:34:40 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux Hadoop 1.1.2 Reporter: Link Qian Assignee: Brock Noland 1, create a table with ORC storage model create table iparea_analysis_orc (network int, ip string, ) stored as ORC; 2, insert table iparea_analysis_orc select network, ip, , the script succeeds, but fails after adding the *OVERWRITE* keyword. The main error log is listed here. 
java.lang.RuntimeException: Hive Runtime Error while closing operators: Unable to rename output from: hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_task_tmp.-ext-1/_tmp.00_0 to: hdfs://qa3hop001.uucun.com:9000/tmp/hive-hadoop/hive_2013-05-24_15-11-06_511_7746839019590922068/_tmp.-ext-1/00_0 at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:317) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at
[jira] [Updated] (HIVE-7347) Pig Query with defined schema fails when submitted via WebHcat 'execute' parameter
[ https://issues.apache.org/jira/browse/HIVE-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Azim Uddin updated HIVE-7347: - Summary: Pig Query with defined schema fails when submitted via WebHcat 'execute' parameter (was: Pig Query with defined schema fails when submitted via WebHcat -Query parameter) Pig Query with defined schema fails when submitted via WebHcat 'execute' parameter -- Key: HIVE-7347 URL: https://issues.apache.org/jira/browse/HIVE-7347 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.13.0 Environment: HDP 2.1 on Windows; HDInsight deploying HDP 2.1 Reporter: Azim Uddin 1. Consider you are using HDP 2.1 on Windows, and you have a tsv file (named rawInput.tsv) like this (just an example, you can use any) - http://a.com http://b.com 1 http://b.com http://c.com 2 http://d.com http://e.com 3 2. With the tsv file uploaded to HDFS, run the following Pig job via WebHcat using the 'execute' parameter, something like this- curl.exe -d execute=rawInput = load '/test/data' using PigStorage as (SourceUrl:chararray, DestinationUrl:chararray, InstanceCount:int); readyInput = limit rawInput 10; store readyInput into '/test/output' using PigStorage; -d statusdir=/test/status http://localhost:50111/templeton/v1/pig?user.name=hadoop; --user hadoop:any The job fails with exit code 255 - [main] org.apache.hive.hcatalog.templeton.tool.LaunchMapper: templeton: job failed with exit code 255 From stderr, we see the following: readyInput was unexpected at this time. 3. The same job works via the Pig Grunt shell, and also if we use the WebHcat 'file' parameter instead of the 'execute' parameter - a. Create a pig script called pig-script.txt with the query below and put it in HDFS at /test/script rawInput = load '/test/data' using PigStorage as (SourceUrl:chararray, DestinationUrl:chararray, InstanceCount:int); readyInput = limit rawInput 10; store readyInput into '/test/Output' using PigStorage; b. 
Run the job via webHcat: curl.exe -d file=/test/script/pig_script.txt -d statusdir=/test/status http://localhost:50111/templeton/v1/pig?user.name=hadoop; --user hadoop:any 4. Also, WebHcat 'execute' option works if we don't define the schema in the Pig query, something like this- curl.exe -d execute=rawInput = load '/test/data' using PigStorage; readyInput = limit rawInput 10; store readyInput into '/test/output' using PigStorage; -d statusdir=/test/status http://localhost:50111/templeton/v1/pig?user.name=hadoop; --user hadoop:any Ask is- WebHcat 'execute' option should work for Pig query with schema defined - it appears to be a parsing issue with WebHcat. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7268) On Windows Hive jobs in Webhcat always run on default MR mode
[ https://issues.apache.org/jira/browse/HIVE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051932#comment-14051932 ] Sushanth Sowmyan commented on HIVE-7268: Looks good to me, +1. On Windows Hive jobs in Webhcat always run on default MR mode - Key: HIVE-7268 URL: https://issues.apache.org/jira/browse/HIVE-7268 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-7268.1.patch On Windows fix from HIVE-7065 doesn't work as the templeton.cmd script does not include the Hive configuration directory in the classpath. So when hive.execution.engine property is set to tez in HIVE_CONF_DIR/hive-site.xml, webhcat doesn't see it and defaults it to mr. This prevents Hive jobs running from WebHCat to use the tez execution engine. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7268) On Windows Hive jobs in Webhcat always run on default MR mode
[ https://issues.apache.org/jira/browse/HIVE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7268: --- Status: Patch Available (was: Open) On Windows Hive jobs in Webhcat always run on default MR mode - Key: HIVE-7268 URL: https://issues.apache.org/jira/browse/HIVE-7268 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-7268.1.patch On Windows fix from HIVE-7065 doesn't work as the templeton.cmd script does not include the Hive configuration directory in the classpath. So when hive.execution.engine property is set to tez in HIVE_CONF_DIR/hive-site.xml, webhcat doesn't see it and defaults it to mr. This prevents Hive jobs running from WebHCat to use the tez execution engine. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7268) On Windows Hive jobs in Webhcat always run on default MR mode
[ https://issues.apache.org/jira/browse/HIVE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7268: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks for the patch, Deepesh, and thanks for the review, Hari! On Windows Hive jobs in Webhcat always run on default MR mode - Key: HIVE-7268 URL: https://issues.apache.org/jira/browse/HIVE-7268 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-7268.1.patch On Windows fix from HIVE-7065 doesn't work as the templeton.cmd script does not include the Hive configuration directory in the classpath. So when hive.execution.engine property is set to tez in HIVE_CONF_DIR/hive-site.xml, webhcat doesn't see it and defaults it to mr. This prevents Hive jobs running from WebHCat to use the tez execution engine. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan reopened HIVE-7209: (Reopening because of the weird resolution state, intend to close again) allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch, HIVE-7209.3.patch, HIVE-7209.4.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7209: --- Resolution: Cannot Reproduce Status: Resolved (was: Patch Available) Committed patch 4. Thanks for the patch, Thejas! allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC14 Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch, HIVE-7209.3.patch, HIVE-7209.4.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7209: --- Fix Version/s: 0.14.0 allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch, HIVE-7209.3.patch, HIVE-7209.4.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7209: --- Status: Patch Available (was: Reopened) allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch, HIVE-7209.3.patch, HIVE-7209.4.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-7209: --- Resolution: Fixed Status: Resolved (was: Patch Available) allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch, HIVE-7209.3.patch, HIVE-7209.4.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7072) HCatLoader only loads first region of hbase table
[ https://issues.apache.org/jira/browse/HIVE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051947#comment-14051947 ] Sushanth Sowmyan commented on HIVE-7072: [~daijy], could you please review/commit the latest version of this patch? HCatLoader only loads first region of hbase table - Key: HIVE-7072 URL: https://issues.apache.org/jira/browse/HIVE-7072 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-7072.2.patch, HIVE-7072.3.patch Pig needs a config parameter 'pig.noSplitCombination' set to 'true' for it to be able to read HBaseStorageHandler-based tables. This is done in the HBaseLoader at getSplits time, but HCatLoader does not do so, which results in only a partial data load. Thus, we need one more special case definition in HCat, that sets this parameter in the job properties if we detect that we're loading a HBaseStorageHandler based table. (Note, also, that we should not depend directly on the HBaseStorageHandler class, and instead depend on the name of the class, since we do not want a mvn dependency on hive-hbase-handler to be able to compile HCatalog core, since it's conceivable that at some time, there might be a reverse dependency.) The primary issue is one of where this code should go, since it doesn't belong in pig (pig does not know what loader behaviour should be, and this parameter is its interface to a loader), and doesn't belong in the HBaseStorageHandler either, since that's implementing a HiveStorageHandler and is connecting up the two. Thus, this should belong to HCatLoader. Setting this parameter across the board results in poor performance for HCatLoader, so it must only be set when using with HBase. Thus, it belongs in the SpecialCases definition as that was created specifically for these kinds of odd cases, and can be called from within HCatLoader. -- This message was sent by Atlassian JIRA (v6.2#6252)
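The special-case behaviour described in HIVE-7072 can be sketched in a few lines. This is an illustrative reconstruction, not the actual HCatalog code: `addSpecialCases` is a hypothetical helper, and only the property name and the handler's class name come from the report.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the special case described above (hypothetical
// method; only pig.noSplitCombination and the handler class name are
// taken from the report).
public class SpecialCasesSketch {
    // Compared by name, not by Class object, so HCatalog core needs no
    // compile-time dependency on hive-hbase-handler.
    static final String HBASE_HANDLER =
        "org.apache.hadoop.hive.hbase.HBaseStorageHandler";

    static Map<String, String> addSpecialCases(String storageHandlerClass) {
        Map<String, String> jobProperties = new HashMap<>();
        if (HBASE_HANDLER.equals(storageHandlerClass)) {
            // Pig must not combine splits for HBase-backed tables, or only
            // the first region is read.
            jobProperties.put("pig.noSplitCombination", "true");
        }
        return jobProperties;
    }
}
```

Setting the flag only under this check avoids the across-the-board performance cost the issue mentions.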
[jira] [Commented] (HIVE-6586) Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos)
[ https://issues.apache.org/jira/browse/HIVE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052003#comment-14052003 ] Lefty Leverenz commented on HIVE-6586: -- HIVE-7209 changes the description of hive.security.metastore.authorization.manager in 0.14.0. Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos) --- Key: HIVE-6586 URL: https://issues.apache.org/jira/browse/HIVE-6586 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Lefty Leverenz Labels: TODOC14 HIVE-6037 puts the definitions of configuration parameters into the HiveConf.java file, but several recent jiras for release 0.13.0 introduce new parameters that aren't in HiveConf.java yet and some parameter definitions need to be altered for 0.13.0. This jira will patch HiveConf.java after HIVE-6037 gets committed. Also, four typos patched in HIVE-6582 need to be fixed in the new HiveConf.java. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052006#comment-14052006 ] Lefty Leverenz commented on HIVE-7209: -- For the record: The patch changes the description of *hive.security.metastore.authorization.manager* in hive-default.xml.template (see the release note for new functionality). I added a comment to HIVE-6586 so it won't get lost in the shuffle when HIVE-6037 changes HiveConf.java. allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch, HIVE-7209.3.patch, HIVE-7209.4.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 16747: Add file pruning into Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16747/ --- (Updated July 4, 2014, 12:13 a.m.) Review request for hive. Changes --- Fixed test failures Bugs: HIVE-1662 https://issues.apache.org/jira/browse/HIVE-1662 Repository: hive-git Description --- Now Hive supports a filename virtual column. If a filename filter is present in a query, Hive should add only the files that pass the filter to the input paths. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 0256ec9 itests/qtest/testconfiguration.properties 1462ecd itests/util/src/main/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPathName.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/Context.java abc4290 ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java a80feb9 ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 622ee45 ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 29d59a4 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 1095173 ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a9869f7 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 949bcfb ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 683618f ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java c3a83d4 ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 61cc874 ql/src/java/org/apache/hadoop/hive/ql/io/orc/ColumnStatisticsImpl.java 409de7c ql/src/java/org/apache/hadoop/hive/ql/metadata/FilePruningPredicateHandler.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStoragePredicateHandler.java 7d7c764 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/AbstractJoinTaskDispatcher.java 33ef581 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 5c6751c ql/src/java/org/apache/hadoop/hive/ql/parse/MapReduceCompiler.java 703c9d1 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 399f92a 
ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java f293c43 ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 9945dea ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java f3203bf ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 699b476 ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java e7db370 ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 41243fe ql/src/test/queries/clientpositive/file_pruning.q PRE-CREATION ql/src/test/results/clientnegative/index_compact_entry_limit.q.out 85614ca ql/src/test/results/clientnegative/index_compact_size_limit.q.out 7c6bb0a ql/src/test/results/clientpositive/file_pruning.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/file_pruning.q.out PRE-CREATION Diff: https://reviews.apache.org/r/16747/diff/ Testing --- Thanks, Navis Ryu
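The core idea of the review request can be sketched independently of Hive's internals. This is a minimal, hypothetical stand-in (the patch's real predicate handling lives in classes such as FilePruningPredicateHandler): given a filter on the filename virtual column, keep only matching files as job input paths.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Illustrative sketch of file pruning: apply the filename predicate
// before the files are added to the job's input paths, so non-matching
// files are never scanned.
public class FilePruningSketch {
    static List<String> pruneInputPaths(List<String> candidateFiles,
                                        Predicate<String> fileNameFilter) {
        return candidateFiles.stream()
                             .filter(fileNameFilter)
                             .collect(Collectors.toList());
    }
}
```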
Review Request 23270: Wrong results when union all of grouping followed by group by with correlation optimization
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23270/ --- Review request for hive. Bugs: HIVE-7205 https://issues.apache.org/jira/browse/HIVE-7205 Repository: hive-git Description --- use case : table TBL (a string,b string) contains single row : 'a','a' the following query : {code:sql} select b, sum(cc) from ( select b,count(1) as cc from TBL group by b union all select a as b,count(1) as cc from TBL group by a ) z group by b {code} returns a 1 a 1 while set hive.optimize.correlation=true; if we change set hive.optimize.correlation=false; it returns correct results : a 2 The plan with correlation optimization : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias - Map Operator Tree: null-subquery1:z-subquery1:TBL TableScan alias: TBL Select Operator expressions: expr: b type: string outputColumnNames: b Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: b type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col1 type: bigint 
null-subquery2:z-subquery2:TBL TableScan alias: TBL Select Operator expressions: expr: a type: string outputColumnNames: a Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: a type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Demux Operator Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Union Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Group By Operator aggregations: expr: sum(_col1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions:
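The expected semantics of the failing query can be shown with a tiny stand-in (plain Java, not Hive operator code): with one row ('a','a') in TBL, each branch of the union all emits ('a', 1), and the outer "group by b / sum(cc)" must merge them into ('a', 2). The correlation-optimized plan instead emits each branch's row separately, which is the wrong "a 1 / a 1" output reported above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Each map models one union-all branch's (key -> count) output; the
// outer group by must sum counts across branches.
public class UnionGroupBySketch {
    static Map<String, Long> outerGroupBy(Map<String, Long> branch1,
                                          Map<String, Long> branch2) {
        Map<String, Long> out = new LinkedHashMap<>(branch1);
        branch2.forEach((key, cc) -> out.merge(key, cc, Long::sum));
        return out;
    }
}
```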
Re: Review Request 23256: HIVE-7345: Beeline changes its prompt to reflect successful database connection even after failing to connect
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23256/#review47332 --- Ship it! Ship It! - Navis Ryu On July 3, 2014, 3:38 a.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23256/ --- (Updated July 3, 2014, 3:38 a.m.) Review request for hive. Bugs: HIVE-7345 https://issues.apache.org/jira/browse/HIVE-7345 Repository: hive-git Description --- HIVE-7345: Beeline changes its prompt to reflect successful database connection even after failing to connect Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 2f3350e79f6168b11c13c6b4f84128c9255e0383 beeline/src/java/org/apache/hive/beeline/DatabaseConnection.java 00b49afb72531a4c15d0239ba08b04faa229d262 Diff: https://reviews.apache.org/r/23256/diff/ Testing --- NA Thanks, Ashish Singh
[jira] [Commented] (HIVE-7343) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052041#comment-14052041 ] Navis commented on HIVE-7343: - [~gopalv] Could you verify your identity? Update committer list - Key: HIVE-7343 URL: https://issues.apache.org/jira/browse/HIVE-7343 Project: Hive Issue Type: Test Components: Documentation Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7343.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7345) Beeline changes its prompt to reflect successful database connection even after failing to connect
[ https://issues.apache.org/jira/browse/HIVE-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052042#comment-14052042 ] Navis commented on HIVE-7345: - +1 Beeline changes its prompt to reflect successful database connection even after failing to connect -- Key: HIVE-7345 URL: https://issues.apache.org/jira/browse/HIVE-7345 Project: Hive Issue Type: Bug Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: HIVE-7345.patch Beeline changes its prompt to reflect successful database connection even after failing to connect, which is misleading. {code} [asingh@e1118 tpcds]$ beeline -u jdbc:hive2://abclocalhost:1 hive scan complete in 5ms Connecting to jdbc:hive2://abclocalhost:1 Error: Invalid URL: jdbc:hive2://abclocalhost:1 (state=08S01,code=0) Beeline version 0.12.0-cdh5.1.0-SNAPSHOT by Apache Hive 0: jdbc:hive2://abclocalhost:1 show tables; No current connection {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
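The intent of the HIVE-7345 fix can be sketched as follows. This is illustrative only (Beeline's real prompt logic is in BeeLine.java and DatabaseConnection.java): the prompt should switch to the URL-based form only when the connection actually succeeded, not as soon as a connect was attempted.

```java
// Hypothetical helper capturing the fix's intent: key the prompt on
// connection success, not on the connect attempt.
public class PromptSketch {
    static String prompt(String url, boolean connected) {
        return connected ? "0: " + url + "> " : "beeline> ";
    }
}
```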
[jira] [Commented] (HIVE-7303) IllegalMonitorStateException when stmtHandle is null in HiveStatement
[ https://issues.apache.org/jira/browse/HIVE-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052051#comment-14052051 ] Navis commented on HIVE-7303: - [~brocknoland] Implemented just because it's possible. Rollback that part? I'm good. IllegalMonitorStateException when stmtHandle is null in HiveStatement - Key: HIVE-7303 URL: https://issues.apache.org/jira/browse/HIVE-7303 Project: Hive Issue Type: Bug Components: JDBC Reporter: Navis Attachments: HIVE-7303.1.patch.txt From http://www.mail-archive.com/dev@hive.apache.org/msg75617.html Unlock can be called even it's not locked in some situation. -- This message was sent by Atlassian JIRA (v6.2#6252)
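The defensive pattern the report implies can be shown with a hypothetical helper (not HiveStatement's actual code): `ReentrantLock.unlock()` throws IllegalMonitorStateException when the current thread does not hold the lock, so the unlock should be guarded.

```java
import java.util.concurrent.locks.ReentrantLock;

// Guarded unlock: returns true if the lock was actually released.
public class SafeUnlockSketch {
    static boolean safeUnlock(ReentrantLock lock) {
        if (lock.isHeldByCurrentThread()) {
            lock.unlock();
            return true;
        }
        return false; // not held by this thread; unlocking would throw
    }

    // lock once, then unlock twice: the second unlock is skipped safely
    static boolean demo() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        return safeUnlock(lock) && !safeUnlock(lock);
    }
}
```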
[jira] [Updated] (HIVE-7345) Beeline changes its prompt to reflect successful database connection even after failing to connect
[ https://issues.apache.org/jira/browse/HIVE-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Kumar Singh updated HIVE-7345: - Status: Patch Available (was: Open) Beeline changes its prompt to reflect successful database connection even after failing to connect -- Key: HIVE-7345 URL: https://issues.apache.org/jira/browse/HIVE-7345 Project: Hive Issue Type: Bug Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: HIVE-7345.patch Beeline changes its prompt to reflect successful database connection even after failing to connect, which is misleading. {code} [asingh@e1118 tpcds]$ beeline -u jdbc:hive2://abclocalhost:1 hive scan complete in 5ms Connecting to jdbc:hive2://abclocalhost:1 Error: Invalid URL: jdbc:hive2://abclocalhost:1 (state=08S01,code=0) Beeline version 0.12.0-cdh5.1.0-SNAPSHOT by Apache Hive 0: jdbc:hive2://abclocalhost:1 show tables; No current connection {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 23253: HIVE-7340: Beeline fails to read a query with comments correctly
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23253/ --- (Updated July 4, 2014, 1 a.m.) Review request for hive. Changes --- Attaching bug id. Bugs: HIVE-7340 https://issues.apache.org/jira/browse/HIVE-7340 Repository: hive-git Description --- HIVE-7340: Beeline fails to read a query with comments correctly Diffs - beeline/src/java/org/apache/hive/beeline/Commands.java 88a94d76a3750dcde31ff47913bf28b827b3b212 itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java 140c1bccedb9ef3c81e89026db44ce4b59150ef4 Diff: https://reviews.apache.org/r/23253/diff/ Testing --- Added unit tests. Thanks, Ashish Singh
[jira] [Updated] (HIVE-7340) Beeline fails to read a query with comments correctly.
[ https://issues.apache.org/jira/browse/HIVE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Kumar Singh updated HIVE-7340: - Status: Patch Available (was: Open) Beeline fails to read a query with comments correctly. --- Key: HIVE-7340 URL: https://issues.apache.org/jira/browse/HIVE-7340 Project: Hive Issue Type: Bug Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: HIVE-7340.patch Comment in the beginning of line works: 0: jdbc:hive2://localhost:1 select . . . . . . . . . . . . . . . . -- comment . . . . . . . . . . . . . . . . * from store . . . . . . . . . . . . . . . . limit 1; but, having comments not in the beginning ignores rest of the query. So, limit 1 is ignored here. 0: jdbc:hive2://localhost:1 select . . . . . . . . . . . . . . . . * from store -- comment . . . . . . . . . . . . . . . . limit 1; However, this is fine with Hive CLI. -- This message was sent by Atlassian JIRA (v6.2#6252)
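The handling the HIVE-7340 patch adds can be sketched like this (illustrative, not Beeline's exact code): strip the "--" comment tail from each input line before joining, so an inline comment no longer swallows the lines that follow it. A real implementation must also skip "--" inside quoted strings; this sketch deliberately ignores that case.

```java
// Naive line-comment stripping for a multi-line query buffer.
public class CommentStripSketch {
    static String stripLineComment(String line) {
        int i = line.indexOf("--");
        return i >= 0 ? line.substring(0, i) : line;
    }

    static String joinQuery(String[] lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(stripLineComment(line).trim()).append(' ');
        }
        return sb.toString().trim();
    }
}
```

With this handling, the report's second example keeps its `limit 1` instead of dropping it.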
[jira] [Assigned] (HIVE-7346) Wrong results caused by hive ppd under specific join condition
[ https://issues.apache.org/jira/browse/HIVE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis reassigned HIVE-7346: --- Assignee: Navis Wrong results caused by hive ppd under specific join condition -- Key: HIVE-7346 URL: https://issues.apache.org/jira/browse/HIVE-7346 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: dima machlin Assignee: Navis Assuming two tables : {code:sql} t1(id1 string,id2 string) , t2 (id string,d int) {code} t1 contains 1 row : 'a','a' t2 contains 1 row : 'a',2 The following query : {code:sql} select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 {code} Returns 0 rows as expected because t2.d = 2 Wrapping this query, like so : {code:sql} select * from ( select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 ) z where d1<>1 or d2<>1 {code} Adding another filter on the columns causes the plan to lack the =1 filter and return a single row - *Wrong Results*. The plan is : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF (TOK_TABNAME t1) a) (TOK_TABREF (TOK_TABNAME t2) b) (= (. (TOK_TABLE_OR_COL a) id1) (. (TOK_TABLE_OR_COL b) id))) (TOK_TABREF (TOK_TABNAME t2) c) (= (. (TOK_TABLE_OR_COL a) id2) (. (TOK_TABLE_OR_COL b) id (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) d) d1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL c) d) d2)) (TOK_WHERE (and (= (. 
(TOK_TABLE_OR_COL c) d) 1) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or (<> (TOK_TABLE_OR_COL d1) 1) (<> (TOK_TABLE_OR_COL d2) 1) STAGE DEPENDENCIES: Stage-7 is a root stage Stage-5 depends on stages: Stage-7 Stage-0 is a root stage STAGE PLANS: Stage: Stage-7 Map Reduce Local Work Alias -> Map Local Tables: z:b Fetch Operator limit: -1 z:c Fetch Operator limit: -1 Alias -> Map Local Operator Tree: z:b TableScan alias: b HashTable Sink Operator condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] Position of Big Table: 0 z:c TableScan alias: c HashTable Sink Operator condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] Position of Big Table: 0 Stage: Stage-5 Map Reduce Alias -> Map Operator Tree: z:a TableScan alias: a Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] outputColumnNames: _col0, _col1, _col4, _col5 Position of Big Table: 0 Filter Operator predicate: expr: (_col1 = _col4) type: boolean Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] outputColumnNames: _col1, _col4, _col5, _col9 Position of Big Table: 0 Filter Operator predicate: expr: ((_col1 <> 1) or (_col9 <> 1)) type: boolean Select Operator expressions: expr: _col4 type: string expr: _col5 type: string expr: _col1 type: int expr: _col9 type: int outputColumnNames: _col0, _col1, _col2, _col3 File Output Operator
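The wrong-result case above can be reproduced with a tiny stand-in for the reported data (t1 has one row 'a','a'; t2 has one row 'a',2, so b.d = c.d = 2; the comparison operators stripped from the description are assumed to be `<>`). The correct plan applies both the subquery's "b.d = 1 and c.d = 1" filter and the wrapper's "d1 <> 1 or d2 <> 1" filter, yielding 0 rows; the buggy plan drops the inner filter.

```java
// Plain-Java model of the two plans, not Hive operator code.
public class PpdSketch {
    static int correctCount(int d1, int d2) {
        boolean inner = (d1 == 1) && (d2 == 1); // filter inside subquery z
        boolean outer = (d1 != 1) || (d2 != 1); // filter on the wrapper
        return (inner && outer) ? 1 : 0;
    }

    static int buggyCount(int d1, int d2) {
        // pushdown lost the inner equality filter; only the disjunction runs
        return ((d1 != 1) || (d2 != 1)) ? 1 : 0;
    }
}
```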
[jira] [Updated] (HIVE-7346) Wrong results caused by hive ppd under specific join condition
[ https://issues.apache.org/jira/browse/HIVE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7346: Attachment: HIVE-7346.1.patch.txt Wrong results caused by hive ppd under specific join condition -- Key: HIVE-7346 URL: https://issues.apache.org/jira/browse/HIVE-7346 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: dima machlin Assignee: Navis Attachments: HIVE-7346.1.patch.txt Assuming two tables : {code:sql} t1(id1 string,id2 string) , t2 (id string,d int) {code} t1 contains 1 row : 'a','a' t2 contains 1 row : 'a',2 The following query : {code:sql} select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 {code} Returns 0 rows as expected because t2.d = 2 Wrapping this query, like so : {code:sql} select * from ( select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 ) z where d1<>1 or d2<>1 {code} Adding another filter on the columns causes the plan to lack the =1 filter and return a single row - *Wrong Results*. The plan is : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF (TOK_TABNAME t1) a) (TOK_TABREF (TOK_TABNAME t2) b) (= (. (TOK_TABLE_OR_COL a) id1) (. (TOK_TABLE_OR_COL b) id))) (TOK_TABREF (TOK_TABNAME t2) c) (= (. (TOK_TABLE_OR_COL a) id2) (. (TOK_TABLE_OR_COL b) id (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) d) d1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL c) d) d2)) (TOK_WHERE (and (= (. 
(TOK_TABLE_OR_COL c) d) 1) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or (<> (TOK_TABLE_OR_COL d1) 1) (<> (TOK_TABLE_OR_COL d2) 1) STAGE DEPENDENCIES: Stage-7 is a root stage Stage-5 depends on stages: Stage-7 Stage-0 is a root stage STAGE PLANS: Stage: Stage-7 Map Reduce Local Work Alias -> Map Local Tables: z:b Fetch Operator limit: -1 z:c Fetch Operator limit: -1 Alias -> Map Local Operator Tree: z:b TableScan alias: b HashTable Sink Operator condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] Position of Big Table: 0 z:c TableScan alias: c HashTable Sink Operator condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] Position of Big Table: 0 Stage: Stage-5 Map Reduce Alias -> Map Operator Tree: z:a TableScan alias: a Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] outputColumnNames: _col0, _col1, _col4, _col5 Position of Big Table: 0 Filter Operator predicate: expr: (_col1 = _col4) type: boolean Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] outputColumnNames: _col1, _col4, _col5, _col9 Position of Big Table: 0 Filter Operator predicate: expr: ((_col1 <> 1) or (_col9 <> 1)) type: boolean Select Operator expressions: expr: _col4 type: string expr: _col5 type: string expr: _col1 type: int expr: _col9 type: int outputColumnNames: _col0, _col1, _col2, _col3
Review Request 23271: Wrong results caused by hive ppd under specific join condition
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23271/ --- Review request for hive. Bugs: HIVE-7346 https://issues.apache.org/jira/browse/HIVE-7346 Repository: hive-git Description --- Assuming two tables : {code:sql} t1(id1 string,id2 string) , t2 (id string,d int) {code} t1 contains 1 row : 'a','a' t2 contains 1 row : 'a',2 The following query : {code:sql} select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 {code} Returns 0 rows as expected because t2.d = 2 Wrapping this query, like so : {code:sql} select * from ( select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 ) z where d1<>1 or d2<>1 {code} Adding another filter on the columns causes the plan to lack the =1 filter and return a single row - *Wrong Results*. The plan is : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF (TOK_TABNAME t1) a) (TOK_TABREF (TOK_TABNAME t2) b) (= (. (TOK_TABLE_OR_COL a) id1) (. (TOK_TABLE_OR_COL b) id))) (TOK_TABREF (TOK_TABNAME t2) c) (= (. (TOK_TABLE_OR_COL a) id2) (. (TOK_TABLE_OR_COL b) id (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) d) d1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL c) d) d2)) (TOK_WHERE (and (= (. 
(TOK_TABLE_OR_COL c) d) 1) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or (<> (TOK_TABLE_OR_COL d1) 1) (<> (TOK_TABLE_OR_COL d2) 1) STAGE DEPENDENCIES: Stage-7 is a root stage Stage-5 depends on stages: Stage-7 Stage-0 is a root stage STAGE PLANS: Stage: Stage-7 Map Reduce Local Work Alias -> Map Local Tables: z:b Fetch Operator limit: -1 z:c Fetch Operator limit: -1 Alias -> Map Local Operator Tree: z:b TableScan alias: b HashTable Sink Operator condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] Position of Big Table: 0 z:c TableScan alias: c HashTable Sink Operator condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] Position of Big Table: 0 Stage: Stage-5 Map Reduce Alias -> Map Operator Tree: z:a TableScan alias: a Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] outputColumnNames: _col0, _col1, _col4, _col5 Position of Big Table: 0 Filter Operator predicate: expr: (_col1 = _col4) type: boolean Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] outputColumnNames: _col1, _col4, _col5, _col9 Position of Big Table: 0 Filter Operator predicate: expr: ((_col1 <> 1) or (_col9 <> 1)) type: boolean Select Operator expressions: expr: _col4 type: string expr: _col5 type: string expr: _col1 type: int expr: _col9 type: int outputColumnNames: _col0, _col1, _col2, _col3 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Local Work: Map Reduce Local Work Stage:
[jira] [Updated] (HIVE-7346) Wrong results caused by hive ppd under specific join condition
[ https://issues.apache.org/jira/browse/HIVE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-7346:
    Affects Version/s: 0.13.0
                       0.13.1
    Status: Patch Available (was: Open)

Wrong results caused by hive ppd under specific join condition
--
Key: HIVE-7346
URL: https://issues.apache.org/jira/browse/HIVE-7346
Project: Hive
Issue Type: Bug
Affects Versions: 0.13.1, 0.13.0, 0.12.0
Reporter: dima machlin
Assignee: Navis
Attachments: HIVE-7346.1.patch.txt
[jira] [Commented] (HIVE-7345) Beeline changes its prompt to reflect successful database connection even after failing to connect
[ https://issues.apache.org/jira/browse/HIVE-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052090#comment-14052090 ] Hive QA commented on HIVE-7345: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653750/HIVE-7345.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5691 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/673/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/673/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-673/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653750 Beeline changes its prompt to reflect successful database connection even after failing to connect -- Key: HIVE-7345 URL: https://issues.apache.org/jira/browse/HIVE-7345 Project: Hive Issue Type: Bug Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: HIVE-7345.patch Beeline changes its prompt to reflect successful database connection even after failing to connect, which is misleading. 
{code} [asingh@e1118 tpcds]$ beeline -u jdbc:hive2://abclocalhost:1 hive scan complete in 5ms Connecting to jdbc:hive2://abclocalhost:1 Error: Invalid URL: jdbc:hive2://abclocalhost:1 (state=08S01,code=0) Beeline version 0.12.0-cdh5.1.0-SNAPSHOT by Apache Hive 0: jdbc:hive2://abclocalhost:1 show tables; No current connection {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
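The expected behavior is that the prompt only changes after the connection actually succeeds. An illustrative Python sketch of that rule (hypothetical class and method names, not Beeline's actual code):

```python
class PromptedShell:
    """Hypothetical sketch: keep the default prompt unless a connection succeeds."""

    def __init__(self):
        self.prompt = "beeline> "
        self.connection = None

    def connect(self, url, dial):
        """dial(url) returns a connection object or raises on an invalid URL."""
        try:
            self.connection = dial(url)
        except ValueError as e:
            print("Error:", e)
            return  # connection failed: leave the prompt unchanged
        self.prompt = "0: %s> " % url  # only reflect the URL once connected


def bad_dial(url):
    raise ValueError("Invalid URL: " + url)

shell = PromptedShell()
shell.connect("jdbc:hive2://abclocalhost:1", bad_dial)
print(shell.prompt)  # still the default prompt, not "0: jdbc:hive2://..."
```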
[jira] [Updated] (HIVE-7325) Support non-constant expressions for MAP type indices.
[ https://issues.apache.org/jira/browse/HIVE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-7325:
    Assignee: Navis
    Status: Patch Available (was: Open)

Support non-constant expressions for MAP type indices.
--
Key: HIVE-7325
URL: https://issues.apache.org/jira/browse/HIVE-7325
Project: Hive
Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Mala Chikka Kempanna
Assignee: Navis
Fix For: 0.14.0
Attachments: HIVE-7325.1.patch.txt

Here is my sample:
{code}
CREATE TABLE RECORD(RecordID string, BatchDate string, Country string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country")
TBLPROPERTIES ("hbase.table.name" = "RECORD");

CREATE TABLE KEY_RECORD(KeyValue String, RecordId map<string,string>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,K:")
TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD");
{code}
The following join statement doesn't work:
{code}
SELECT a.*, b.* from KEY_RECORD a join RECORD b
WHERE a.RecordId[b.RecordID] is not null;
{code}
FAILED: SemanticException 2:16 Non-constant expression for map indexes not supported. Error encountered near token 'RecordID'

-- This message was sent by Atlassian JIRA (v6.2#6252)
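The rejected predicate indexes one table's map column with a value from the other table's row, which is just a keyed lookup evaluated per joined row pair. A Python analogue of the predicate (toy data, hypothetical values, not the HBase-backed tables from the report):

```python
# Toy analogue of: SELECT ... FROM KEY_RECORD a JOIN RECORD b
#                  WHERE a.RecordId[b.RecordID] IS NOT NULL
# a.RecordId plays the role of a map<string,string> column;
# b.RecordID is a per-row string key from the other table.
key_record = [{"r1": "x", "r2": "y"}, {"r3": "z"}]   # a.RecordId per row (toy data)
record_ids = ["r1", "r3", "r9"]                      # b.RecordID per row (toy data)

matches = [
    (a_map, rid)
    for a_map in key_record
    for rid in record_ids
    if a_map.get(rid) is not None   # non-constant map index: the key comes from the joined row
]
print(matches)  # [({'r1': 'x', 'r2': 'y'}, 'r1'), ({'r3': 'z'}, 'r3')]
```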
[jira] [Updated] (HIVE-7325) Support non-constant expressions for MAP type indices.
[ https://issues.apache.org/jira/browse/HIVE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-7325:
    Attachment: HIVE-7325.1.patch.txt

Support non-constant expressions for MAP type indices.
--
Key: HIVE-7325
URL: https://issues.apache.org/jira/browse/HIVE-7325
Project: Hive
Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Mala Chikka Kempanna
Fix For: 0.14.0
Attachments: HIVE-7325.1.patch.txt

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-1955) Support non-constant expressions for array indexes.
[ https://issues.apache.org/jira/browse/HIVE-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis reassigned HIVE-1955:
---
    Assignee: Navis

Support non-constant expressions for array indexes.
---
Key: HIVE-1955
URL: https://issues.apache.org/jira/browse/HIVE-1955
Project: Hive
Issue Type: Improvement
Reporter: Adam Kramer
Assignee: Navis

FAILED: Error in semantic analysis: line 4:8 Non Constant Expressions for Array Indexes not Supported

...just wrote my own UDF to do this, and it is trivial. We should support this natively. Let foo have these rows:

arr       i
[1,2,3]   1
[3,4,5]   2
[5,4,3]   2
[0,0,1]   0

Then, SELECT arr[i] FROM foo should return:

2
5
3
0

Similarly, for the same table, SELECT 3 IN arr FROM foo should return:

true
true
true
false

...these use cases are needless limitations of functionality. We shouldn't need UDFs to accomplish these goals.

-- This message was sent by Atlassian JIRA (v6.2#6252)
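The requested semantics are an ordinary per-row indexed lookup. A Python sketch over the same rows, using 0-based indexing as Hive arrays do (note that `[0,0,1][0]` is 0 under 0-based indexing):

```python
# Rows of foo from the issue: (arr, i)
foo = [([1, 2, 3], 1), ([3, 4, 5], 2), ([5, 4, 3], 2), ([0, 0, 1], 0)]

# SELECT arr[i] FROM foo -- index each row's array by that row's i
print([arr[i] for arr, i in foo])        # [2, 5, 3, 0]

# SELECT 3 IN arr FROM foo -- per-row membership test
print([3 in arr for arr, _ in foo])      # [True, True, True, False]
```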
[jira] [Commented] (HIVE-7340) Beeline fails to read a query with comments correctly.
[ https://issues.apache.org/jira/browse/HIVE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052120#comment-14052120 ] Hive QA commented on HIVE-7340: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653735/HIVE-7340.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5692 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/674/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/674/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-674/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653735 Beeline fails to read a query with comments correctly. --- Key: HIVE-7340 URL: https://issues.apache.org/jira/browse/HIVE-7340 Project: Hive Issue Type: Bug Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: HIVE-7340.patch Comment in the beginning of line works: 0: jdbc:hive2://localhost:1 select . . . . . . . . . . . . . . . . -- comment . . . . . . . . . . . . . . . . * from store . . . . . . . . . . . . . . . . limit 1; but, having comments not in the beginning ignores rest of the query. So, limit 1 is ignored here. 0: jdbc:hive2://localhost:1 select . . . . . . . . . . . . . . . . * from store -- comment . . . . . . . . . . . . . . . . 
limit 1; However, this is fine with Hive CLI. -- This message was sent by Atlassian JIRA (v6.2#6252)
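A minimal sketch of the expected line handling, assuming `--` starts a comment that runs to the end of the line. This naive version ignores the complication of `--` appearing inside string literals, which a real fix must handle:

```python
def strip_line_comment(line: str) -> str:
    """Drop everything from '--' to end of line (naive: ignores quoted strings)."""
    idx = line.find("--")
    return line if idx < 0 else line[:idx]

# The mid-line comment must not swallow the lines that follow it.
lines = ["select", "* from store -- comment", "limit 1;"]
query = " ".join(strip_line_comment(l).strip() for l in lines)
print(query)  # select * from store limit 1;
```

With this handling, `limit 1` survives whether the comment starts the line or follows the SQL text, matching the Hive CLI behavior the report describes.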
[jira] [Updated] (HIVE-7314) Wrong results of UDF when hive.cache.expr.evaluation is set
[ https://issues.apache.org/jira/browse/HIVE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7314: Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks Ashutosh, for the review! Wrong results of UDF when hive.cache.expr.evaluation is set --- Key: HIVE-7314 URL: https://issues.apache.org/jira/browse/HIVE-7314 Project: Hive Issue Type: Bug Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: dima machlin Assignee: Navis Fix For: 0.14.0 Attachments: HIVE-7314.1.patch.txt It seems that the expression caching doesn't work when using UDF inside another UDF or a hive function. For example : tbl has one row : 'a','b' The following query : {code:sql} select concat(custUDF(a),' ', custUDF(b)) from tbl; {code} returns 'a a' seems to cache custUDF(a) and use it for custUDF(b). Same query without the concat works fine. Replacing the concat with another custom UDF also returns 'a a' -- This message was sent by Atlassian JIRA (v6.2#6252)
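An illustrative analogy (not Hive's actual caching code) for the failure mode: if the expression cache's key ignores the UDF's arguments, the second call reuses the first call's result, producing the reported 'a a' instead of 'a b'.

```python
def identity_udf(x):
    """Stand-in for the custom UDF in the report (hypothetical: returns its input)."""
    return x

def eval_with_cache(calls, key_includes_args):
    """Evaluate (udf_name, arg) calls through a result cache."""
    cache = {}
    out = []
    for name, arg in calls:
        # Buggy variant: the key drops the argument, so distinct calls collide.
        key = (name, arg) if key_includes_args else name
        if key not in cache:
            cache[key] = identity_udf(arg)
        out.append(cache[key])
    return " ".join(out)

calls = [("custUDF", "a"), ("custUDF", "b")]
print(eval_with_cache(calls, key_includes_args=False))  # a a  (stale cache hit)
print(eval_with_cache(calls, key_includes_args=True))   # a b  (correct)
```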
[jira] [Commented] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates
[ https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052146#comment-14052146 ]

Navis commented on HIVE-7326:
-
[~hsubramaniyan] Sorry, I didn't notice that this was already assigned.

Hive complains invalid column reference with 'having' aggregate predicates
--
Key: HIVE-7326
URL: https://issues.apache.org/jira/browse/HIVE-7326
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Attachments: HIVE-7326.1.patch.txt

CREATE TABLE TestV1_Staples ( Item_Count INT, Ship_Priority STRING, Order_Priority STRING, Order_Status STRING, Order_Quantity DOUBLE, Sales_Total DOUBLE, Discount DOUBLE, Tax_Rate DOUBLE, Ship_Mode STRING, Fill_Time DOUBLE, Gross_Profit DOUBLE, Price DOUBLE, Ship_Handle_Cost DOUBLE, Employee_Name STRING, Employee_Dept STRING, Manager_Name STRING, Employee_Yrs_Exp DOUBLE, Employee_Salary DOUBLE, Customer_Name STRING, Customer_State STRING, Call_Center_Region STRING, Customer_Balance DOUBLE, Customer_Segment STRING, Prod_Type1 STRING, Prod_Type2 STRING, Prod_Type3 STRING, Prod_Type4 STRING, Product_Name STRING, Product_Container STRING, Ship_Promo STRING, Supplier_Name STRING, Supplier_Balance DOUBLE, Supplier_Region STRING, Supplier_State STRING, Order_ID STRING, Order_Year INT, Order_Month INT, Order_Day INT, Order_Date_ STRING, Order_Quarter STRING, Product_Base_Margin DOUBLE, Product_ID STRING, Receive_Time DOUBLE, Received_Date_ STRING, Ship_Date_ STRING, Ship_Charge DOUBLE, Total_Cycle_Time DOUBLE, Product_In_Stock STRING, PID INT, Market_Segment STRING );

Query that works:
SELECT customer_name, SUM(customer_balance), SUM(order_quantity)
FROM default.testv1_staples s1
GROUP BY customer_name
HAVING ( (COUNT(s1.discount) = 822) AND (SUM(customer_balance) = 4074689.00041) );

Query that fails:
SELECT customer_name, SUM(customer_balance), SUM(order_quantity)
FROM default.testv1_staples s1
GROUP BY customer_name
HAVING ( (SUM(customer_balance) = 4074689.00041) AND (COUNT(s1.discount) = 822) );

-- This message was sent by Atlassian JIRA (v6.2#6252)
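Since AND is commutative, reordering the HAVING conjuncts cannot change which groups qualify, so both queries above should be accepted and return the same rows. A toy Python sketch (hypothetical data, simplified aggregates) of the intended semantics:

```python
# Toy rows: (customer_name, customer_balance, discount)
rows = [("alice", 10.0, 1), ("alice", 20.0, 2), ("bob", 5.0, 1)]

# GROUP BY customer_name, accumulating SUM(customer_balance) and COUNT(discount)
groups = {}
for name, bal, disc in rows:
    g = groups.setdefault(name, {"sum_bal": 0.0, "cnt_disc": 0})
    g["sum_bal"] += bal
    g["cnt_disc"] += 1

# The same HAVING predicate, with the conjuncts in both orders
order1 = {n for n, g in groups.items() if g["cnt_disc"] == 2 and g["sum_bal"] == 30.0}
order2 = {n for n, g in groups.items() if g["sum_bal"] == 30.0 and g["cnt_disc"] == 2}
print(order1 == order2, order1)  # True {'alice'}
```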
[jira] [Updated] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates
[ https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-7326:
    Status: Patch Available (was: Open)

Hive complains invalid column reference with 'having' aggregate predicates
--
Key: HIVE-7326
URL: https://issues.apache.org/jira/browse/HIVE-7326
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Attachments: HIVE-7326.1.patch.txt

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates
[ https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-7326:
    Attachment: HIVE-7326.1.patch.txt

Hive complains invalid column reference with 'having' aggregate predicates
--
Key: HIVE-7326
URL: https://issues.apache.org/jira/browse/HIVE-7326
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Attachments: HIVE-7326.1.patch.txt

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7343) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-7343: -- Attachment: (was: HIVE-7343.2.patch) Update committer list - Key: HIVE-7343 URL: https://issues.apache.org/jira/browse/HIVE-7343 Project: Hive Issue Type: Test Components: Documentation Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7343.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7343) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-7343: -- Attachment: HIVE-7343.2.patch I can confirm that's the right name and org information. But the mark-down file threw warnings for me. {code} line 258 column 1 - Warning: missing tr line 272 column 1 - Warning: missing tr line 338 column 1 - Warning: missing tr line 344 column 1 - Warning: missing tr line 350 column 1 - Warning: missing tr {code} I have fixed these warnings as well, in attached patch. Update committer list - Key: HIVE-7343 URL: https://issues.apache.org/jira/browse/HIVE-7343 Project: Hive Issue Type: Test Components: Documentation Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7343.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7343) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-7343: -- Attachment: HIVE-7343.2.patch Update committer list - Key: HIVE-7343 URL: https://issues.apache.org/jira/browse/HIVE-7343 Project: Hive Issue Type: Test Components: Documentation Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7343.2.patch, HIVE-7343.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7345) Beeline changes its prompt to reflect successful database connection even after failing to connect
[ https://issues.apache.org/jira/browse/HIVE-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052150#comment-14052150 ] Ashish Kumar Singh commented on HIVE-7345: -- [~navis] thanks for reviewing the fix. The test errors do not look related to the fix. Beeline changes its prompt to reflect successful database connection even after failing to connect -- Key: HIVE-7345 URL: https://issues.apache.org/jira/browse/HIVE-7345 Project: Hive Issue Type: Bug Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: HIVE-7345.patch Beeline changes its prompt to reflect successful database connection even after failing to connect, which is misleading. {code} [asingh@e1118 tpcds]$ beeline -u jdbc:hive2://abclocalhost:1 hive scan complete in 5ms Connecting to jdbc:hive2://abclocalhost:1 Error: Invalid URL: jdbc:hive2://abclocalhost:1 (state=08S01,code=0) Beeline version 0.12.0-cdh5.1.0-SNAPSHOT by Apache Hive 0: jdbc:hive2://abclocalhost:1 show tables; No current connection {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7340) Beeline fails to read a query with comments correctly.
[ https://issues.apache.org/jira/browse/HIVE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052152#comment-14052152 ] Ashish Kumar Singh commented on HIVE-7340: -- The test errors do not look related to the fix. Beeline fails to read a query with comments correctly. --- Key: HIVE-7340 URL: https://issues.apache.org/jira/browse/HIVE-7340 Project: Hive Issue Type: Bug Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Attachments: HIVE-7340.patch Comment in the beginning of line works: 0: jdbc:hive2://localhost:1 select . . . . . . . . . . . . . . . . -- comment . . . . . . . . . . . . . . . . * from store . . . . . . . . . . . . . . . . limit 1; but, having comments not in the beginning ignores rest of the query. So, limit 1 is ignored here. 0: jdbc:hive2://localhost:1 select . . . . . . . . . . . . . . . . * from store -- comment . . . . . . . . . . . . . . . . limit 1; However, this is fine with Hive CLI. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7346) Wrong results caused by hive ppd under specific join condition
[ https://issues.apache.org/jira/browse/HIVE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052153#comment-14052153 ] Hive QA commented on HIVE-7346: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12654024/HIVE-7346.1.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5692 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/675/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/675/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-675/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12654024

Wrong results caused by hive ppd under specific join condition
--
Key: HIVE-7346
URL: https://issues.apache.org/jira/browse/HIVE-7346
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0, 0.13.1
Reporter: dima machlin
Assignee: Navis
Attachments: HIVE-7346.1.patch.txt