[jira] [Updated] (HIVE-4135) When using PostgreSQL as stats database ,there's a type cast bug
[ https://issues.apache.org/jira/browse/HIVE-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caofangkun updated HIVE-4135:
-
Attachment: HIVE-4135-1.patch

A quick fix

When using PostgreSQL as stats database ,there's a type cast bug
---
Key: HIVE-4135
URL: https://issues.apache.org/jira/browse/HIVE-4135
Project: Hive
Issue Type: Bug
Components: Statistics
Affects Versions: 0.9.0, 0.10.0
Reporter: caofangkun
Priority: Minor
Attachments: HIVE-4135-1.patch

When using PostgreSQL as the stats database, there's a type cast bug.

tasktracker log:
2013-03-01 16:03:08,973 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 17598 rows
2013-03-01 16:03:09,040 INFO org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Stats publishing for key dim_pub_date/00
2013-03-01 16:03:09,045 ERROR org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher: Error during publishing statistics.
org.postgresql.util.PSQLException: ERROR: column row_count is of type bigint but expression is of type character varying
  Hint: You will need to rewrite or cast the expression.
  Position: 126
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2101)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1834)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:510)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:386)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:332)
    at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher$2.run(JDBCStatsPublisher.java:136)
    at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher$2.run(JDBCStatsPublisher.java:133)
    at org.apache.hadoop.hive.ql.exec.Utilities.executeWithRetry(Utilities.java:2093)
    at org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher.publishStat(JDBCStatsPublisher.java:149)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.publishStats(TableScanOperator.java:260)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.closeOp(TableScanOperator.java:198)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:373)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child.main(Child.java:167)
2013-03-01 16:03:09,046 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: publishing : dim_pub_date/00 : {numRows=17598, rawDataSize=2626518}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
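The PSQLException in the HIVE-4135 log says that a character varying expression was supplied for the bigint column row_count. The contents of HIVE-4135-1.patch are not shown in this thread, so the following is only a minimal sketch of one common way to avoid this class of error: convert the statistic to a long and bind it with PreparedStatement.setLong instead of sending it as a string. The class StatBinding and its methods are hypothetical names used for illustration, not code from Hive or from the patch.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical helper; not taken from HIVE-4135-1.patch. */
final class StatBinding {
    private StatBinding() {}

    /** Parses a statistic such as "17598" into a long, rejecting non-numeric input. */
    static long toBigint(String rawValue) {
        if (rawValue == null) {
            throw new IllegalArgumentException("statistic value is null");
        }
        try {
            return Long.parseLong(rawValue.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a bigint value: " + rawValue, e);
        }
    }

    /**
     * Binds the statistic as a numeric parameter. Binding it with setString
     * would typically send a character varying value, which PostgreSQL does
     * not implicitly cast when assigning to a bigint column.
     */
    static void bindStat(PreparedStatement stmt, int index, String rawValue) throws SQLException {
        stmt.setLong(index, toBigint(rawValue));
    }
}
```

With a helper like this, a value such as numRows=17598 from the log would reach the database as a bigint parameter rather than as text.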
[jira] [Commented] (HIVE-4134) Facing issues while executing commands on hive shell. The system throws following error: only on Windows Cygwin setup
[ https://issues.apache.org/jira/browse/HIVE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595673#comment-13595673 ]

John Gordon commented on HIVE-4134:
---

This looks like you have a patch or are missing a patch -- Hive is trying to create a URI path through string concatenation instead of using the APIs in URI and Path. The patch on HIVE-3319 has a lot of one-line fixes for these kinds of issues, including this change:

-    String testFileDir = "file://"
-      + conf.get("test.data.files").replace('\\', '/').replace("c:", "");
+    String testFileDir = new Path(conf.get("test.data.files")).toUri().getPath();

Effectively, the fixed line uses Path.toUri() to initialize the full context needed, then takes the path component back out to get a usable serialization.

It could also be the temp path string literal in your hive-site.xml, which doesn't appear to be expressed in the way Cygwin would do it. If it runs fine under the exact same configuration outside of the Cygwin environment, then this is the case. You shouldn't need Cygwin to run at all.

Facing issues while executing commands on hive shell. The system throws following error: only on Windows Cygwin setup
--
Key: HIVE-4134
URL: https://issues.apache.org/jira/browse/HIVE-4134
Project: Hive
Issue Type: Task
Environment: hadoop1.1.1 cygwin hive-0.8.1
Reporter: Avinash Singh

When executing this command:

$ bin/hive -e 'SHOW TABLES'
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:/C:/cygwin/home/Administrator/hive-0.8.1/lib/hive-common-0.8.1.jar!/hive-log4j.properties
Hive history file=/tmp/Administrator/hive_job_log_Administrator_201303071139_1504829167.txt
Patch for HADOOP-7682: Instantiating workaround file system
FAILED: Hive Internal Error: java.lang.IllegalArgumentException(java.net.URISyntaxException: Relative path in absolute URI: file:C:/cygwin/tmp//Administrator/hive_2013-03-07_11-39-26_427_243610179348269546)
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:C:/cygwin/tmp//Administrator/hive_2013-03-07_11-39-26_427_243610179348269546
    at org.apache.hadoop.fs.Path.initialize(Path.java:148)
    at org.apache.hadoop.fs.Path.<init>(Path.java:132)
    at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:160)
    at org.apache.hadoop.hive.ql.Context.getLocalScratchDir(Context.java:187)
    at org.apache.hadoop.hive.ql.Context.getLocalTmpFileURI(Context.java:303)
    at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:227)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:338)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:637)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: file:C:/cygwin/tmp//Administrator/hive_2013-03-07_11-39-26_427_243610179348269546
    at java.net.URI.checkPath(URI.java:1788)
    at java.net.URI.<init>(URI.java:734)
    at org.apache.hadoop.fs.Path.initialize(Path.java:145)
    ... 20 more

I tried the bug fix described in https://issues.apache.org/jira/browse/HIVE-2388, but this also did not work:

bin/start.sh
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Exception in thread "main" java.lang.RuntimeException: Failed to initialize default Hive configuration variables! at
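The "Relative path in absolute URI" failure in the HIVE-4134 trace comes from java.net.URI's path check: when a scheme is given, the path must start with a slash. A minimal sketch that reproduces the check with the JDK alone is below. It assumes the scratch directory reaches a multi-argument java.net.URI constructor as a drive-letter path without a leading slash. ScratchDirUri and withLeadingSlash are hypothetical names for illustration and are not Hive's or Hadoop's fix.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical demo class; uses only java.net.URI from the JDK. */
final class ScratchDirUri {
    private ScratchDirUri() {}

    /** Builds a file: URI from a scheme and a path via the five-argument constructor. */
    static URI fileUri(String path) throws URISyntaxException {
        return new URI("file", null, path, null, null);
    }

    /** Returns true when java.net.URI refuses to build a file: URI for the path. */
    static boolean isRejected(String path) {
        try {
            fileUri(path);
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }

    /**
     * Illustrative normalization (an assumption, not the HIVE-3319 change):
     * use forward slashes and give drive-letter paths a leading slash.
     */
    static String withLeadingSlash(String path) {
        String forward = path.replace('\\', '/');
        return forward.startsWith("/") ? forward : "/" + forward;
    }
}
```

Here a path such as C:/cygwin/tmp is rejected, while /C:/cygwin/tmp is accepted and round-trips through getPath(). This is the kind of context that going through Path and URI APIs, rather than string concatenation, is meant to supply.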
[jira] [Commented] (HIVE-3297) change hive.auto.convert.join's default value to true
[ https://issues.apache.org/jira/browse/HIVE-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595707#comment-13595707 ]

Hudson commented on HIVE-3297:
--

Integrated in Hive-trunk-h0.21 #2004 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2004/])
HIVE-3297 : change hive.auto.convert.join's default value to true (Ashutosh Chauhan) (Revision 1453649)

Result = SUCCESS
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453649
Files :
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/data/conf/hive-site.xml

change hive.auto.convert.join's default value to true
-
Key: HIVE-3297
URL: https://issues.apache.org/jira/browse/HIVE-3297
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
Assignee: Ashutosh Chauhan
Fix For: 0.11.0
Attachments: HIVE-3297.patch

For unit tests also, this parameter should be set to true.
[jira] [Updated] (HIVE-3963) Allow Hive to connect to RDBMS
[ https://issues.apache.org/jira/browse/HIVE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maxime LANCIAUX updated HIVE-3963:
--
Affects Version/s: 0.11.0 0.9.0

Allow Hive to connect to RDBMS
--
Key: HIVE-3963
URL: https://issues.apache.org/jira/browse/HIVE-3963
Project: Hive
Issue Type: New Feature
Components: Import/Export, JDBC, SQL, StorageHandler
Affects Versions: 0.9.0, 0.10.0, 0.9.1, 0.11.0
Reporter: Maxime LANCIAUX

I am thinking about something like:

SELECT jdbcload('driver','url','user','password','sql') FROM dual;

There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for JDBCStorageHandler
[jira] [Updated] (HIVE-4099) Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
[ https://issues.apache.org/jira/browse/HIVE-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zafar Gilani updated HIVE-4099:
---
Description:
Join query fails with Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask.

hive.log:
ERROR exec.MapredLocalTask (SessionState.java:printError(365))
ERROR ql.Driver (SessionState.java:printError(365))

Select and insert queries work fine. Simplest of join fails with unexplained errors.

Data-set size: Two tables being joined, have 27k records each, each record having three fields.

Already tried, but in vain:
- Add contrib jar to the hive classpath
- Set Hadoop mapred.child.java.opts to 2 to 8g of memory
- Set Hive mapred.child.java.opts to 2 to 8g of memory
- Set hive.auto.convert.join to true (regular join to mapjoin)
- Set hive.optimize.skewjoin to true (handle skewness in data)
- Set hive.mapjoin.maxsize to 100 (small table rows, both tables have 27k rows)

was:
Join query fails with Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask.

hive.log:
ERROR exec.MapredLocalTask (SessionState.java:printError(365))
ERROR ql.Driver (SessionState.java:printError(365))

Select and insert queries work fine. Simplest of join fails.

Data-set size: Two tables being joined, have 27k records each, each record having three fields.

Already tried in vain:
- Add contrib jar to the hive classpath
- Set Hadoop mapred.child.java.opts to 2 to 8g of memory
- Set Hive mapred.child.java.opts to 2 to 8g of memory
- Set hive.auto.convert.join to true (regular join to mapjoin)
- Set hive.optimize.skewjoin to true (handle skewness in data)
- Set hive.mapjoin.maxsize to 100 (small table rows, both tables have 27k rows)

Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
-
Key: HIVE-4099
URL: https://issues.apache.org/jira/browse/HIVE-4099
Project: Hive
Issue Type: Bug
Affects Versions: 0.7.1
Environment: GNU/Linux x86_64, kernel 2.6.32-131.0.15.e16.x86_64, 16 cores, 48 GB main memory, 16 mappers, 8 reducers, mapred.java.child.opts set to 2g.
Reporter: Zafar Gilani

Join query fails with Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask.

hive.log:
ERROR exec.MapredLocalTask (SessionState.java:printError(365))
ERROR ql.Driver (SessionState.java:printError(365))

Select and insert queries work fine. Simplest of join fails with unexplained errors.

Data-set size: Two tables being joined, have 27k records each, each record having three fields.

Already tried, but in vain:
- Add contrib jar to the hive classpath
- Set Hadoop mapred.child.java.opts to 2 to 8g of memory
- Set Hive mapred.child.java.opts to 2 to 8g of memory
- Set hive.auto.convert.join to true (regular join to mapjoin)
- Set hive.optimize.skewjoin to true (handle skewness in data)
- Set hive.mapjoin.maxsize to 100 (small table rows, both tables have 27k rows)
[jira] [Updated] (HIVE-4099) Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
[ https://issues.apache.org/jira/browse/HIVE-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zafar Gilani updated HIVE-4099:
---
Description:
Join query fails with Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask.

hive.log:
ERROR exec.MapredLocalTask (SessionState.java:printError(365))
ERROR ql.Driver (SessionState.java:printError(365))

Select and insert queries work fine. Simplest of join fails with errors.

Data-set size: Two tables being joined, have 27k records each, each record having three fields.

Already tried, but in vain:
- Add contrib jar to the hive classpath
- Set Hadoop mapred.child.java.opts to 2 to 8g of memory
- Set Hive mapred.child.java.opts to 2 to 8g of memory
- Set hive.auto.convert.join to true (regular join to mapjoin)
- Set hive.optimize.skewjoin to true (handle skewness in data)
- Set hive.mapjoin.maxsize to 100 (small table rows, both tables have 27k rows)

was:
Join query fails with Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask.

hive.log:
ERROR exec.MapredLocalTask (SessionState.java:printError(365))
ERROR ql.Driver (SessionState.java:printError(365))

Select and insert queries work fine. Simplest of join fails with unexplained errors.

Data-set size: Two tables being joined, have 27k records each, each record having three fields.

Already tried, but in vain:
- Add contrib jar to the hive classpath
- Set Hadoop mapred.child.java.opts to 2 to 8g of memory
- Set Hive mapred.child.java.opts to 2 to 8g of memory
- Set hive.auto.convert.join to true (regular join to mapjoin)
- Set hive.optimize.skewjoin to true (handle skewness in data)
- Set hive.mapjoin.maxsize to 100 (small table rows, both tables have 27k rows)

Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
-
Key: HIVE-4099
URL: https://issues.apache.org/jira/browse/HIVE-4099
Project: Hive
Issue Type: Bug
Affects Versions: 0.7.1
Environment: GNU/Linux x86_64, kernel 2.6.32-131.0.15.e16.x86_64, 16 cores, 48 GB main memory, 16 mappers, 8 reducers, mapred.java.child.opts set to 2g.
Reporter: Zafar Gilani

Join query fails with Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask.

hive.log:
ERROR exec.MapredLocalTask (SessionState.java:printError(365))
ERROR ql.Driver (SessionState.java:printError(365))

Select and insert queries work fine. Simplest of join fails with errors.

Data-set size: Two tables being joined, have 27k records each, each record having three fields.

Already tried, but in vain:
- Add contrib jar to the hive classpath
- Set Hadoop mapred.child.java.opts to 2 to 8g of memory
- Set Hive mapred.child.java.opts to 2 to 8g of memory
- Set hive.auto.convert.join to true (regular join to mapjoin)
- Set hive.optimize.skewjoin to true (handle skewness in data)
- Set hive.mapjoin.maxsize to 100 (small table rows, both tables have 27k rows)
Hive-3963
Hello! I am working on https://issues.apache.org/jira/browse/HIVE-3963 to let Hive users pull data from relational databases, so that a big Hadoop/Hive table can be joined with a small reference table. I have coded a UDTF, LoadFromJDBC, and it is working well, but I am sure it can be improved a lot. Any comments, advice, or help are welcome. Thanks.

--
Maxime LANCIAUX
http://maximelanciauxbi.blogspot.fr/
[jira] [Created] (HIVE-4136) hive should optimize the scenario when the input and output are bucketed/sorted on the same keys
Namit Jain created HIVE-4136:

Summary: hive should optimize the scenario when the input and output are bucketed/sorted on the same keys
Key: HIVE-4136
URL: https://issues.apache.org/jira/browse/HIVE-4136
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Namit Jain

Consider a common scenario like:

create table T1 (...) clustered by (key) sorted by (key) into 2 buckets;
create table T2 (...) clustered by (key) sorted by (key) into 2 buckets;
SET hive.enforce.sorting=true;
SET hive.enforce.bucketing=true;
insert overwrite table T2 select * from T1;

The above query creates a reducer to make sure T2 is bucketed/sorted. That is not needed.
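The condition described in HIVE-4136 can be sketched as a simple layout comparison: if the source and destination tables agree on bucketing columns, sort columns, and bucket count, each input bucket file is already a valid output bucket, so a map-only copy would be enough. The code below is a simplified illustration under that assumption. BucketLayout and canSkipReducer are hypothetical names, not Hive optimizer classes, and a real check would also have to confirm that the select passes the key columns through unchanged and that sort directions match.

```java
import java.util.List;

/** Hypothetical description of a table's bucketing/sorting layout. */
final class BucketLayout {
    final List<String> bucketCols;
    final List<String> sortCols;
    final int numBuckets;

    BucketLayout(List<String> bucketCols, List<String> sortCols, int numBuckets) {
        this.bucketCols = bucketCols;
        this.sortCols = sortCols;
        this.numBuckets = numBuckets;
    }

    /**
     * True when input bucket i can be written as output bucket i without
     * re-hashing or re-sorting, i.e. both layouts are identical.
     */
    static boolean canSkipReducer(BucketLayout input, BucketLayout output) {
        return input.numBuckets == output.numBuckets
                && input.bucketCols.equals(output.bucketCols)
                && input.sortCols.equals(output.sortCols);
    }
}
```

For the T1 and T2 tables in the issue (both clustered and sorted by key into 2 buckets) the check returns true; changing the bucket count or the keys on either side makes it return false, and the reducer is still needed.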
[jira] [Created] (HIVE-4137) optimize group by followed by joins for bucketed/sorted tables
Namit Jain created HIVE-4137:

Summary: optimize group by followed by joins for bucketed/sorted tables
Key: HIVE-4137
URL: https://issues.apache.org/jira/browse/HIVE-4137
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Namit Jain

Consider the following scenario:

create table T1 (...) clustered by (key) sorted by (key) into 2 buckets;
create table T2 (...) clustered by (key) sorted by (key) into 2 buckets;
create table T3 (...) clustered by (key) sorted by (key) into 2 buckets;
SET hive.enforce.sorting=true;
SET hive.enforce.bucketing=true;
insert overwrite table T3
select ..
from (select key, aggr() from T1 group by key) s1
full outer join (select key, aggr() from T2 group by key) s2
on s1.key=s2.key;

Ideally, this query can be performed in a single map-only job: Group By -> SortMerge Join.
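The reason a single map-only pass is plausible in HIVE-4137 is that both steps can stream over key-sorted input: a group by only has to compare each row with the previous key, and a full outer join of two key-sorted streams is a merge. The sketch below illustrates this for one pair of matching buckets, using count as a stand-in for aggr() and -1 as a stand-in for NULL. SortedGroupJoin and its methods are hypothetical names for illustration, not Hive operators.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of streaming group-by followed by a merge join. */
final class SortedGroupJoin {
    private SortedGroupJoin() {}

    /** Streaming GROUP BY over keys sorted ascending; returns {key, count} pairs. */
    static List<long[]> countByKey(int[] sortedKeys) {
        List<long[]> groups = new ArrayList<>();
        for (int key : sortedKeys) {
            if (!groups.isEmpty() && groups.get(groups.size() - 1)[0] == key) {
                groups.get(groups.size() - 1)[1]++;   // same key as previous row
            } else {
                groups.add(new long[] {key, 1});      // key changed: start a new group
            }
        }
        return groups;
    }

    /**
     * Full outer merge join of two key-sorted group lists.
     * Returns {key, leftCount, rightCount}; -1 marks a missing side.
     */
    static List<long[]> fullOuterJoin(List<long[]> left, List<long[]> right) {
        List<long[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() || j < right.size()) {
            if (j == right.size() || (i < left.size() && left.get(i)[0] < right.get(j)[0])) {
                out.add(new long[] {left.get(i)[0], left.get(i)[1], -1});
                i++;
            } else if (i == left.size() || right.get(j)[0] < left.get(i)[0]) {
                out.add(new long[] {right.get(j)[0], -1, right.get(j)[1]});
                j++;
            } else {
                out.add(new long[] {left.get(i)[0], left.get(i)[1], right.get(j)[1]});
                i++;
                j++;
            }
        }
        return out;
    }
}
```

Because neither step buffers more than the current group, no shuffle or sort is required as long as matching buckets of T1 and T2 are read together in key order.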
[jira] [Assigned] (HIVE-4138) ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils
[ https://issues.apache.org/jira/browse/HIVE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley reassigned HIVE-4138:
---
Assignee: Owen O'Malley

ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils
--
Key: HIVE-4138
URL: https://issues.apache.org/jira/browse/HIVE-4138
Project: Hive
Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley

Currently the typename returned by ORC's union object inspector isn't parseable by TypeInfoUtils. The format needs to be uniontype<type1,type2,type3>.
[jira] [Created] (HIVE-4138) ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils
Owen O'Malley created HIVE-4138:
---
Summary: ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils
Key: HIVE-4138
URL: https://issues.apache.org/jira/browse/HIVE-4138
Project: Hive
Issue Type: Bug
Reporter: Owen O'Malley

Currently the typename returned by ORC's union object inspector isn't parseable by TypeInfoUtils. The format needs to be uniontype<type1,type2,type3>.
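As a small illustration of the format HIVE-4138 asks for, the sketch below builds a union type name from its member type names. It assumes the intended format is uniontype<type1,type2,type3>, with angle brackets as in Hive's other parameterized type names such as array<int>. UnionTypeName is a hypothetical helper, not the ORC object inspector code.

```java
import java.util.List;

/** Hypothetical helper that formats a union type name. */
final class UnionTypeName {
    private UnionTypeName() {}

    /** Joins member type names with commas inside uniontype<...>, with no spaces. */
    static String of(List<String> memberTypeNames) {
        return "uniontype<" + String.join(",", memberTypeNames) + ">";
    }
}
```

For members int, string and double this produces uniontype<int,string,double>, which is the shape a type-string parser for this format would expect.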
[jira] [Commented] (HIVE-4126) remove support for lead/lag UDFs outside of UDAF args
[ https://issues.apache.org/jira/browse/HIVE-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596058#comment-13596058 ]

Phabricator commented on HIVE-4126:
---

ashutoshc has commented on the revision HIVE-4126 [jira] remove support for lead/lag UDFs outside of UDAF args.

Changes look good. But the following query in windowing_expressions.q fails:

select p_mfgr, p_retailprice, p_size,
rank() as r,
lag(rank(),1) as pr,
sum(p_retailprice) as s2 over (rows between unbounded preceding and current row),
sum(p_retailprice) - 5 as s1 over (rows between unbounded preceding and current row)
from part
distribute by p_mfgr
sort by p_retailprice;

Looks like the query needs to be updated.

INLINE COMMENTS
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:10137 Do we need to update this comment block? The function now does a lot less than what's in the comment. We probably need to update the comments of other functions as well.

REVISION DETAIL
https://reviews.facebook.net/D9105

To: JIRA, ashutoshc, hbutani

remove support for lead/lag UDFs outside of UDAF args
-
Key: HIVE-4126
URL: https://issues.apache.org/jira/browse/HIVE-4126
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Harish Butani
Attachments: HIVE-4126.D9105.1.patch

Select Expressions such as p_size - lead(p_size,1) are currently handled as non aggregation expressions done after all over clauses are evaluated. Once we allow different partitions in a single select list (Jira 4041), these become ambiguous. The equivalent way to do such things is either to:
- use lead/lag UDAFs with expressions (support added with Jira 4081), or
- stack windowing clauses with inline queries: select lead(r,1).. from (select rank() as r)...
[jira] [Commented] (HIVE-4126) remove support for lead/lag UDFs outside of UDAF args
[ https://issues.apache.org/jira/browse/HIVE-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596061#comment-13596061 ]

Ashutosh Chauhan commented on HIVE-4126:

Also, the following query in leadlag.q failed:

select p1.p_mfgr, p1.p_name, p1.p_size,
p1.p_size - lag(p1.p_size,1,p_size) as deltaSz
from part p1 join part p2 on p1.p_partkey = p2.p_partkey
distribute by p1.p_mfgr
sort by p1.p_name ;

FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: Column p_size Found in more than One Tables/Subqueries

remove support for lead/lag UDFs outside of UDAF args
-
Key: HIVE-4126
URL: https://issues.apache.org/jira/browse/HIVE-4126
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Harish Butani
Attachments: HIVE-4126.D9105.1.patch

Select Expressions such as p_size - lead(p_size,1) are currently handled as non aggregation expressions done after all over clauses are evaluated. Once we allow different partitions in a single select list (Jira 4041), these become ambiguous. The equivalent way to do such things is either to:
- use lead/lag UDAFs with expressions (support added with Jira 4081), or
- stack windowing clauses with inline queries: select lead(r,1).. from (select rank() as r)...
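For readers less familiar with lag, the expression p_size - lag(p_size,1,p_size) from the leadlag.q query can be read as "this row's size minus the previous row's size, or minus itself when there is no previous row". The sketch below computes that over a single, already ordered partition. It assumes the usual three-argument meaning of lag (value, offset, default); LagSketch is a hypothetical name and says nothing about how Hive evaluates the function internally.

```java
/** Hypothetical illustration of lag(value, offset, default) over one ordered partition. */
final class LagSketch {
    private LagSketch() {}

    /** Value `offset` rows before `row`, or defaultValue when that row does not exist. */
    static int lag(int[] partition, int row, int offset, int defaultValue) {
        int source = row - offset;
        return (source >= 0 && source < partition.length) ? partition[source] : defaultValue;
    }

    /** Computes p_size - lag(p_size, 1, p_size) for every row of the partition. */
    static int[] deltaFromPrevious(int[] sizes) {
        int[] delta = new int[sizes.length];
        for (int row = 0; row < sizes.length; row++) {
            delta[row] = sizes[row] - lag(sizes, row, 1, sizes[row]);
        }
        return delta;
    }
}
```

The first row of each partition therefore gets a delta of 0, because the default argument is the row's own p_size.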
[jira] [Commented] (HIVE-4131) Fix eclipse template classpath to include new packages added by ORC file patch
[ https://issues.apache.org/jira/browse/HIVE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596108#comment-13596108 ]

Kevin Wilfong commented on HIVE-4131:
-

+1

Fix eclipse template classpath to include new packages added by ORC file patch
--
Key: HIVE-4131
URL: https://issues.apache.org/jira/browse/HIVE-4131
Project: Hive
Issue Type: Bug
Components: Build Infrastructure
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Fix For: 0.11.0
Attachments: HIVE-4131-1.patch

The ORC file feature (HIVE-3874) has added protobuf and snappy libraries, as well as generated protobuf code. All of these need to be included in the eclipse classpath template. The eclipse project generated on the latest trunk has build errors due to the missing jars/classes.
[jira] [Commented] (HIVE-4125) Expose metastore JMX metrics
[ https://issues.apache.org/jira/browse/HIVE-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596113#comment-13596113 ]

Kevin Wilfong commented on HIVE-4125:
-

Comments on Phabricator

Expose metastore JMX metrics

Key: HIVE-4125
URL: https://issues.apache.org/jira/browse/HIVE-4125
Project: Hive
Issue Type: Improvement
Components: Metastore
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Trivial
Attachments: HIVE-4125.HIVE-4125.HIVE-4125.D9123.1.patch

Add a safe way to access the metrics stored for each MetricsScope, so that they can be used outside of JMX.
[jira] [Commented] (HIVE-4125) Expose metastore JMX metrics
[ https://issues.apache.org/jira/browse/HIVE-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596114#comment-13596114 ]

Phabricator commented on HIVE-4125:
---

kevinwilfong has commented on the revision HIVE-4125 [jira] Expose metastore JMX metrics.

INLINE COMMENTS
common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java:197-202 Could you just use this method instead of implementing it again inside of MetricsScope?
common/src/java/org/apache/hadoop/hive/common/metrics/Metrics.java:71 Why not expose this too for consistency?

REVISION DETAIL
https://reviews.facebook.net/D9123

To: kevinwilfong, sxyuan
Cc: JIRA

Expose metastore JMX metrics

Key: HIVE-4125
URL: https://issues.apache.org/jira/browse/HIVE-4125
Project: Hive
Issue Type: Improvement
Components: Metastore
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Trivial
Attachments: HIVE-4125.HIVE-4125.HIVE-4125.D9123.1.patch

Add a safe way to access the metrics stored for each MetricsScope, so that they can be used outside of JMX.
[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by
[ https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596119#comment-13596119 ]

Harish Butani commented on HIVE-4108:
-

A question I have is what the behavior should be when a Query level distribute and/or sort is specified. So for this query:
{noformat}
select sum(x) over() from t1 distribute by x sort by y
{noformat}
Should the partition for sum be the entire table or be based on x?

Today we support this query:
{noformat}
select sum(x) from t1 distribute by x sort by y
{noformat}
we infer the partition for sum to be x.

We also support this:
{noformat}
select sum(x) over(row unbounded preceding and current row) from t1 distribute by x sort by y
{noformat}
again we infer the partition for sum to be x.

So:
- either we remove the concept of inferring from the Query level distribute/sort
- or a missing partition in an Over clause should imply the entire table only when there is no Query level distribute/sort

Does this make sense?

Allow over() clause to contain an order by with no partition by
---
Key: HIVE-4108
URL: https://issues.apache.org/jira/browse/HIVE-4108
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland

HIVE-4073 allows over() to be called with no partition by and no order by. We should also allow an over() that contains only an order by.

From the review of HIVE-4073:

Ashutosh
{noformat}
Can you also add following test. This should also work.
select p_name, p_retailprice,
avg(p_retailprice) over(order by p_name)
from part
partition by p_name;
{noformat}

Harish
{noformat}
This test will not work (:
The grammar needs to be changed so:
partitioningSpec
@init { msgs.push("partitioningSpec clause"); }
@after { msgs.pop(); }
   :
   partitionByClause orderByClause? -> ^(TOK_PARTITIONINGSPEC partitionByClause orderByClause?) |
   orderByClause -> ^(TOK_PARTITIONINGSPEC orderByClause) |
   distributeByClause sortByClause? -> ^(TOK_PARTITIONINGSPEC distributeByClause sortByClause?) |
   sortByClause? -> ^(TOK_PARTITIONINGSPEC sortByClause) |
   clusterByClause -> ^(TOK_PARTITIONINGSPEC clusterByClause)
   ;
And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of the AST Tree.
The PTFTranslator also needs changes.
Do this as another Jira
{noformat}
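To make the requested HIVE-4108 behavior concrete: an expression like avg(p_retailprice) over(order by p_name) with no partition by would treat the whole input as one partition and aggregate along the ordering. The sketch below computes such a running average, assuming a row-based frame from unbounded preceding to the current row and input that is already sorted by the order-by key. RunningAverage is a hypothetical name, and the frame choice is an assumption rather than something the issue specifies.

```java
/** Hypothetical illustration of avg(x) over (order by k) with a single, whole-table partition. */
final class RunningAverage {
    private RunningAverage() {}

    /** For each row i, returns the average of orderedValues[0..i] (inclusive). */
    static double[] over(double[] orderedValues) {
        double[] averages = new double[orderedValues.length];
        double sum = 0.0;
        for (int i = 0; i < orderedValues.length; i++) {
            sum += orderedValues[i];
            averages[i] = sum / (i + 1);
        }
        return averages;
    }
}
```

If a query-level distribute by were instead used to infer the partition, as discussed in the comment on this issue, the same loop would simply restart at each partition boundary.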
[jira] [Updated] (HIVE-4120) Implement decimal encoding for ORC
[ https://issues.apache.org/jira/browse/HIVE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4120:
--
Attachment: HIVE-4120.D9207.1.patch

omalley requested code review of HIVE-4120 [jira] Implement decimal encoding for ORC.

Reviewers: JIRA

hive-4120 add decimal encoding for orc

Currently, ORC does not have an encoder for decimal.

TEST PLAN
EMPTY

REVISION DETAIL
https://reviews.facebook.net/D9207

AFFECTED FILES
ql/src/gen/protobuf/gen-java/org/apache/hadoop/hive/ql/io/orc/OrcProto.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/ColumnStatisticsImpl.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/DecimalColumnStatistics.java
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestSerializationUtils.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/SerializationUtils.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
ql/src/protobuf/org/apache/hadoop/hive/ql/io/orc/orc_proto.proto
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestStringRedBlackTree.java

To: JIRA, omalley

Implement decimal encoding for ORC
--
Key: HIVE-4120
URL: https://issues.apache.org/jira/browse/HIVE-4120
Project: Hive
Issue Type: New Feature
Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Attachments: HIVE-4120.D9207.1.patch

Currently, ORC does not have an encoder for decimal.
[jira] [Updated] (HIVE-4120) Implement decimal encoding for ORC
[ https://issues.apache.org/jira/browse/HIVE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HIVE-4120:

Status: Patch Available (was: Open)

Implement decimal encoding for ORC
--
Key: HIVE-4120
URL: https://issues.apache.org/jira/browse/HIVE-4120
Project: Hive
Issue Type: New Feature
Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Attachments: HIVE-4120.D9207.1.patch

Currently, ORC does not have an encoder for decimal.
[jira] [Commented] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596179#comment-13596179 ] Carl Steinbach commented on HIVE-2935: -- @Ashutosh: Thanks for approving this. Since it's a really big patch, I wanted to suggest that we split it into three separate commits to make it easier for people to diff it in the future: (1) code changes, (2) Thrift-generated code, (3) test outputs. Let me know if you want help with this. Thanks. Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
[jira] [Commented] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596223#comment-13596223 ] Ashutosh Chauhan commented on HIVE-2935: Yeah, I think it's a good idea to break it down three ways as you suggested. [~prasadm] Can you split the patch three ways as Carl suggested? Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
[jira] [Commented] (HIVE-4042) ignore mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596241#comment-13596241 ] Namit Jain commented on HIVE-4042: -- [~ashutoshc], I am assuming you are talking about the following query: {noformat} select /*+MAPJOIN(smallTableTwo)*/ idOne, idTwo, value FROM ( select /*+MAPJOIN(smallTableOne)*/ idOne, idTwo, value FROM bigTable JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne) ) firstjoin JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo) {noformat} without the mapjoin hints. This query will remain the same. The query: {code} select /*+ MAPJOIN(t2) */ * from t1 join t2 on t1.t11 = t2.t21 group by t1.t12; {code} will have an extra MR job (I need to verify that), which should go away by setting hive.auto.convert.join.noconditionaltask to true. Let me test that. ignore mapjoin hint --- Key: HIVE-4042 URL: https://issues.apache.org/jira/browse/HIVE-4042 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.4042.1.patch, hive.4042.2.patch, hive.4042.3.patch, hive.4042.4.patch, hive.4042.5.patch, hive.4042.6.patch, hive.4042.7.patch, hive.4042.8.patch After HIVE-3784, in a production environment, it can become difficult to deploy since a lot of production queries can break.
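As a rough illustration of what "ignoring" a mapjoin hint means for the queries quoted above, the sketch below removes `/*+ MAPJOIN(...) */` comments from query text so that the join strategy is left to the planner. This is only a text-level model in Python: `strip_mapjoin_hints` is a hypothetical helper, not part of Hive, and Hive itself would deal with hints while parsing and optimizing the query rather than by editing strings.

```python
import re

# Matches a hint comment such as /*+ MAPJOIN(t2) */ or /*+MAPJOIN(a, b)*/,
# together with any whitespace that follows it.
_MAPJOIN_HINT = re.compile(r"/\*\+\s*MAPJOIN\s*\([^)]*\)\s*\*/\s*", re.IGNORECASE)


def strip_mapjoin_hints(query: str) -> str:
    """Return the query text with every MAPJOIN hint removed (hypothetical helper)."""
    return _MAPJOIN_HINT.sub("", query)


hinted = "select /*+ MAPJOIN(t2) */ * from t1 join t2 on t1.t11 = t2.t21 group by t1.t12"
unhinted = strip_mapjoin_hints(hinted)
print(unhinted)
```

The remaining text is an ordinary join, which is the form the comment above expects to keep working once the hints are no longer honored.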
[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by
[ https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596254#comment-13596254 ] Ashutosh Chauhan commented on HIVE-4108: Your table looks good. It would be good to add it to the comments in the source code as well. Allow over() clause to contain an order by with no partition by --- Key: HIVE-4108 URL: https://issues.apache.org/jira/browse/HIVE-4108 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Brock Noland HIVE-4073 allows over() to be called with no partition by and no order by. We should also allow an order by on its own. From the review of HIVE-4073: Ashutosh {noformat} Can you also add the following test? This should also work. select p_name, p_retailprice, avg(p_retailprice) over(order by p_name) from part partition by p_name; {noformat} Harish {noformat} This test will not work (: The grammar needs to be changed so: partitioningSpec @init { msgs.push("partitioningSpec clause"); } @after { msgs.pop(); } : partitionByClause orderByClause? -> ^(TOK_PARTITIONINGSPEC partitionByClause orderByClause?) | orderByClause -> ^(TOK_PARTITIONINGSPEC orderByClause) | distributeByClause sortByClause? -> ^(TOK_PARTITIONINGSPEC distributeByClause sortByClause?) | sortByClause? -> ^(TOK_PARTITIONINGSPEC sortByClause) | clusterByClause -> ^(TOK_PARTITIONINGSPEC clusterByClause) ; And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of the AST Tree. The PTFTranslator also needs changes. Do this as another Jira {noformat}
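To make the requested semantics concrete, here is a small Python model of `avg(p_retailprice) over(order by p_name)` with no partition by: the whole input is treated as one partition and each row sees the average of all rows up to and including its own ordering peers. It assumes the SQL-standard default frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW); whether Hive used that default at the time of this issue is an assumption, and `avg_over_order_by` plus the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def avg_over_order_by(rows, order_col, value_col):
    """Model avg(value_col) OVER (ORDER BY order_col) over a single partition.

    Rows that tie on order_col are peers and receive the same running average.
    """
    ordered = sorted(rows, key=itemgetter(order_col))
    total, count, out = 0.0, 0, []
    for _, peers in groupby(ordered, key=itemgetter(order_col)):
        peers = list(peers)
        total += sum(r[value_col] for r in peers)
        count += len(peers)
        out.extend((r[order_col], total / count) for r in peers)
    return out


part = [
    {"p_name": "nut", "p_retailprice": 20.0},
    {"p_name": "bolt", "p_retailprice": 10.0},
    {"p_name": "washer", "p_retailprice": 60.0},
]
running = avg_over_order_by(part, "p_name", "p_retailprice")
print(running)
```

No partitioning key is needed anywhere in the computation, which is why an order by on its own is a meaningful window specification.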
[jira] [Updated] (HIVE-4138) ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils
[ https://issues.apache.org/jira/browse/HIVE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4138: Component/s: Serializers/Deserializers ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils -- Key: HIVE-4138 URL: https://issues.apache.org/jira/browse/HIVE-4138 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley Currently the typename returned by ORC's union object inspector isn't parseable by TypeInfoUtils. The format needs to be uniontype<type1,type2,type3>.
[jira] [Updated] (HIVE-4138) ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils
[ https://issues.apache.org/jira/browse/HIVE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4138: -- Attachment: HIVE-4138.D9219.1.patch omalley requested code review of HIVE-4138 [jira] ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils. Reviewers: JIRA hive-4138 fix union object inspector typename Currently the typename returned by ORC's union object inspector isn't parseable by TypeInfoUtils. The format needs to be uniontype<type1,type2,type3>. TEST PLAN: EMPTY REVISION DETAIL: https://reviews.facebook.net/D9219 AFFECTED FILES: ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcUnion.java ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java To: JIRA, omalley ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils -- Key: HIVE-4138 URL: https://issues.apache.org/jira/browse/HIVE-4138 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-4138.D9219.1.patch Currently the typename returned by ORC's union object inspector isn't parseable by TypeInfoUtils. The format needs to be uniontype<type1,type2,type3>.
[jira] [Updated] (HIVE-4138) ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils
[ https://issues.apache.org/jira/browse/HIVE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4138: Status: Patch Available (was: Open) ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils -- Key: HIVE-4138 URL: https://issues.apache.org/jira/browse/HIVE-4138 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-4138.D9219.1.patch Currently the typename returned by ORC's union object inspector isn't parseable by TypeInfoUtils. The format needs to be uniontype<type1,type2,type3>.
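The reason the type name has to follow one agreed syntax is that the name is later parsed back into a type tree. The Python sketch below illustrates that round trip, assuming the `uniontype<type1,type2,...>` syntax that Hive type strings use for unions. Both functions are hypothetical stand-ins written for this note; they are not the OrcUnion or TypeInfoUtils code touched by the patch.

```python
def union_type_name(member_types):
    """Build a type name of the form uniontype<t1,t2,...>."""
    return "uniontype<" + ",".join(member_types) + ">"


def split_union_members(type_name):
    """Split uniontype<...> into its top-level member type names.

    Commas nested inside other angle brackets (for example map<string,int>)
    belong to the nested type and are not treated as separators.
    """
    prefix = "uniontype<"
    if not (type_name.startswith(prefix) and type_name.endswith(">")):
        raise ValueError("not a union type name: " + type_name)
    inner = type_name[len(prefix):-1]
    members, depth, start = [], 0, 0
    for i, ch in enumerate(inner):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "," and depth == 0:
            members.append(inner[start:i])
            start = i + 1
    members.append(inner[start:])
    return members


members = ["int", "array<string>", "map<string,int>"]
name = union_type_name(members)
print(name)
print(split_union_members(name))
```

A name produced in any other shape fails the prefix and bracket check in the parser, which mirrors the failure mode described in the issue.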
[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by
[ https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596280#comment-13596280 ] Ashutosh Chauhan commented on HIVE-4108: I think we should remove the concept of inference from query-level distribute/sort. I would read your first query as the user first intending to partition on a constant over the full table (the first MR job) and then wanting a second partitioning on x (the second MR job), which, as you pointed out, differs from the current behavior. For your second query, my reading is the same as for the first, which again deviates from the implementation. The third query has the same ambiguity. So in all three cases the current behavior differs from what I would have expected. Automatic inference is nasty; IMO we should drop it altogether. Distribute/Sort, if present in the query, shouldn't affect any over() clause specified in the query. Their presence should just imply that the user wants another MR job using that spec (which was the behavior in Hive before this work). Allow over() clause to contain an order by with no partition by --- Key: HIVE-4108 URL: https://issues.apache.org/jira/browse/HIVE-4108 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Brock Noland HIVE-4073 allows over() to be called with no partition by and no order by. We should also allow an order by on its own. From the review of HIVE-4073: Ashutosh {noformat} Can you also add the following test? This should also work. select p_name, p_retailprice, avg(p_retailprice) over(order by p_name) from part partition by p_name; {noformat} Harish {noformat} This test will not work (: The grammar needs to be changed so: partitioningSpec @init { msgs.push("partitioningSpec clause"); } @after { msgs.pop(); } : partitionByClause orderByClause? -> ^(TOK_PARTITIONINGSPEC partitionByClause orderByClause?) | orderByClause -> ^(TOK_PARTITIONINGSPEC orderByClause) | distributeByClause sortByClause? -> ^(TOK_PARTITIONINGSPEC distributeByClause sortByClause?) | sortByClause? -> ^(TOK_PARTITIONINGSPEC sortByClause) | clusterByClause -> ^(TOK_PARTITIONINGSPEC clusterByClause) ; And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of the AST Tree. The PTFTranslator also needs changes. Do this as another Jira {noformat}
[jira] [Created] (HIVE-4139) MiniDFS shim does not work for hadoop 2
Gunther Hagleitner created HIVE-4139: Summary: MiniDFS shim does not work for hadoop 2 Key: HIVE-4139 URL: https://issues.apache.org/jira/browse/HIVE-4139 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner There's an incompatibility between hadoop 1 and 2 with respect to the MiniDfsCluster class. That causes the hadoop 2 line Minimr tests to fail with a MethodNotFound exception.
[jira] [Updated] (HIVE-4139) MiniDFS shim does not work for hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4139: - Attachment: HIVE-4139.1.patch Separate shim classes for 20S and 23. Also fixed some dependency issues (yarn-server-tests etc.) MiniDFS shim does not work for hadoop 2 --- Key: HIVE-4139 URL: https://issues.apache.org/jira/browse/HIVE-4139 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4139.1.patch There's an incompatibility between hadoop 1 and 2 with respect to the MiniDfsCluster class. That causes the hadoop 2 line Minimr tests to fail with a MethodNotFound exception.
[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by
[ https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596337#comment-13596337 ] Harish Butani commented on HIVE-4108: - Yes, I have to agree. It seemed like a good idea, but it is not. It gets more nebulous with multiple-partitioning support (HIVE-4041). I will add a separate Jira to remove the current behavior first. Allow over() clause to contain an order by with no partition by --- Key: HIVE-4108 URL: https://issues.apache.org/jira/browse/HIVE-4108 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Brock Noland HIVE-4073 allows over() to be called with no partition by and no order by. We should also allow an order by on its own. From the review of HIVE-4073: Ashutosh {noformat} Can you also add the following test? This should also work. select p_name, p_retailprice, avg(p_retailprice) over(order by p_name) from part partition by p_name; {noformat} Harish {noformat} This test will not work (: The grammar needs to be changed so: partitioningSpec @init { msgs.push("partitioningSpec clause"); } @after { msgs.pop(); } : partitionByClause orderByClause? -> ^(TOK_PARTITIONINGSPEC partitionByClause orderByClause?) | orderByClause -> ^(TOK_PARTITIONINGSPEC orderByClause) | distributeByClause sortByClause? -> ^(TOK_PARTITIONINGSPEC distributeByClause sortByClause?) | sortByClause? -> ^(TOK_PARTITIONINGSPEC sortByClause) | clusterByClause -> ^(TOK_PARTITIONINGSPEC clusterByClause) ; And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of the AST Tree. The PTFTranslator also needs changes. Do this as another Jira {noformat}
[jira] [Updated] (HIVE-3985) Update new UDAFs introduced for Windowing to work with new Decimal Type
[ https://issues.apache.org/jira/browse/HIVE-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-3985: --- Attachment: HIVE-3985-0.patch Update new UDAFs introduced for Windowing to work with new Decimal Type --- Key: HIVE-3985 URL: https://issues.apache.org/jira/browse/HIVE-3985 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Brock Noland Attachments: HIVE-3985-0.patch
[jira] [Updated] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-2935: -- Attachment: HIVE-2935-5.thrift-gen.patch HIVE-2935-5.core-hs2.patch HIVE-2935-5.beeline.patch Patch split into 3 parts: HIVE-2935-5.core-hs2.patch (core HS2 changes); HIVE-2935-5.beeline.patch (Beeline, tests, etc.); HIVE-2935-5.thrift-gen.patch (Thrift-generated code). Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
[jira] [Updated] (HIVE-3985) Update new UDAFs introduced for Windowing to work with new Decimal Type
[ https://issues.apache.org/jira/browse/HIVE-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-3985: --- Status: Patch Available (was: Open) Update new UDAFs introduced for Windowing to work with new Decimal Type --- Key: HIVE-3985 URL: https://issues.apache.org/jira/browse/HIVE-3985 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Brock Noland Attachments: HIVE-3985-0.patch
[jira] [Commented] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596360#comment-13596360 ] Prasad Mujumdar commented on HIVE-2935: --- [~ashutoshc] The patch is rebased onto the latest trunk (commit c219de2d33820c1d66873283ec457a64f3aa4ea7). It's split into three patches as Carl suggested. Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
[jira] [Commented] (HIVE-4041) Support multiple partitionings in a single Query
[ https://issues.apache.org/jira/browse/HIVE-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596406#comment-13596406 ] Ashutosh Chauhan commented on HIVE-4041: Went through the design doc. Thanks, Harish, for writing this up. In light of the discussion over on HIVE-4108, you may want to update section 3. Section 4 also needs to be updated because of it. I believe that dropping the implicit partitioning concept will make the design (and thus the implementation) cleaner and easier, which is a good thing. Support multiple partitionings in a single Query Key: HIVE-4041 URL: https://issues.apache.org/jira/browse/HIVE-4041 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani Attachments: WindowingComponentization.pdf Currently we disallow queries if the partition specifications of all windowing functions are not the same. We can relax this by generating multiple PTFOps based on the unique partitionings in a Query. For partitionings that only differ in sort, we can introduce a sort step in between PTFOps, which can happen in the same Reduce task.
[jira] [Created] (HIVE-4140) Specifying alias for windowing function
Ashutosh Chauhan created HIVE-4140: -- Summary: Specifying alias for windowing function Key: HIVE-4140 URL: https://issues.apache.org/jira/browse/HIVE-4140 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Ashutosh Chauhan Currently we support: select rank() as rnk over (...) from .. whereas the SQL standard dictates: select rank() over (...) as rnk from ..
[jira] [Commented] (HIVE-4140) Specifying alias for windowing function
[ https://issues.apache.org/jira/browse/HIVE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596418#comment-13596418 ] Ashutosh Chauhan commented on HIVE-4140: Further, {{as}} is optional in such cases. Specifying alias for windowing function --- Key: HIVE-4140 URL: https://issues.apache.org/jira/browse/HIVE-4140 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Ashutosh Chauhan Currently we support: select rank() as rnk over (...) from .. whereas the SQL standard dictates: select rank() over (...) as rnk from ..
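For reference, the standard alias placement can be tried against any engine that implements SQL window functions. The Python sketch below uses the standard-library `sqlite3` module purely as a stand-in for such an engine (it needs an underlying SQLite of version 3.25 or later, and says nothing about Hive's own parser); the table `t` and its rows are made up for the example. It runs the query once with `AS rnk` and once with the bare alias, matching the comment that `as` is optional.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [("a", 30), ("b", 10), ("c", 20)])

# Standard form: the alias follows the complete window function call.
with_as = conn.execute(
    "SELECT name, rank() OVER (ORDER BY score) AS rnk FROM t ORDER BY rnk"
).fetchall()

# Same query with the optional AS keyword omitted.
without_as = conn.execute(
    "SELECT name, rank() OVER (ORDER BY score) rnk FROM t ORDER BY rnk"
).fetchall()

print(with_as)
print(without_as)
conn.close()
```

Both statements return the same rows, so supporting the standard placement does not require the keyword to be present.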
Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #313
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/313/
[jira] [Commented] (HIVE-3963) Allow Hive to connect to RDBMS
[ https://issues.apache.org/jira/browse/HIVE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596639#comment-13596639 ] Lianhui Wang commented on HIVE-3963: I think it must support an AS clause, like the TRANSFORM syntax. For example: SELECT jdbcload('driver','url','user','password','sql') as c1,c2 FROM dual; Allow Hive to connect to RDBMS -- Key: HIVE-3963 URL: https://issues.apache.org/jira/browse/HIVE-3963 Project: Hive Issue Type: New Feature Components: Import/Export, JDBC, SQL, StorageHandler Affects Versions: 0.9.0, 0.10.0, 0.9.1, 0.11.0 Reporter: Maxime LANCIAUX I am thinking about something like: SELECT jdbcload('driver','url','user','password','sql') FROM dual; There is already a JIRA, https://issues.apache.org/jira/browse/HIVE-1555, for JDBCStorageHandler.
[jira] [Updated] (HIVE-4126) remove support for lead/lag UDFs outside of UDAF args
[ https://issues.apache.org/jira/browse/HIVE-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4126: -- Attachment: HIVE-4126.D9105.2.patch hbutani updated the revision HIVE-4126 [jira] remove support for lead/lag UDFs outside of UDAF args. - add PTF clause to grammar - PTF Test Queries - Query Data - corrected data file name - Merge branch 'hive-896' of github.com:hbutani/hive into hive-896 - windowing + hive attempt - Hooking QueryDef to QB - ptf source is a subQuery, not a select statement - add windowing clauses in grammar - fix grammar exception issue - associate PTF nodes with corresponding Insert node, if any - minor grammar fixes - flush out processing PTF tree in phase1 - associate PTFs in dest node handling in Phase1 - Classes not needed - Merge branch 'crook' of github.com:hbutani/hive into crook - Merge with apache hive - handle SortBy, Having and Window clauses in Phase1 - remove ambiguities in Hive.g - Merge branch 'crook' of https://github.com/hbutani/hive into crook - syntactically allow a window specification in a selectItem - tweak QuerySpec building. - check that there is no GBy and where when deciding if a Windowing - Merge branch 'crook' of https://github.com/hbutani/hive into crook - Add several checks: - QSpec to QDef (Checked qdef serialization and deserialization: it works) - refactor the PTF ifc. - refactor: rename annotation classes to end in Description - refactor: move annotation classes to ql.exec package, where the - refactor: move window functions to ql.udf.generic package - refcator: move GenericUDFLeadLag to ql.udf.generic - refactor: move TablFunc bases classes to ql.udf.ptf - refactor: move PTblFuncs to ql.udf.ptf - refactor: move Order enum to ptf.query.spec package, so that i can - refactor: remove classes in ptf.metadata package. 
Not needed now - refactor: FunctionRegistry, extract FunctionInfo classes; step 1 in - fix logic that checks for windowing specifications in Select List - reenable ensurePTFChainHasPartitioning; - Added aliasToAST map: To setup expressions map in PTF's output - Added AST expression: Used to populate PTF RowResolver's expression map - Using input operator's RowResolver to construct OI for HiveTableDef - Use of SCRIPT DependencyType for processing rule on PTFOperator. - Cleanup: Changed prototype of translate method in Translator. Commented - Add utility methods: - Add method to get operator name in PTFOperator. - 1. Translation of QuerySpec to QueryDef - Merge branch 'hive-896' into crook - Following changes: - refactor genPTFPlan: - minor bug: clear Agg. and Distinct Agg. lists in QB ParseInfo. - flush out SelectDef translation: - introduce initializeOutputOI and initializeRawInputOI to TableFunction - When constructing the RowResolver for the Windowing or Noop PTFs: - add the columns from the last PTF, before adding any - 1. Change logic of how/which TableFunc is added to a QuerySpec: if query - During QDef deserialization use the passed in inputOI as the OI of the - for subQueries as input to PTF, construct a HiveTableSpec. - Tests successful for queries with: windowing, lead/lag, noop, gby, having, join with lead/lag, join with noop - support an alias for a PTF invocation. This is needed so that a PTF - add test for alias in ptf invocation - translate ptf invocations in the from clause (that are not associated - add tests that exercise generation of separate PTFOps for PTFChain and - Fix aggregations bug: move aggregation expressions from aggregationTrees to PTF QuerySpec if no group by clause is seen at the end of phase 1 - add support for PTF invocation in joins. - adding tests for ptf invocation in joins - handle mixed case aliases. 
- mixed case alias test - Create PTF Map-side RR: - Merge branch 'hive-896' into crook - fix Hive.g merge issue: duplicate KW_ROWS definition - Having: Tests to support having with windowing and ptf in queries with no group by. - - during handleClusterOrDistributeByForWindowing invoke the - Merge branch 'crook' of https://github.com/hbutani/hive into crook - add tests to check - when extracting Windowing clauses from selectList handle the case - when extracting Windowing clauses from selectList handle the case - More tests with UDAFs, statistical and distribution functions - No need to specify Writable option to copy object. - Following changes: - Tests: - disallow Count/Sum distinct with windowing - refactor ptf.translate package: - Merge branch 'ptf' of https://github.com/hbutani/hive into ptf - refactor ptf.query.specification package - Merge branch 'ptf' of
[jira] [Commented] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596653#comment-13596653 ] Thejas M Nair commented on HIVE-2935: - bq. HIVE-2935-5.beeline.patch - Beeline,tests etc [~prasadm] This file does not have the beeline test benchmark files. Are the test files from the old beelinepositive.tar.gz the ones that need to get checked in? Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
Hive-trunk-h0.21 - Build # 2005 - Failure
Changes for Build #2005 1 tests failed. REGRESSION: org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_aggregator_error_1 Error Message: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit. Stack Trace: junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit. at net.sf.antcontrib.logic.ForTask.doSequentialIteration(ForTask.java:259) at net.sf.antcontrib.logic.ForTask.doToken(ForTask.java:268) at net.sf.antcontrib.logic.ForTask.doTheTasks(ForTask.java:299) at net.sf.antcontrib.logic.ForTask.execute(ForTask.java:244) The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2005) Status: Failure Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2005/ to view the results.
[jira] [Commented] (HIVE-4137) optimize group by followed by joins for bucketed/sorted tables
[ https://issues.apache.org/jira/browse/HIVE-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596690#comment-13596690 ] Lianhui Wang commented on HIVE-4137: In addition, for bucketed/sorted tables, a single group by needs only the map-side group-by operator and no reduce-side group-by operator. Example: select key, aggr() from T1 group by key. The current plan is TS-SEL-GBY-RS-GBY-SEL-FS, but it can change to the following plan: TS-SEL-GBY-SEL-FS. optimize group by followed by joins for bucketed/sorted tables -- Key: HIVE-4137 URL: https://issues.apache.org/jira/browse/HIVE-4137 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Consider the following scenario: create table T1 (...) clustered by (key) sorted by (key) into 2 buckets; create table T2 (...) clustered by (key) sorted by (key) into 2 buckets; create table T3 (...) clustered by (key) sorted by (key) into 2 buckets; SET hive.enforce.sorting=true; SET hive.enforce.bucketing=true; insert overwrite table T3 select .. from (select key, aggr() from T1 group by key) s1 full outer join (select key, aggr() from T2 group by key) s2 on s1.key=s2.key; Ideally, this query can be performed in a single map-only job. Group By -> SortMerge Join.
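The reasoning behind dropping the reduce-side group by can be sketched in a few lines of Python. If a table is bucketed and sorted on the grouping key, all rows for a given key sit next to each other inside one bucket, so each mapper can finish the aggregation for its keys in a single streaming pass and nothing has to be shuffled. The function `map_side_group_by` and the sample buckets are hypothetical, with `sum` standing in for the unspecified `aggr()`.

```python
from itertools import groupby


def map_side_group_by(sorted_bucket):
    """Aggregate (key, value) pairs that are already sorted by key.

    One pass, constant extra memory per key: this is the work of the single
    map-side GBY operator in the shorter plan.
    """
    for key, pairs in groupby(sorted_bucket, key=lambda kv: kv[0]):
        yield key, sum(value for _, value in pairs)


# Two buckets of a table clustered and sorted by key: keys never span buckets.
buckets = [
    [(0, 1), (0, 2), (2, 5)],
    [(1, 7), (3, 1), (3, 1)],
]
results = [list(map_side_group_by(bucket)) for bucket in buckets]
print(results)
```

Each bucket's output is itself sorted by key, which is the property the sort-merge join in the issue description would rely on.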
[jira] [Updated] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-2935: -- Attachment: HIVE-2935-6.patch.tar.gz Patch split into 3 parts - HIVE-2935-6.core-hs2.patch - Core HS2 changes HIVE-2935-6.beeline.patch - Beeline,tests etc HIVE-2935-6.thrift-gen.patch - Thrift generated code Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, HIVE-2935-6.patch.tar.gz, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
[jira] [Updated] (HIVE-2597) Repeated key in GROUP BY is erroneously displayed when using DISTINCT
[ https://issues.apache.org/jira/browse/HIVE-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2597: Status: Patch Available (was: Open) Repeated key in GROUP BY is erroneously displayed when using DISTINCT - Key: HIVE-2597 URL: https://issues.apache.org/jira/browse/HIVE-2597 Project: Hive Issue Type: Bug Reporter: Alex Rovner Assignee: Navis Attachments: HIVE-2597.D8967.1.patch, HIVE-2597.D8967.2.patch The following query was simplified for illustration purposes. This works correctly: select client_tid, as myvalue1, as myvalue2 from clients cluster by client_tid The intent here is to produce two empty columns in between data. The following query does not work: select distinct client_tid, as myvalue1, as myvalue2 from clients cluster by client_tid FAILED: Error in semantic analysis: Line 1:44 Repeated key in GROUP BY The key is not repeated since the aliases were given. Seems like Hive is ignoring the aliases when the distinct keyword is specified.
[jira] [Updated] (HIVE-2597) Repeated key in GROUP BY is erroneously displayed when using DISTINCT
[ https://issues.apache.org/jira/browse/HIVE-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2597: -- Attachment: HIVE-2597.D8967.2.patch navis updated the revision HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT. Rebased to trunk Addressed comments Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D8967 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D8967?vs=28755&id=29421#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/test/queries/clientpositive/groupby_constant.q ql/src/test/results/clientpositive/groupby_constant.q.out To: JIRA, navis Cc: njain Repeated key in GROUP BY is erroneously displayed when using DISTINCT - Key: HIVE-2597 URL: https://issues.apache.org/jira/browse/HIVE-2597 Project: Hive Issue Type: Bug Reporter: Alex Rovner Assignee: Navis Attachments: HIVE-2597.D8967.1.patch, HIVE-2597.D8967.2.patch The following query was simplified for illustration purposes. This works correctly: select client_tid, as myvalue1, as myvalue2 from clients cluster by client_tid The intent here is to produce two empty columns in between data. The following query does not work: select distinct client_tid, as myvalue1, as myvalue2 from clients cluster by client_tid FAILED: Error in semantic analysis: Line 1:44 Repeated key in GROUP BY The key is not repeated since the aliases were given. Seems like Hive is ignoring the aliases when the distinct keyword is specified.
[jira] [Commented] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596713#comment-13596713 ] Prasad Mujumdar commented on HIVE-2935: --- Discussed offline with Ashutosh and Thejas. Going to update the patch, splitting it into 4 separate files for manageable diffs. Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
[jira] [Updated] (HIVE-2261) Add cleanup stages for UDFs
[ https://issues.apache.org/jira/browse/HIVE-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2261: Resolution: Duplicate Status: Resolved (was: Patch Available) Add cleanup stages for UDFs --- Key: HIVE-2261 URL: https://issues.apache.org/jira/browse/HIVE-2261 Project: Hive Issue Type: Wish Components: Query Processor Affects Versions: 0.9.0 Reporter: Navis Assignee: Navis Priority: Trivial Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2261.D1329.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2261.D1329.2.patch In some cases, we bind values from other sources, especially memcached, at the last stage of a big SQL query. I wrote UDFs of that kind for internal use. I found that the 'initialize' method of the GenericUDF class is a good place to make connections to the memcached cluster, but I could not find anywhere to close/clean up the connections. If the GenericUDF class had a cleanup method, things would be neater. If the initializing entity (map/reduce/fetch) could also be provided to the life-cycle methods (init/close), that would be perfect.
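The life cycle requested above (open a connection in an init step, release it in a matching close step) can be sketched as follows. This is a language-neutral illustration written in Python, not Hive's GenericUDF API: `FakeConnection`, `LookupUdf`, and `run_operator` are hypothetical names, and the in-memory dictionary stands in for a memcached cluster.

```python
class FakeConnection:
    """Hypothetical stand-in for a memcached client; no network is involved."""

    def __init__(self):
        self.open = True
        self.data = {"a": 1, "b": 2}

    def get(self, key):
        return self.data.get(key)

    def close(self):
        self.open = False


class LookupUdf:
    """A function object with explicit init/close hooks."""

    def initialize(self):
        # Acquire the external resource once, before any row is processed.
        self.conn = FakeConnection()

    def evaluate(self, key):
        return self.conn.get(key)

    def close(self):
        # The cleanup hook the issue asks for: release the resource.
        self.conn.close()


def run_operator(udf, rows):
    """Drive the life cycle: initialize, evaluate each row, always close."""
    udf.initialize()
    try:
        return [udf.evaluate(row) for row in rows]
    finally:
        udf.close()


udf = LookupUdf()
values = run_operator(udf, ["a", "b", "missing"])
```

Putting the close call in a `finally` block is a design choice in this sketch: the connection is released even when evaluating a row raises an error.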
[jira] [Updated] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-2935: -- Attachment: (was: HIVE-2935-6.patch.tar.gz) Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
[jira] [Updated] (HIVE-2935) Implement HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-2935: -- Attachment: HIVE-2935-7.patch.tar.gz HIVE-2935-7.core-hs2.patch - Core HS2 changes HIVE-2935-7.beeline.patch - Beeline etc HIVE-2935-7.thrift-gen.patch - Thrift generated code HIVE-2935-6.beeline-test.patch - Beeline test output Implement HiveServer2 - Key: HIVE-2935 URL: https://issues.apache.org/jira/browse/HIVE-2935 Project: Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Carl Steinbach Assignee: Carl Steinbach Labels: HiveServer2 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, HIVE-2935-7.patch.tar.gz, HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
[jira] [Resolved] (HIVE-4074) Doc update for .8, .9 and .10
[ https://issues.apache.org/jira/browse/HIVE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-4074. Resolution: Fixed Fix Version/s: 0.8.0 0.8.1 0.9.0 0.10.0 Patch checked in for xdocs. Doc links on site should work now. Thanks, Gunther! Doc update for .8, .9 and .10 - Key: HIVE-4074 URL: https://issues.apache.org/jira/browse/HIVE-4074 Project: Hive Issue Type: Bug Components: Documentation Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.10.0, 0.9.0, 0.8.1, 0.8.0 Attachments: HIVE-4074.1.patch.tar.gz.part_a, HIVE-4074.1.patch.tar.gz.part_b, HIVE-4074.1.patch.tar.gz.part_c Need to update the javadocs for releases 8, 9 and 10.
[jira] [Updated] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-1662: -- Attachment: HIVE-1662.D8391.3.patch navis updated the revision HIVE-1662 [jira] Add file pruning into Hive.. Addressed comments Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D8391 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D8391?vs=27273&id=29427#toc AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java ql/src/java/org/apache/hadoop/hive/ql/metadata/NativeTablePredicateHandler.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java ql/src/test/queries/clientpositive/file_pruning.q ql/src/test/results/clientpositive/file_pruning.q.out To: JIRA, navis Add file pruning into Hive. --- Key: HIVE-1662 URL: https://issues.apache.org/jira/browse/HIVE-1662 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Navis Attachments: HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should be able to add only the files that pass the filter to the input paths.
[jira] [Updated] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-1662: Status: Patch Available (was: Open) Add file pruning into Hive. --- Key: HIVE-1662 URL: https://issues.apache.org/jira/browse/HIVE-1662 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Navis Attachments: HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should be able to add only the files that pass the filter to the input paths.
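The pruning idea in the issue description can be sketched as a filter applied to the candidate file list before the job's input paths are set. The Python below is an illustration only, not the patch's implementation: `prune_input_paths`, the sample paths, and the predicate are hypothetical, and the predicate stands in for whatever file name condition appears in the query.

```python
def prune_input_paths(candidate_files, file_name_filter):
    """Return only the files whose name passes the filter.

    candidate_files: iterable of file path strings for the table or partition.
    file_name_filter: callable taking a path and returning True to keep it.
    """
    return [path for path in candidate_files if file_name_filter(path)]


# Hypothetical files backing a table.
candidates = [
    "/warehouse/t1/000000_0",
    "/warehouse/t1/000001_0",
    "/warehouse/t1/000002_0",
]

# Hypothetical filter derived from a predicate on the filename virtual column.
input_paths = prune_input_paths(candidates, lambda p: p.endswith("000001_0"))
print(input_paths)
```

Files rejected here are never opened, which is the saving the issue is after: without pruning, every file would be scanned and the filter applied row by row.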
[jira] [Updated] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-1662: Status: Open (was: Patch Available) Add file pruning into Hive. --- Key: HIVE-1662 URL: https://issues.apache.org/jira/browse/HIVE-1662 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Navis Attachments: HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should be able to add only the files that pass the filter to the input paths.
[jira] [Updated] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-1662: Status: Patch Available (was: Open) Add file pruning into Hive. --- Key: HIVE-1662 URL: https://issues.apache.org/jira/browse/HIVE-1662 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Navis Attachments: HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch, HIVE-1662.D8391.4.patch Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should be able to add only the files that pass the filter to the input paths.
[jira] [Updated] (HIVE-1662) Add file pruning into Hive.
[ https://issues.apache.org/jira/browse/HIVE-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-1662: -- Attachment: HIVE-1662.D8391.4.patch navis updated the revision HIVE-1662 [jira] Add file pruning into Hive.. Added test, removed some noise Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D8391 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D8391?vs=29427&id=29433#toc AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java ql/src/java/org/apache/hadoop/hive/ql/metadata/NativeTablePredicateHandler.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java ql/src/test/queries/clientpositive/file_pruning.q ql/src/test/results/clientpositive/file_pruning.q.out To: JIRA, navis Add file pruning into Hive. --- Key: HIVE-1662 URL: https://issues.apache.org/jira/browse/HIVE-1662 Project: Hive Issue Type: New Feature Reporter: He Yongqiang Assignee: Navis Attachments: HIVE-1662.D8391.1.patch, HIVE-1662.D8391.2.patch, HIVE-1662.D8391.3.patch, HIVE-1662.D8391.4.patch Hive now supports a filename virtual column. If a file name filter is present in a query, Hive should be able to add only the files that pass the filter to the input paths.