Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hive/Roadmap" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/Roadmap?action=diff&rev1=29&rev2=30

--------------------------------------------------

   * [[https://issues.apache.org/jira/browse/HIVE-1293|Concurrency]]
   * [[https://issues.apache.org/jira/browse/HIVE-1642|Conversion to Map-Join at Runtime]]
   * [[https://issues.apache.org/jira/browse/HIVE-474|Support for Multiple Distincts]]
-  * [[https://issues.apache.org/jira/browse/HIVE-1750|Remove Partition Filtering Conditions]]
+  * [[https://issues.apache.org/jira/browse/HIVE-1750|Remove Partition Filtering Conditions]]
-
  == Current Projects ==
   * [[https://issues.apache.org/jira/browse/HIVE-1721|Bloom Filters]]
   * [[https://issues.apache.org/jira/browse/HIVE-78|Authorization]]
+  * [[https://issues.apache.org/jira/browse/HIVE-842|Authentication]]
   * [[https://issues.apache.org/jira/browse/HIVE-1538|Remove Duplicate Filters]]
+  * [[https://issues.apache.org/jira/browse/HIVE-1644|Use Filter Pushdown for Automatically Accessing Indexes]]
+  * [[https://issues.apache.org/jira/browse/HIVE-1803|Bitmap Index]]
+  * [[https://issues.apache.org/jira/browse/HIVE-1790|HAVING clause support]]
+ == Up For Grabs ==
- == (Old) Features recently done ==
-  * ODBC driver [[Hive/HiveODBC]]
-  * [[http://issues.apache.org/jira/browse/HIVE-870|semijoin]]
-  * [[http://issues.apache.org/jira/browse/HIVE-655|UDTF]]
-  * [[http://issues.apache.org/jira/browse/HIVE-31|Create Table as Select]]
-  * [[http://issues.apache.org/jira/browse/HIVE-931|Using sort and bucketing properties to optimize queries]]
-  * [[http://issues.apache.org/jira/browse/HIVE-591|UNIQUE JOINS - that support a different semantics than the outer joins]]
-  * [[http://issues.apache.org/jira/browse/HIVE-1023|TypedBytes for user scripts]]
-  * [[Hive/ViewDev|Views]] for changing table names/columns without breaking existing queries [big]
-  * [[http://issues.apache.org/jira/browse/HIVE-917|Bucketed Map Join]]
-  * [[http://issues.apache.org/jira/browse/HIVE-74|Combine File Input Format]]
-
- == Features working on now ==
+  * Cross-database queries
+  * View improvements
+  * Column-level statistics
+  * Heavy-duty test infrastructure
+  * Automated code coverage reports
-  * Hive CLI improvement/Error messages:
+  * Hive CLI improvement/Error messages
+  * HiveServer robustness
-  * Compile-time error message: Better error message for keyword, etc. [big]
-  * Execution-time error messages: categorize most popular errors and show easy-to-understand messages.
   * Debuggability / Resumability:
   * Show users the last portion of the data that caused the task to fail
   * Restart a job with a particular mapper (that failed earlier, for debugging purposes)
+  * Resume at map-reduce job level.
-  * Resume at map-reduce job level. This should also work for databee. [big]
-  * Ease-of-use:
-  * Select without map-reduce [big]
-  * Bucketed Medium/Percentile
-  * GraphViz for graphing operator tree
-  * Multiple-partition inserts [big]
-  * GenericUDTF
-  * Performance
-  * [[http://issues.apache.org/jira/browse/HIVE-1194|Sort Merge Join]]
-  * Hive Freeway
-  * Allow Hive partition locations to be file/files.
-  * [[Hive/HBaseIntegration|HBase integration]]
-
- == Short-term Features ==
-  * Support for various statistical functions like Median, Standard Deviation, Variance etc.
-  * Data variables (possible followup to views)
-  * Integration with dumbo or map_reduce.py so that python code can be easily embedded in Hive
-
- == More long-term Features (yet to be prioritized) ==
-  * Support for Indexes
   * Support for Insert Appends
   * Support for IN, exists and correlated subqueries
   * More native types - Enums, timestamp
-  * Passing schema to scripts through an environment variable
-  * HAVING clause support
-  * Counters for streaming
-  * Error Reporting Improvements. - Make error reporting for parse errors better
+  * Persistent UDF's
+  * Cost-based Optimization
+  * SQL/OLAP
+  * Storage handler improvements
+  * System views
+  * JDBC/ODBC improvements
+  * mapred -> mapreduce transition
- == Others ==
-  * Support for Column Alias
-  * Support for Statistics. - These stats are needed to make optimization decisions
-  * Join Optimizations. - FRJ techniques etc to do the join faster
-  * Transformations in LOAD. - LOAD currently does not transform the input data if it is not in the format expected by the destination table.
-  * Help on CLI. - add help to the CLI
-  * Multiple group-by inserts
-  * Generate multiple group-by results by scanning the source table only once
-  * Example:
-  * FROM src
-  * SELECT src.adid, COUNT(src.userid), COUNT(DISTINCT src.userid) GROUP BY src.adid
-  * SELECT src.pageid, COUNT(src.userid), COUNT(DISTINCT src.userid) GROUP BY src.pageid
-  * Let the user register UDF and UDAF
-  * Expose register functions in UDFRegistry and UDAFRegistry
-  * Provide commands in HiveCli to call those register functions
-
